Recently I came across a question about a common problem with blog post indexing in Google. The business had about 70 blog posts, each of which they believed was on its own page in their HubSpot blog. Yet their Google Webmaster Tools account showed Google indexing only 27 pages. Here is how they, and you, can get blog posts indexed on Google with a few best practices.
Find out if Your Blog Posts are Indexing
To do this, it is wise to start by getting an idea of how Google’s crawlers view your website and blog. You can do this using what is called a ‘search operator’, which is basically a command you type into Google so that it serves only certain results. In this case, you want to use the ‘site:’ search operator to see which pages Google is indexing on your domain.
Try a search in Google like this: site:yourdomain.com (replace yourdomain.com with your own domain).
Whether you look at the screenshot here or run the search yourself, you can see how Google “sees” your website. When I ran this search for the business mentioned at the start of this post, the search engine results page was not pretty. A bunch of blog category, tag, and author archive pages showed up on the first page, right under the homepage. No bueno. Out of the box, popular website platforms like HubSpot and WordPress have this inherent issue, which is easily resolved if you know the best practices and how to properly use the no-index meta tag.
In this business’s case, the indexed archive pages were being given more priority by Google than the individual blog post pages, which sat on the 9th or 10th page of results when I ran that search operator. What was going on is that tag and author archive pages, when indexed, are very rich in content, so Google crawls them more often and more completely. It also gives them precedence in the SERPs.
What do those results look like compared to Google Webmaster Tools?
If you look in Google Webmaster Tools instead, you can get similar data, but not in the visual way the search command above gives it to you. Nevertheless, you should do both if you are troubleshooting or monitoring your website’s indexing status.
The search command paints a much better picture than Google Webmaster Tools of how Google is prioritizing the pages it indexes on your website and blog.
Don’t get me wrong: GWT can be extremely helpful in diagnosing crawler or indexing issues with Google. You should use both tools and make sure that all actionable items are addressed so that your website and blog pages are all indexed the way you want them to be.
Why Google Hates Your Blog Post Pages
Think about it this way: maybe you only have one tag and/or one author that you put on every post. When Google indexes that tag archive page, it finds exactly the same content as on the author archive page and on the blog roll, so those pages end up competing with one another.
- Blog Category Archive Pages are Indexing – Consider setting those to no-index.
- ‘Tag’ and ‘Author’ Archive Pages are Indexing – These are meant for user navigation, not so much for people searching Google. When indexed, they are very rich in content, so Google crawls them more often and more completely and gives them precedence in the SERPs.
- Duplicate Content – The two items above can result in duplicate content on your own site. Duplicate content is a topic of its own; it is worth mentioning here, but I won’t go into the details now.
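If you copy the URLs from your ‘site:’ results into a list, a short script can show how many of them are archive pages rather than posts. Here is a minimal Python sketch of that idea. The path segments it looks for (category, tag, author) and the example.com URLs are my assumptions for illustration, not something your platform guarantees, so adjust them to match your own URL structure.

```python
from urllib.parse import urlparse

# Hypothetical path segments that mark archive pages.
# These are assumptions; change them to match your own blog's URLs.
ARCHIVE_SEGMENTS = {"category", "tag", "author"}


def is_archive_url(url):
    """Return True if any path segment exactly matches an archive marker."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return any(s.lower() in ARCHIVE_SEGMENTS for s in segments)


def split_indexed_urls(urls):
    """Split URLs into (archive_pages, other_pages), keeping input order."""
    archives = [u for u in urls if is_archive_url(u)]
    others = [u for u in urls if not is_archive_url(u)]
    return archives, others


if __name__ == "__main__":
    # Made-up URLs standing in for what a 'site:' search might list.
    sample = [
        "https://example.com/blog/tag/seo",
        "https://example.com/blog/author/jane",
        "https://example.com/blog/how-to-get-indexed",
    ]
    archives, others = split_indexed_urls(sample)
    print(len(archives), "archive pages;", len(others), "other pages")
```

This is only a rough heuristic: a post whose slug happens to be exactly "tag" or "author" would be miscounted, so treat the numbers as a starting point for a manual look.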
Recommended WordPress Yoast SEO Settings for Blog Post Indexing on Google:
I created a 4-minute video showing how I configure the default settings of WordPress SEO by Yoast to avoid these issues in WordPress:
Note: This video is not the solution for everyone; it is meant as a general guideline for avoiding the types of issues mentioned in this post. Just remember that your individual situation may be unique, and consulting with an SEO expert is the best way to go.
How to get Google to Rank Your Blog Posts
So how does Google know which page to rank or crawl more often, and which to give less priority to, when both pages contain the same content? Here are some tips on how to get the correct pages ranked and crawled more often.
- Google Webmaster Tools Sitemaps – You should always submit a sitemap to Google Webmaster Tools.
Meaning: Submit a sitemap and use the above-mentioned tools to correct errors.
- Social Signals for Blog Posts – Believe it or not, I find that Google does lean on social signals when indexing and ranking content. This was a mystery for a while, but I think the overall consensus now is that social signals help Google’s crawlers discover and index content more quickly.
Meaning: How many likes, tweets, and +1’s do your posts have?
- Internal Linking – Do you have contextual internal links on the indexed posts and pages of the site that link to the non-indexed blog posts? Internal links help Google crawlers crawl your entire site and crawl linked posts and pages more often.
Meaning: Do you have procedures to utilize internal linking properly on your content?
- Category or Tag Archive Pages Indexing – Sometimes those will take precedence over the individual blog post pages. Google has a tendency to prefer to rank the category page higher than individual posts contained within that category.
Meaning: It is most often wise to set no-index, follow on category and tag archive pages as described in the video above.
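To double-check that the setting took effect, you can look for the robots meta tag in the HTML of an archive page. The sketch below is one hedged way to do that in Python using only the standard library. The sample HTML strings and the function names are hypothetical, and it only inspects the meta tag, not the X-Robots-Tag HTTP header, which can carry the same directives.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_noindex(html):
    """Return True if the page asks search engines not to index it."""
    return "noindex" in robots_directives(html)


if __name__ == "__main__":
    # Hypothetical page sources: an archive page set to no-index, follow,
    # and a blog post with no robots meta tag at all.
    archive_html = '<head><meta name="robots" content="noindex, follow"></head>'
    post_html = "<head><title>A blog post</title></head>"
    print("archive noindex:", is_noindex(archive_html))
    print("post noindex:", is_noindex(post_html))
```

In practice you would paste in (or download) the source of one of your own tag or category archive pages and one of your blog posts. The archive page should report no-index and the post should not.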
Additional Resources for HubSpot
If you are using HubSpot, here is a resource to help you make the necessary corrections to your settings:
The post above will walk you through why, when using HubSpot, you might want to remove certain web pages from the SERPs (search engine results pages), and how to go about doing it. It does not dive specifically into how to set archive pages to no-index, but that is something HubSpot’s great support team can more than likely help you with.
The Bottom Line
Following these steps will help ensure that your blog category, tag, and author archive pages are not taking authority away from your individual posts, which avoids a conflict for Google when it tries to determine which pages to rank and crawl more often. It will also allow your blog post pages to be indexed better and rank higher while avoiding duplicate content issues on your own site.
I explained this to the example business I mentioned at the start of this post and here is his response:
Raleigh, Wow. thanks very much for your insights. Also, breaking down the concepts into a metaphor I can understand. I’ll have to review this with my Website guy, and if we have any other questions as we journey through this, I’ll get back to you. Again, thanks for sharing your expertise. I owe you a coffee sometime ; )
Since sending that, he has told me that he checked everything and found there is “lots to do,” so it seems he is on the right path to getting Google to index each of his blog post pages with the priority it deserves.
Want to read more? Check out my other Google Indexing post: How to Troubleshoot Google Indexing Issues
Need more help?
Please feel free to seek SEO expert help over at Codeable