Many marketers and website owners try hard to get their pages ranked on Google and Bing. To be ranked on a search engine such as Google, your pages must first be indexed. When a page is indexed, it becomes available to be found in the search results. This sounds great, right? But what if I told you that indexing everything could actually be HURTING your SEO efforts?
HOW DOES SEARCH ENGINE INDEXING WORK?
In order for a website to be indexed, it must first be crawled. Crawling refers to the process in which a search engine bot collects data from your website, in the form of code, links and content. A site crawler, also known as a web spider, crawls your website and any links on your pages and interprets your data. Crawling is generally an automated process carried out by bots (spiders).
Search engines such as Google use these bots/spiders to crawl your website and follow the links on your site. The spiders jump from link to link to create a map of the web, keeping the search engine’s information up to date and relevant. A website can only be indexed once it has been crawled, because crawling retrieves the information needed to build the index.
WHY IS A SITE MAP USEFUL?
A site map is an important but often overlooked part of a website. It provides guidance for a spider to crawl your pages. A typical site map lists every link on your website to help the spider navigate through your pages, which helps Google’s bots/spiders fetch data more accurately. If your website runs on a CMS such as WordPress, you can install plugins that generate a sitemap for you, and there are many sitemap generators available on the internet. Typically, once you have generated a site map, you need to tell the search engines where the file is located. For Google, this can be done through Google Search Console (formerly Webmaster Tools).
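As a rough sketch, a minimal XML sitemap looks like the example below; the domain and dates are placeholders for your own pages.

```
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page you want the spiders to find -->
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/services/</loc>
    <lastmod>2024-01-10</lastmod>
  </url>
</urlset>
```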
Google will index your pages unless you tell it otherwise. Telling Google to drop pages it has already indexed (or not to index them at all) is called deindexing. You may think this is counterproductive to your search efforts; however, there are reasons why deindexing can actually boost your traffic.
INCREASING YOUR TRAFFIC BY REMOVING PAGES
There are occasions where you would want to exclude pages from being indexed in the search engines. A good reason to do this is to prevent duplicate content. If you have a lot of pages that are duplicated, there is no reason to index all of them.
Another obvious reason to deindex a page is when it does not make sense for a user to land on it from search. For example, a user may be redirected to a “thank you” page after filling in a form on your website. There is no reason to index this page, as it makes no sense to serve it as a search result. Search engines such as Google love pushing websites that are user friendly; landing visitors directly on a “thank you” page would cause confusion, which is not user friendly and could negatively impact your search rankings.
Did you know that over 75% of Moz’s website was deindexed? This ended up being a huge success. Britney Muller from Moz discovered that over 56% of Moz’s indexed pages were in fact community profile pages. She made the decision to deindex inactive or spammy profile pages, which had a positive effect: organic traffic and rankings went up whilst the number of pages indexed went down.
FLOW YOUR LINK JUICE CORRECTLY
Link juice is a term used to describe the ranking power that passes from one page to another through links. The more indexed links a page points to, the more thinly that juice is distributed. The more powerful a page, the more likely it is to rank well, and the power of a page is determined by many external and internal factors.
A “powerful” page with a lot of links will distribute that power (or juice) to the pages it links to. We don’t want to be giving away page power to pages that are not important. For example, if you have a privacy policy link in the footer of a page, the search engines will distribute some of that power to the privacy policy page. There is no reason to do this: the privacy policy is not a page you need to perform well in the search rankings compared with the important pages on your website. A vital part of SEO is ensuring that your links flow well and make logical sense. Fortunately, we can control this by editing our robots.txt file and using nofollow and noindex tags, as in the snippet below.
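As a quick illustration (the link target and anchor text here are placeholders), a footer link can be marked with a nofollow attribute so that it is not handed any juice:

```
<!-- Footer link marked nofollow so link juice is not passed to the privacy policy page -->
<a href="/privacy-policy" rel="nofollow">Privacy Policy</a>
```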
NOINDEX, NOFOLLOW AND ROBOTS.TXT EXPLAINED
In order to remove pages from the search engines, we can use noindex and nofollow tags. A noindex tag tells the search engines that they can crawl the page for data but requests that they do not index it. A nofollow tag tells search engines not to follow the links, and therefore not to pass link juice through them. A nofollow tag can be applied to the whole page or to individual on-page links, which helps us distribute link juice in a logical way.
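As a simple sketch, these page-level directives are added as a robots meta element inside the head section of the page; use whichever variant fits the page:

```
<!-- Allow the page to be crawled but ask search engines not to index it -->
<meta name="robots" content="noindex">

<!-- Ask search engines not to index the page and not to follow any of its links -->
<meta name="robots" content="noindex, nofollow">
```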
You would add both nofollow and noindex tags to a page when you do not want that page indexed at all or the links followed. This is what we would do to “thank you” pages. It is recommended that you hire an SEO company before attempting any SEO work.
Robots.txt is a file that can be added to your website’s root directory to tell search engines how you would like your website to be crawled. From here you can block crawlers from specific folders or URLs, as well as set crawl delays or disallow certain search engine bots entirely. If you noindex a page, you need to make sure it is not included in your sitemap, as listing it there sends the search engines mixed signals.
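A minimal sketch of a robots.txt file is shown below; the paths, bot name and domain are placeholders, and not every search engine respects the crawl-delay directive:

```
# Applies to all crawlers
User-agent: *
Disallow: /thank-you/
Disallow: /wp-admin/
Crawl-delay: 10

# Block one particular bot from the whole site
User-agent: BadBot
Disallow: /

# Tell crawlers where to find your sitemap
Sitemap: https://www.example.com/sitemap.xml
```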
You can use your sitemap, noindex tags, nofollow tags and robots.txt file to accurately control crawling and deindex pages that are not required. Deindexing pages can help your search rankings by creating order and structure, allowing you to serve the content that matters most to your audience. The more relevant your website is to your audience, the more likely the search engines are to push your website.