Kevin Muldoon

Giving You the Tools to Make Money Online


How to Fix Crawl Errors in Webmaster Tools

August 2, 2013 By Kevin Muldoon 21 Comments



Since relaunching my website Martial Arts Videos two months ago, I have been faced with a difficult problem. The website gets little to no traffic from Google. Some days I get only one or two visits from Google, and traffic from Google rarely exceeds ten visits a day.

A quick search on Google shows that most of my recent articles are not indexed. Some of my older articles are listed, however they seem to be listed on the second page of Google’s search results. Annoyingly, the corresponding update from the Martial Arts Videos Facebook page is listed first.

Martial Arts Videos Search Engine Traffic

I have never used any black-hat techniques to promote the website (or any website for that matter) and Google confirmed that the website did not have any penalties. This led me to believe that my low search engine rankings were being caused by too many low quality links. In particular, Martial Arts Videos was linked in the sidebar on a related website I own (MMAClips) that had hundreds of thousands of pages indexed in Google. The result was that I had over 600,000 incoming links from a low quality website.

I discussed this issue in-depth in my article “How To Stop Bad Incoming Links Hurting Your Search Engine Rankings”. As you may recall, I used the Google Disavow Tool to request that Google remove all links from MMAClips.

The result has been a little surprising. I had assumed that disavowing a full domain would remove all links at once. It doesn’t. Instead, I have seen the number of incoming links from MMAClips slowly decrease every week. Currently, 418,133 incoming links remain.

Disavowed MMAClips

With hundreds of thousands of links still pointing at MartialArtsVideos.com, there is a chance that my ranking is still being affected by the number of incoming links. There is also a chance that it is being caused by something else.

How to Fix Crawl Errors

I initially developed Martial Arts Videos in the first half of 2012. The site used an automated script to publish YouTube videos about selected martial arts topics. When the website was relaunched in June 2013, I reduced the total number of posts from 10,300 to only 18.

As you would expect, I saw a large increase in 404 errors (also known as not found errors) as a result of this. When I checked Google Webmaster Tools last night, I had a total of 17,493 not found errors.

Martial Arts Videos Crawl Errors

Google allows you to download a list of your 404 error pages in CSV format or via Google Docs.

Download Crawl Errors CSV

I was surprised to see that only around 2,200 URLs were listed in the CSV file, despite there being around 17,500 listed in Google Webmaster Tools. Perhaps even stranger was that the list included both 404 error codes and 418 error codes. The 418 error code apparently stands for “I’m a teapot”.

It is not clear whether this shorter list of error URLs means that the list is incomplete, or that the total figure quoted in Webmaster Tools is incorrect.

View Your Crawl Error List
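If you want to see how the downloaded list breaks down by error code, a few lines of Python can tally it. This is only a rough sketch: the column names "URL" and "Response Code" and the sample rows are assumptions made for illustration, so check them against the header row of your own export.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the downloaded file. The real export's
# column names may differ, so "URL" and "Response Code" are assumptions.
SAMPLE_CSV = """URL,Response Code
http://www.example.com/old-post-1/,404
http://www.example.com/old-post-2/,404
http://www.example.com/odd-url/,418
http://www.example.com/broken/,500
"""


def count_response_codes(csv_file, code_column="Response Code"):
    """Return a Counter mapping each response code to the number of rows with it."""
    reader = csv.DictReader(csv_file)
    return Counter(row[code_column].strip() for row in reader)


counts = count_response_codes(io.StringIO(SAMPLE_CSV))
for code, total in sorted(counts.items()):
    print(code, total)
```

To run it against a real download, pass an open file instead of the sample, for example `open("crawl-errors.csv", newline="")`, where the file name is a placeholder for whatever you saved the export as.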

If you search the internet on how you should handle 404 error pages, you will see two conflicting pieces of advice. One group advises that you 301 redirect all of your 404 error pages to your home page. For those of you who are not familiar with HTTP status codes, a 301 code means that a page or website has been permanently moved somewhere else. It is commonly used when your permalink structure is changed, however I have seen many websites redirect their 404 errors directly to their home page using a 301 redirect.

The other group advises the complete opposite. They state that your 404 not found error pages should remain as they are. Alternatively, you can use the status code 410 to advise that a page is gone forever. Google claims that 404 errors do not hurt a website. Google also states that it handles 404 and 410 errors in exactly the same way, therefore it seems pointless to set up lots of 410 errors for pages that are gone if you can simply let your website generate 404 errors automatically. After all, in the eyes of Google, they are one and the same.

If you’re getting rid of that content entirely and don’t have anything on your site that would fill the same user need, then the old URL should return a 404 or 410. Currently Google treats 410s (Gone) the same as 404s (Not found).

So what should you do: Should you redirect your 404 errors or just leave them alone and let the search engines do their thing? Most respected SEO authorities, and Google themselves, advise not redirecting 404 pages to your home page. Apparently, this can actually hurt your rankings as a not found error message could be passed onto your home page; however, with most large travel websites using this tactic, I am not sure this is actually true in practice.

If your pages still exist in another location, you should redirect the content to the correct URL. You can do this easily by adding a 301 redirection request to your .htaccess file. All you need to do is enter “Redirect 301”, followed by your old address and new address. For example:

Redirect 301 /article-about-water.html http://www.yourwebsite.com/water.html

A 404 error page seems to be the best option if the page has been deleted permanently. That is, after all, what the code was created for. I can see why some people would want to redirect their error pages to their home page, however if you have an informative 404 page that directs users to good content, it should not be a major issue. WordPress handles 404 errors natively, though it is prudent to check that Google is getting the right response from your error pages. A template that states 404 does not necessarily mean that it is sending the right status code back to Google.

You can check this easily using the “Fetch as Google” tool within the crawl section of Webmaster Tools. That will show you exactly what Google sees when it visits your URL and tell you the status code it receives.

Fetch as Google
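For a quick second opinion from your own machine, the status code can also be checked with a short script. The sketch below uses only Python's standard library; `fetch_status` and `describe` are names made up for this example. Bear in mind it shows what your server returns to this script, which is not guaranteed to be identical to what Googlebot receives.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so a 301 is reported as a 301."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url, timeout=10):
    """Return the HTTP status code the server sends back for url."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for unfollowed 3xx responses and for 4xx/5xx responses.
        return error.code


def describe(status):
    """Give a short label for the status codes discussed in this article."""
    labels = {
        200: "found",
        301: "moved permanently",
        404: "not found",
        410: "gone",
    }
    return labels.get(status, "other")


# Example call (needs network access; the URL is a placeholder):
# print(describe(fetch_status("http://www.yourwebsite.com/a-deleted-post/")))
```

If a deleted page comes back as "found", the 404 template is probably being served with a 200 status code, which is the situation described above.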

I have shown that, according to Google and respected SEO companies, you should simply leave your deleted pages as 404s and let Google work out everything themselves. I realise this is what you are supposed to do, however I am not 100% sure whether or not it is definitely the best thing to do in practice.

Bizarre Behaviour From Google

From what I understand (and admittedly, I am not an expert on this subject), Google will check a 404 URL again and again to see if the error has been corrected (i.e. the page content has returned or the URL has been 301’d to the correct place). So I initially thought that a 410 status code would be the best solution for me as my content was not going to return. Yet Google advises that they handle 404 and 410 status codes in the same way.

I therefore assumed that after removing around 10,000 posts, Google would report an increase in 404 errors and then I would see them disappear over the following weeks. That is not what has actually occurred. Google has actually been reporting more and more not found errors every single day. The screenshot that I published earlier in this article was taken last night and shows that I had 17,493 not found errors. Less than 12 hours later, Google increased the number of not found errors by 64.

Martial Arts Videos Crawl Errors

This increase in not found errors is baffling. The sitemap for Martial Arts Videos lists 73 URLs (i.e. 73 unique pages of content). The URLs that are being listed as not found are not listed on the website anywhere, nor are they linked anywhere. This makes it all the more confusing that Google is increasing the number of not found errors on the site.

If you click on any of the URLs that are listed as not found, you will see how Google found the link. It shows the date the URL was first detected and the last crawled date.

Not Found Details

None of the URLs are currently linked in my active sitemap.

Not Linked from Sitemap
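Checking thousands of error URLs against a sitemap by hand is tedious, so here is a minimal sketch of how it could be scripted. It assumes a standard XML sitemap using the sitemaps.org namespace; the sample sitemap, the example.com URLs and the function names are all invented for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical two-page sitemap used only to demonstrate the check.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/judo-basics/</loc></url>
</urlset>
"""


def sitemap_urls(sitemap_xml):
    """Return the set of URLs listed in <loc> elements of the sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)}


def still_in_sitemap(error_urls, sitemap_xml):
    """Return the error URLs that are still listed in the sitemap."""
    listed = sitemap_urls(sitemap_xml)
    return [url for url in error_urls if url in listed]


errors = [
    "http://www.example.com/old-video-1/",
    "http://www.example.com/judo-basics/",
]
print(still_in_sitemap(errors, SAMPLE_SITEMAP))
```

An empty result would back up what the screenshot shows: none of the not found URLs are coming from the active sitemap.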

What I find strange is that Google is finding links from pages that were removed more than six weeks before. One of the links is from a sitemap that does not even exist.

Linked From URLs

As I write this, Martial Arts Videos has:

  • 10 Server Errors – These are reported as 500 codes, which refers to internal server errors.
  • 17,558 Not Found Errors – These are 404 error codes.
  • 1,834 Other Errors – All other errors had a 418 error code. As I mentioned before, this apparently means “I’m a teapot”!!

My aim is to remove all of these errors. Once they have been removed, I hope to see a jump in my search engine traffic.

On the information page for soft 404 errors, Google notes:

Returning a code other than 404 or 410 for a non-existent page (or redirecting users to another page, such as the homepage, instead of returning a 404) can be problematic. Firstly, it tells search engines that there’s a real page at that URL. As a result, that URL may be crawled and its content indexed. Because of the time Googlebot spends on non-existent pages, your unique URLs may not be discovered as quickly or visited as frequently and your site’s crawl coverage may be impacted (also, you probably don’t want your site to rank well for the search query [File not found]).

I have read a few comments from people on Google Groups who also believe the part of that note about the time Googlebot spends on non-existent pages. That is, when Google’s spiders spend too long trying to index non-existent pages, the crawl rate for the pages you do want indexed is affected.

Perhaps that is what is occurring with Martial Arts Videos. Could it be that Google is spending so much time trying to index pages that were removed from the website two months ago, that it is not indexing the pages that should be indexed? Who knows. What Google says and what Google does are not always the same.

Putting that aside, how do I remove my old URLs from Google’s index? One option appears to be the “Remove URLs” tool under the “Google Index” section of Webmaster Tools.

Remove URL Through Webmaster Tools

Unfortunately, you can only remove one URL at a time.

Enter Your URL

During the removal process, Google advises that for permanent removal, the content has to be blocked using the robots.txt file. This suggests that removing a URL from Google’s index does not guarantee that the page will no longer generate a 404 error.

Confirm URL Removal Request
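If you do go down the robots.txt route, it is worth confirming that a given URL is actually covered by your rules. The sketch below uses Python's standard `urllib.robotparser` module; the sample rules, the `/old-videos/` path and the `is_blocked` helper are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules used only for this demonstration.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /old-videos/
"""


def is_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt rules disallow user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


print(is_blocked(SAMPLE_ROBOTS, "http://www.example.com/old-videos/clip-1/"))
print(is_blocked(SAMPLE_ROBOTS, "http://www.example.com/judo-basics/"))
```

This only tells you how Python's parser reads the rules, so treat it as a sanity check rather than a guarantee of how Google will interpret them.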

Page removal is not instant. When you request a page to be removed from Google’s search results, the request will be placed in a pending status. I only made this request today, therefore I am unsure how long this process takes (I suspect it is at least a few days).

URL Pending Deletion

Removing 10,000 URLs from Google’s index in this manner is not practical. Further inspection led me back to Google’s “Do 404s hurt my site?” article. It confirms that removal requests are not necessary if a URL already returns a 404 error message:

Q: Can I use Google’s URL removal tool to make 404 errors disappear from my account faster?
A: No; the URL removal tool removes URLs from Google’s search results, not from your Webmaster Tools account. It’s designed for urgent removal requests only, and using it isn’t necessary when a URL already returns a 404, as such a URL will drop out of our search results naturally over time. See the bottom half of this blog post for more details on what the URL removal tool can and can’t do for you.

Many website owners state that the best thing is to just wait for Google to correct everything. Some people are reporting this takes three months, others are saying that errors are still there after nine to twelve months.

One of the best responses I have read was published on Stack Exchange. It said:

Webmaster Tools is notoriously slow at updating the links/errors page. In particular, even when a page is no longer linked to, Google’s bot keeps requesting the page and reporting that it cannot be found.

If any of the URLs follow a common pattern you can do a 301 redirect to the correct page, which should speed up Google’s removal of those errors. (Note: I wouldn’t recommend adding thousands of lines to htaccess because that can seriously impact performance.)

Aside from that there isn’t much you can do unfortunately besides wait it out. If there are definitely no links pointing to the non-existent pages then the Crawl Errors section will slowly shrink over time. It can take up to 3 months in my experience.
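To find out whether your error URLs really do follow a common pattern, you could test them against a regular expression before writing any redirect rule. The following is a sketch only: the `/videos/` and `/archive/` paths, the example.com URLs and the `split_by_pattern` helper are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: old posts lived at /videos/<slug>/ on the old site.
OLD_PATTERN = re.compile(r"^/videos/(?P<slug>[^/]+)/?$")


def split_by_pattern(error_urls, pattern=OLD_PATTERN):
    """Split URLs into those whose path matches the pattern and those that do not."""
    matched, unmatched = [], []
    for url in error_urls:
        path = urlsplit(url).path
        (matched if pattern.match(path) else unmatched).append(url)
    return matched, unmatched


urls = [
    "http://www.example.com/videos/judo-throw-1/",
    "http://www.example.com/videos/karate-kata-2",
    "http://www.example.com/about/",
]
matched, unmatched = split_by_pattern(urls)
print(len(matched), "URLs match the pattern;", len(unmatched), "do not")

# If most URLs match, one pattern-based rule in .htaccess could cover them all
# instead of thousands of individual lines, for example (placeholder paths):
# RedirectMatch 301 ^/videos/([^/]+)/?$ http://www.yourwebsite.com/archive/$1/
```

The point of the single `RedirectMatch` rule is the performance concern raised in the quote: one pattern is checked per request rather than thousands of separate redirect lines.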

An Overview of Fixing Crawl Errors

In theory, fixing crawl errors is easy. You simply need to 301 redirect pages that have moved to their current location and let deleted pages go to a 404 error page.
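As a small illustration of that rule of thumb, the sketch below sorts a list of old paths into “redirect” and “leave as 404”. The `plan_redirects` helper and the sample paths are made up for this example; the output lines follow the same “Redirect 301” format used earlier in this article.

```python
def plan_redirects(old_paths, moved):
    """Sort old paths into .htaccess redirect lines and paths to leave as 404s.

    moved maps an old path to the full URL where that content now lives.
    """
    redirect_lines = []
    leave_as_404 = []
    for path in old_paths:
        if path in moved:
            redirect_lines.append(f"Redirect 301 {path} {moved[path]}")
        else:
            leave_as_404.append(path)
    return redirect_lines, leave_as_404


old_paths = ["/article-about-water.html", "/deleted-video-123/"]
moved = {"/article-about-water.html": "http://www.yourwebsite.com/water.html"}

lines, gone = plan_redirects(old_paths, moved)
print("\n".join(lines))
print("Leave as 404:", gone)
```

Only content that still exists somewhere ends up with a redirect line; everything else is left alone to return a 404.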

In practice, that does not seem to be happening, at least not for Martial Arts Videos. Webmaster Tools is reporting more errors for the website every day. This is a bizarre occurrence when you consider all of these articles were removed close to two months ago.

It seems I only have two options:

  • 301 all 404 error pages to the home page.
  • Leave everything as it is and let Google resolve everything in their own time.

My dilemma is this: Can I afford to wait for Google to resolve this issue? I am paying writers hundreds of dollars every month to write for Martial Arts Videos and the website will not be profitable until traffic from Google gets to the level it should be at. Many website owners have stated that it can take a year or longer for Google to remove all errors. I really cannot afford to wait a year for this to be addressed.

The problem is that I do not even know if this is the cause of my low search engine traffic. It could be caused by something else. If you recall, Google states that 404s do not hurt my site, so perhaps my traffic will return once the low quality incoming links pointing to the site are gone.

I would love to tell you all what is the best thing to do in this situation. The truth is, I am far from an expert on this issue. Therefore, for the time being, I will listen to the advice of people a lot smarter than me and leave my website as it is.

Hopefully, Google will not take months to resolve the issue. If you know of a better way to handle this problem, please leave a comment and let me know what needs to be done.

Thanks for reading 🙂

Kevin


About Kevin Muldoon

My name is Kevin and this is my blog :) I am an experienced blogger who has been working online actively since 2000. Through this blog I talk about internet marketing, technology and travelling. You can get updates to this blog by subscribing via RSS or Email. Alternatively, you can follow me on Google+, Facebook or Twitter.

Comments

  1. Kris says

    August 2, 2013 at

    Hi Kevin,
    I was also trying to fix those error a while ago. For 87studios I have around two thousand 404 errors and year ago, when I was relaunching site it was around six thousand. So choosing option number two is the easier and in my opinion better way to get it done. It’s not worth to waste time on it. But what is important in this case, a good 404 page design that will attract even a visitor who was looking for something else.

    Kris

    • Kevin Muldoon says

      August 2, 2013 at

      Hi Kris,

      How were these 404 errors created? That is, did you change your permalink structure or delete articles?

      Kevin

      • Kris says

        August 3, 2013 at

        They were created like in your case, by deleting thousands of posts. Before relaunch, 87studios was a wordpress news feed for me, with more than 6500 posts.
        Right after the change I was trying to redirect all error links with 301, but later decided to leave it and create better design for not found page.

        • Kevin Muldoon says

          August 3, 2013 at

          It sounds like our situation is almost identical. It’s a little concerning that you are still seeing these errors after all this time.

          Perhaps the most relevant question is: Is your search engine traffic affected by this?

          Kevin

          • Kris says

            August 3, 2013 at

            It’s even more strange, that I sometimes see more errors and sometimes less. It’s changing few times a week.
            And no, it’s not affecting on search engine traffic at all. I have biggest impact from search results. You can see charts in my income report.

            • Kevin Muldoon says

              August 3, 2013 at

              That is good to hear.

              The information I found on the internet was a little contradictory. Some people stated that it could affect search engine traffic. Others said it would not. It is difficult to know who to believe so the only thing I could do is wait it out and see what happened with me.

  2. Kris says

    August 3, 2013 at

    As it is not affecting traffic, it’s also safe to wait for changes. But the most curious thing is that the number of errors is changing and Google’s report says that my sitemap has those 404 links inside (which is not true). I guess it also depends on the crawling frequency of sites that have backlinks to those missing posts and pages.

  3. venki says

    October 10, 2013 at

    Thank’s for sharing,but that 404 Errors not deleted

    • Kevin Muldoon says

      October 10, 2013 at

      Google takes a while to update that in Webmaster Tools.

  4. [email protected] says

    February 19, 2014 at

    What happened. Has it reduced over time?

    • Kevin Muldoon says

      February 19, 2014 at

      It has reduced a little but the problem still remains.

  5. t20iplindia says

    September 21, 2014 at

    This is what i was searching

    But i am getting 404 error with some randome generated url with my links

    how to solve that

    • Kevin Muldoon says

      September 22, 2014 at

      Please drop by Rise Forums and explain the problem in more detail and we can take a look at it 🙂

  6. Vivek Singh says

    August 24, 2015 at

    Recently My Website 404 error is increasing day by day in Google Webmaster not only this I’m also losing my visitor….Do You have Any Suggestion Regarding this

    • Kevin Muldoon says

      August 24, 2015 at

      Can you drop by Rise Forums and explain the situation in more detail. We will do our best to help there 🙂

  7. diwali quotes says

    October 19, 2015 at

    Thanks a lot for the blog post.Really thank you! Will read on…

  8. Happy diwali wishes for friends says

    October 22, 2015 at

    njoyed every bit of your post.Really thank you! Much obliged.

  9. hemany chaudhary says

    April 29, 2016 at

    when i look on my webmaster tool there is a lots of crawl error url what to do now

    • Kevin Muldoon says

      April 29, 2016 at

      Drop by Rise Forums and tell us more about the issue and we will help resolve the issue.

  10. A says

    December 9, 2016 at

    Hello Kevin,

    Google Crawl is on buget – that means that google alocated some time to spend on your website crawling.

    You can do this things:

    1. Go to Google Search Console/Crawl/Crawl Errors and Mark all links as fixed – if you dont have a redirect on those links google will erase them.

    2. Use a xml sitemap for your new website so that you can tell Google about them.

    3. If you tell Google that you’ve fixed those links – they will stop reporting to you about them – but they will tell you about others.

  11. Shailesh Chaudhary says

    May 21, 2017 at

    Hi Kevin,
    Awesome article… It is Very helpful for me.. Thanks Dude


Copyright © 2018 Kevin Muldoon