What Are ‘Soft’ 404 Errors and How to Fix Them?
You are probably familiar with the 404 error, which is displayed when you try to open a webpage that cannot be found. But showing this error message without the matching HTTP status code can harm your website’s organic traffic.
Marketers sometimes ignore technical errors like this and expect the developers to handle them on their own. But issues of this kind deserve close attention, and the marketing team should work closely with the developers on them. Search engines now put heavy weight on user experience, and even a small UX-related error on your website can hurt its position in the SERPs.
Let’s go through everything important about soft 404 errors: what they are, how detrimental they are to your website’s organic results, and how they can be resolved.
What is a soft 404 Error?
A regular 404 error occurs when a website’s server returns the standard HTTP 404 response code to indicate that the webpage the user is trying to visit does not exist. This tells both the browser and the search engines that the page doesn’t exist.
What you need to understand is that the content of the page (the ‘page not found’ message) is completely separate from the HTTP response code returned by the server. When a page has no content or shows a “Page not found” message, it doesn’t necessarily mean that the server is returning a 404.
In Google’s words: “This is like a giraffe wearing a name tag that says ‘dog’. Just because the name tag says it’s a dog, doesn’t mean it’s actually a dog. Similarly, just because a page says 404, doesn’t mean it’s returning a 404 status code.”
A soft 404 occurs when the page you are trying to open does not exist or has been removed and shows a “page not found” message, but the server fails to return an HTTP 404 status code. It can also occur when a non-existent page redirects the user to an unrelated, irrelevant page.
You should always remember that a webpage’s content is completely separate from the HTTP response returned by the server. This distinction is important because it determines how search engines treat the page. To decide the position of your website in the SERPs, search engine bots crawl and index your website’s pages. When a 404 status code is returned, bots do not waste time crawling and indexing the page, but in the case of a soft 404, the page may still be crawled and indexed.
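The distinction above can be sketched in a few lines of Python. This is only an illustrative heuristic, not how any search engine actually decides: the function name `classify_response` and the phrases in `NOT_FOUND_PATTERNS` are assumptions made for this example, and you would adapt them to the wording your own error pages use.

```python
import re

# Hypothetical phrases that suggest an error page; adjust them for your own site.
NOT_FOUND_PATTERNS = re.compile(
    r"page not found|not found|no longer exists", re.IGNORECASE
)


def classify_response(status_code: int, body: str) -> str:
    """Label a response as 'hard 404', 'soft 404', or 'ok'."""
    # A real 404 (or 410) is decided by the status code alone.
    if status_code in (404, 410):
        return "hard 404"
    # Empty content or an error message suggests the page is really missing.
    looks_missing = not body.strip() or bool(NOT_FOUND_PATTERNS.search(body))
    # The page text says "missing" while the status code says "fine": a soft 404.
    if status_code == 200 and looks_missing:
        return "soft 404"
    return "ok"
```

For instance, `classify_response(200, "<h1>Page not found</h1>")` is labelled a soft 404, while the same body served with a 404 status is a hard 404.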
The problem caused by soft 404 errors
If your server returns a success status code such as 200, instead of a 404 (or 410), for a non-existent page, your website’s search rankings can suffer. A website with a large number of soft 404 errors can lose a significant amount of organic traffic, because rather than reporting a 404 to search engines, it tells their crawlers that the page is real and can be accessed by users. Eventually, the page gets crawled and indexed without any content, and crawl budget is wasted on it.
Crawl budget is defined as the number of pages crawled by Google bots on a website within a particular timeframe.
Google bots do not spend endless time on a single website, which is why a ‘budget’ is assigned to each crawl. You therefore want them to crawl the main and most important pages on your website before they move on to another website.
If the website has a high number of soft 404 errors, the crawl budget is wasted on crawling and indexing non-existent pages. Because the Google bots are spending time on those pages, there is a real possibility that your important and unique URLs will not be crawled and indexed as frequently as they should be. Eventually, your website’s SEO suffers because the crawlers waste their time on removed and non-existent pages.
How to Resolve
Google Search Console allows you to export up to 1,000 URLs. If you have fewer than 1,000 URLs with errors, you can easily export them all and start investigating. Once you download the file, you can start identifying the issues on the affected webpages. Google gives a brief description of each problem, which can lead you to its origin.
In most cases, you will find the website serving a 200 status code for pages that display a “page not found” message. So the first thing you should do is take a selection of the pages flagged as soft 404s and run them through an HTTP status code checker such as httpstatus.io. It will show you the status code these pages are returning.
For example, a page was displaying a “page not found” message when users tried to access it. When it was checked in an HTTP status code checker, it returned an HTTP 200 response. This is a prime example of a soft 404 error: the HTTP 200 code tells search engines that the webpage is ready to be crawled, but the page has no content.
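If you prefer to script this check rather than use an online checker, the following is a minimal sketch using only the Python standard library. The names `NoRedirect`, `fetch_status` and `find_soft_404_candidates` are made up for this example, and it assumes the URLs you pass in are pages you already know should not exist. It sends `HEAD` requests, which some servers answer differently from `GET`, so treat the output as a starting point for manual review.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the first status code is visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # None makes urllib raise HTTPError carrying the 3xx code


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the status code the server sends for `url`, without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib reports 3xx (with NoRedirect), 4xx and 5xx as exceptions.
        return error.code


def find_soft_404_candidates(urls, fetch=fetch_status):
    """Return the known-missing URLs that still answer with 200."""
    return [url for url in urls if fetch(url) == 200]
```

Calling `find_soft_404_candidates(list_of_removed_urls)` returns the URLs worth inspecting first. The `fetch` parameter exists so the logic can be tried with a stand-in function instead of real network requests.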
You might also encounter inappropriate 301/302 redirects when you diagnose the root cause of soft 404 errors.
A 301 redirect is used when you want to send users permanently from a deleted page to a new, relevant page. A 302 redirect works the same way but signals that the move is only temporary.
Sometimes webmasters choose to redirect users from deleted pages to the homepage instead of serving a 404 error. That is a mistake, because it confuses both users and search engines. Serving a 404 error is not a bad thing. A deleted page should redirect only to its permanent replacement. If there is no permanent replacement, serve a custom 404 page that returns a 404 status code and offers the visitor alternative options.
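As a rough sketch of that decision, the lookup below maps removed paths to a status code and target. The `REDIRECTS` table, its paths and the `resolve` function are all hypothetical and only illustrate the choice between 301, 302 and 404.

```python
from typing import Optional, Tuple

# Hypothetical redirect table: old path -> (new path, is the move permanent?)
REDIRECTS = {
    "/old-pricing": ("/pricing", True),
    "/summer-sale": ("/sale-paused", False),
}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target) for a removed path."""
    if path in REDIRECTS:
        target, permanent = REDIRECTS[path]
        # 301 for a permanent replacement, 302 for a temporary one.
        return (301 if permanent else 302, target)
    # No relevant replacement exists, so report the page as missing.
    return (404, None)
```

Only paths with a relevant replacement get a redirect; everything else falls through to a 404.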
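To make the last point concrete, here is a small sketch of a server that shows a helpful custom page for missing URLs while still sending a real 404 status code. It uses Python’s built-in `http.server` purely for illustration; the `EXISTING_PAGES` table, `NOT_FOUND_BODY` markup, `Handler` class and `make_server` helper are assumptions for this example, and a production site would configure the same behaviour in its own web server or framework.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical set of pages that exist on the site.
EXISTING_PAGES = {"/": b"<h1>Home</h1>"}

# Custom error page offering the visitor somewhere else to go.
NOT_FOUND_BODY = (
    b"<h1>Page not found</h1>"
    b'<p>Try the <a href="/">homepage</a> or the site search.</p>'
)


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = EXISTING_PAGES.get(self.path)
        # The status code, not the page text, tells crawlers the page is missing.
        status = 200 if body is not None else 404
        payload = body if body is not None else NOT_FOUND_BODY
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the operating system pick a free port."""
    return HTTPServer(("127.0.0.1", port), Handler)
```

Running `make_server(8000).serve_forever()` and requesting a missing path shows the friendly page in the browser while a status checker reports 404.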