What triggers this issue?
This issue reports canonicalized URLs whose canonical links point to URLs that return one of the 4xx (Client Error) HTTP status codes.
<link rel="canonical" href="https://ahrefs.com/help/en/"/>
In this example, https://ahrefs.com/help/en/ is declared as canonical but returns one of the 4xx (Client Error) HTTP status codes. The most common 4xx error is 404 (Not Found).
What is a canonical link?
In case you're not familiar with them, canonical links are used to solve duplicate content issues. If you have several pages with the same or similar content, you need to pick the one that you want to rank.
By pointing canonical links to that page from its copies, you explicitly tell search engines that this is the page they should index (and hopefully rank) instead of the others.
A common use case for canonical links is product variants in eCommerce shops.
Here's a quick example of what one can look like:
<link rel="canonical" href="http://example.com/">
If you want to find out more about them, feel free to check Google's guide.
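To see which URL a page declares as canonical, you can read it straight out of the page's HTML. The snippet below is a minimal Python sketch using only the standard library; the names `CanonicalLinkParser` and `find_canonicals` are made up for illustration and are not part of any Ahrefs tool.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also end up here.
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel is case-insensitive and may hold several space-separated tokens.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return the canonical URLs declared in an HTML document, in order."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonicals
```

For instance, `find_canonicals('<link rel="canonical" href="http://example.com/">')` returns a list containing `http://example.com/`. A page should normally declare only one canonical, so a longer list is worth a closer look.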
What triggers this issue?
This issue indicates that pages on your website specify canonical URLs that return one of the 4xx HTTP status codes, which means the canonical page isn't accessible.
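If you want to spot-check a canonical URL yourself, the condition boils down to "does this URL answer with a status between 400 and 499?". Below is a minimal Python sketch of that check, assuming a plain HEAD request is enough; `fetch_status`, `is_client_error` and `canonical_returns_4xx` are illustrative names, and this is not how Site Audit itself crawls.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url without downloading the body.

    Caveats: urllib follows redirects, so a 3xx is reported as the status
    of the final URL. Some servers also reject HEAD requests (for example
    with 405), in which case a GET request may be needed instead.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is what we are after.
        return error.code


def is_client_error(status):
    """True for any status in the 4xx (Client Error) range."""
    return 400 <= status <= 499


def canonical_returns_4xx(canonical_url, get_status=fetch_status):
    """True if the canonical URL answers with a 4xx status code."""
    return is_client_error(get_status(canonical_url))
```

The `get_status` parameter exists so the check can be tried without network access, for example `canonical_returns_4xx(url, get_status=lambda u: 404)`.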
Why is it important?
If the page is not accessible to search engines, they will be unable to index it and it won't show up in search results.
There are various types of 4XX codes, and each one points to a different underlying problem.
How to fix it?
Some of the issues can be easily solved. Some are trickier, and qualified assistance is highly advised here.
But here's a brief overview of the 4XX HTTP issues you are likely to deal with.
400 - Bad request
This error means the server failed to understand the request it received from the browser or crawler, usually because the request was malformed.
It is often caused by a syntax error in the URL. Check the URL in rel=canonical for characters that are not allowed, such as a stray percent sign (%).
Here’s a list of unsafe URL characters.
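A quick way to scan a canonical URL for such characters is a couple of regular expressions. The Python sketch below is only an illustration: `find_url_problems` is a made-up helper, and the character set it checks is a small subset of what RFC 3986 disallows, not a complete validator. A clean result also doesn't guarantee the server will accept the URL.

```python
import re

# A "%" is only valid when it starts a percent-encoded byte: "%" + two hex digits.
BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")
# Illustrative subset of characters that must not appear unencoded in a URL.
UNSAFE_CHARS = re.compile(r'[\s"<>\\^`{|}]')


def find_url_problems(url):
    """Return a list of human-readable problems found in url (empty if none)."""
    problems = []
    if BAD_PERCENT.search(url):
        problems.append("stray percent sign")
    # Report each offending character once, in a stable order.
    for char in sorted(set(UNSAFE_CHARS.findall(url))):
        problems.append(f"unencoded character {char!r}")
    return problems
```

A URL such as `https://example.com/a%20b` passes this check because `%20` is a properly encoded space, while `https://example.com/100%` is flagged for its stray percent sign.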
401 - Unauthorized
This is a permission issue: the page is accessible only to logged-in users. Canonical URLs are the pages you want search engines to index and rank, so if this page is meant to stay behind a login, you should either remove the canonical link to it or point the canonical to a publicly available page that better suits this purpose.
403 - Forbidden
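This is also a permission issue. It means the server refuses to serve the content to the requester, for example because access is blocked for a specific user group.
You can either make the page publicly accessible through your server configuration, or remove or replace the canonical link.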
404 - Not found
Probably the most common 4XX HTTP status code out there. The page was either deleted or its URL has changed. Possible ways to fix it:
- Make sure the URL in rel=canonical is the correct URL of the canonical page. It might contain a typo.
- If the canonical page is gone, find or create a new one and set it as canonical. Avoid redirecting the old page to the new one, as that will result in a 'canonical points to 3xx' issue.
- If you don't have a substitute for the missing page, you can change the canonical tag to a self-referencing one.