Duplicate pages are pages that share the same title, the same H1, or similar content. A few tags make it possible to see how two such pages are related:
Hreflang - This means the pages are a translation of each other.
Pagination - This means one page continues from another.
Canonical - This means both pages are the same, but one is the official version.
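To make the three tags concrete, here is a minimal sketch of how they might be read from a page's HTML using only Python's standard library. The class name `RelationTagParser`, the helper `extract_relations`, and the sample URLs are all hypothetical and chosen for illustration. The sketch assumes the tags appear as `<link>` elements (`rel="canonical"`, `rel="alternate"` with `hreflang`, and `rel="prev"` / `rel="next"`). It is not a description of how Ahrefs parses pages.

```python
from html.parser import HTMLParser


class RelationTagParser(HTMLParser):
    """Collects canonical, hreflang, and pagination <link> tags from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None   # href of rel="canonical", if any
        self.hreflang = {}      # language code -> alternate URL
        self.pagination = {}    # "prev" / "next" -> URL

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if not href:
            return
        if "canonical" in rels:
            self.canonical = href
        if "alternate" in rels and attr.get("hreflang"):
            self.hreflang[attr["hreflang"]] = href
        for direction in ("prev", "next"):
            if direction in rels:
                self.pagination[direction] = href


def extract_relations(html):
    """Parse an HTML string and return the populated parser."""
    parser = RelationTagParser()
    parser.feed(html)
    return parser


# Hypothetical sample markup, for illustration only.
sample = """
<head>
  <link rel="canonical" href="https://example.com/shoes">
  <link rel="alternate" hreflang="de" href="https://example.com/de/shoes">
  <link rel="next" href="https://example.com/shoes?page=2">
</head>
"""
tags = extract_relations(sample)
print(tags.canonical)
print(tags.hreflang)
print(tags.pagination)
```

A page that yields no canonical, no hreflang entries, and no pagination entries has nothing to show how it relates to other pages.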
If two duplicate pages have none of these tags to show the relation between them, they are considered bad duplicates. The usual way to turn them into good duplicates is to add a canonical that shows the relation between them.
Ahrefs is quite advanced when it comes to grouping good duplicates:
- Two pages are part of the same group if they have the same canonical (a page with no canonical is treated as self-canonical).
- Two pages are part of the same group if they have the same hreflang.
- Two pages are part of the same group if they have the same pagination.
For example, if page "A" has the same canonical as page "B", and page "C" has a different canonical but the same hreflang as page "A", then pages "A", "B" and "C" are all part of the same group. In other words, any one shared tag is enough to link pages into a group.
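The grouping rules above behave like connected components: any shared tag value links two pages, and links chain together. The following is a rough sketch of that idea using a union-find structure. It is not Ahrefs' actual implementation. The function `group_pages`, the field names `"canonical"`, `"hreflang"` and `"pagination"`, and the idea of representing "same hreflang" or "same pagination" as a shared identifier are all assumptions made for illustration.

```python
from collections import defaultdict


def group_pages(pages):
    """Group pages that share a canonical, hreflang, or pagination value.

    `pages` maps URL -> dict with optional "canonical", "hreflang", and
    "pagination" keys (hypothetical field names chosen for this sketch).
    """
    parent = {url: url for url in pages}

    def find(url):
        while parent[url] != url:
            parent[url] = parent[parent[url]]  # path halving
            url = parent[url]
        return url

    def union(a, b):
        parent[find(a)] = find(b)

    first_seen = {}  # (tag, value) -> first page seen with that value
    for url, tags in pages.items():
        keys = [
            # A missing canonical is treated as self-canonical.
            ("canonical", tags.get("canonical") or url),
            ("hreflang", tags.get("hreflang")),
            ("pagination", tags.get("pagination")),
        ]
        for key in keys:
            if key[1] is None:
                continue
            if key in first_seen:
                union(url, first_seen[key])
            else:
                first_seen[key] = url

    groups = defaultdict(set)
    for url in pages:
        groups[find(url)].add(url)
    return list(groups.values())


# The A/B/C example from the text, plus an untagged page D.
pages = {
    "A": {"canonical": "X", "hreflang": "set-1"},
    "B": {"canonical": "X"},
    "C": {"canonical": "Y", "hreflang": "set-1"},
    "D": {},  # no tags: self-canonical, stays in its own group
}
groups = group_pages(pages)
print(sorted(sorted(g) for g in groups))  # [['A', 'B', 'C'], ['D']]
```

Here "C" joins "A" and "B" only through the hreflang it shares with "A", which matches the example in the text.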
Once your site audit is complete, the Overview shows a horizontal bar graph for "HTML tags & content" (as pictured in the image below).
You may wonder what "good" and "bad" duplicates are, and ask yourself, "aren't all duplicates bad?"
Good Duplicates - These are pages that have the same content, but whose relationship is made clear with canonical, pagination, or hreflang tags.
Bad Duplicates - These are duplicate pages that either have no canonical, hreflang, or pagination tags to show that they are the same content, or do have these tags but pointing to different canonical versions, which still leaves them as duplicates.
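One way to read these two definitions: a set of duplicate pages is "good" when every page in it falls into the same group, and "bad" otherwise. The sketch below encodes that reading. It is an interpretation of the definitions above, not a confirmed description of Ahrefs' logic. The function `classify_duplicates`, the `group_of` mapping, and the sample URLs are hypothetical.

```python
def classify_duplicates(duplicate_urls, group_of):
    """Return "good" if every duplicate page falls in one group, else "bad".

    `group_of` maps URL -> group id; pages missing from it are treated as
    their own single-page group (hypothetical convention for this sketch).
    """
    group_ids = {group_of.get(url, url) for url in duplicate_urls}
    return "good" if len(group_ids) == 1 else "bad"


# Hypothetical group assignments, e.g. produced by a prior grouping step.
group_of = {"/shoes": 1, "/shoes?ref=ad": 1, "/de/shoes": 1, "/sneakers": 2}

print(classify_duplicates(["/shoes", "/shoes?ref=ad"], group_of))   # good
print(classify_duplicates(["/shoes", "/sneakers"], group_of))       # bad
print(classify_duplicates(["/shoes", "/untagged-copy"], group_of))  # bad
```

The second call covers the case where tags exist but point to different canonical versions, and the third covers a duplicate with no tags at all.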