To create a new project in Site Audit, you need to set the crawl scope first. The scope defines which parts of your site you want us to crawl.

You can use one of the three modes:

  • Path - to crawl any subfolder on a website
  • Domain - to crawl a specified domain, excluding its subdomains
  • Subdomains - for the most complete crawl

You can also instruct our tool to crawl only URLs that use a given protocol, HTTP or HTTPS.

The default http+https option includes both protocols.
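To make the protocol options concrete, here is a minimal Python sketch of how such a filter could behave. It is only an illustration of the rule described above, not how Site Audit is actually implemented; the names ALLOWED_SCHEMES and protocol_in_scope, and the option strings "http" and "https", are assumptions made up for this example.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from the protocol option to the URL schemes it allows.
ALLOWED_SCHEMES = {
    "http": {"http"},
    "https": {"https"},
    "http+https": {"http", "https"},  # the default: both protocols
}


def protocol_in_scope(url: str, option: str = "http+https") -> bool:
    """Return True if the URL's scheme is allowed by the chosen option."""
    return urlsplit(url).scheme in ALLOWED_SCHEMES[option]


print(protocol_in_scope("http://ahrefs.com/blog"))            # default option
print(protocol_in_scope("http://ahrefs.com/blog", "https"))   # https only
```

With the default option both HTTP and HTTPS URLs pass the check, while the "https" option rejects plain HTTP URLs.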

While the protocol selector is pretty self-explanatory, the modes are easier to understand with specific examples.

Path (domain.tld/path/*)

The Path mode includes every URL that begins with the exact path that you enter into the “Domain or path” field.

It allows you to limit the crawl to a given folder or subfolder of your website.

Example:

This configuration will include all website URLs that begin with ahrefs.com/blog, such as:

  • ahrefs.com/blog itself
  • ahrefs.com/blog/who-links-to-my-site
  • ahrefs.com/blog/who-links-to-my-site/?utm_source=facebook
  • and any other URL in the /blog/ folder

However, www.ahrefs.com/blog will be excluded because it lies outside the defined path.
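The Path rule can be sketched in a few lines of Python. This is a rough illustration that follows the "begins with" wording literally, not the actual Site Audit logic; the helper names strip_scheme and in_path_scope are hypothetical. How a sibling folder such as ahrefs.com/blogging would be treated is not covered by the description above, so the sketch makes no promise about that case.

```python
def strip_scheme(url: str) -> str:
    """Drop a leading http:// or https:// so URLs compare as host + path."""
    for prefix in ("https://", "http://"):
        if url.lower().startswith(prefix):
            return url[len(prefix):]
    return url


def in_path_scope(url: str, scope: str) -> bool:
    """Assumed rule: a URL is in scope if it begins with the entered path."""
    return strip_scheme(url).startswith(strip_scheme(scope))


scope = "ahrefs.com/blog"
for candidate in (
    "ahrefs.com/blog",
    "ahrefs.com/blog/who-links-to-my-site/?utm_source=facebook",
    "www.ahrefs.com/blog",
):
    print(candidate, in_path_scope(candidate, scope))
```

The www.ahrefs.com/blog URL fails the check because the string does not start with ahrefs.com/blog, which matches the exclusion described above.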

Domain (domain.tld/*) or (subdomain.domain.tld/*) 

The Domain mode takes the domain (or subdomain) from the path that you enter into the “Domain or path” field and includes all its URLs in the crawl.

It allows you to exclude URLs on subdomains (or sub-subdomains).

Example 1:

This configuration will include all URLs that begin with ahrefs.com, such as:

  • ahrefs.com itself
  • ahrefs.com/blog
  • ahrefs.com/blog/who-links-to-my-site/

However, www.ahrefs.com/blog will be excluded because it lies outside the defined domain.

Example 2:

This configuration will include all website URLs that begin with help.ahrefs.com, such as:

  • help.ahrefs.com itself
  • help.ahrefs.com/site-audit
  • help.ahrefs.com/site-audit?utm_source=facebook

However, ahrefs.com and ahrefs.com/blog/who-links-to-my-site/ will be excluded because they do not belong to the defined subdomain.
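Both Domain examples come down to an exact hostname comparison. The Python sketch below illustrates that idea under the assumption that only the host part of the entered value matters; host_of and in_domain_scope are hypothetical names, and this is not the actual Site Audit code.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Extract the lowercase hostname; accepts input with or without a scheme."""
    if not url.lower().startswith(("http://", "https://")):
        url = "http://" + url  # urlsplit only recognises the host after "//"
    return urlsplit(url).hostname or ""


def in_domain_scope(url: str, scope: str) -> bool:
    """Assumed rule: in scope only if the hostname matches the scope's exactly."""
    return host_of(url) == host_of(scope)


print(in_domain_scope("ahrefs.com/blog", "ahrefs.com"))
print(in_domain_scope("www.ahrefs.com/blog", "ahrefs.com"))
print(in_domain_scope("help.ahrefs.com/site-audit", "help.ahrefs.com"))
```

Because the comparison is exact, www.ahrefs.com is treated as a different host from ahrefs.com, and ahrefs.com is treated as a different host from help.ahrefs.com.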

Subdomains (*.domain.tld/*) or (*.subdomain.domain.tld/*)

The Subdomains mode provides the most complete crawl of your site.

The Subdomains mode uses the domain (or subdomain) from the path that you enter into the “Domain or path” field and includes all URLs on it and on all its subdomains (or sub-subdomains) in the crawl.

Example 1:

This configuration will include all URLs on ahrefs.com and on all its subdomains, such as www.ahrefs.com and help.ahrefs.com.

Example 2:

This configuration will include all URLs on help.ahrefs.com and on all its sub-subdomains like:

  • help.ahrefs.com
  • help.ahrefs.com/site-audit
  • staging.help.ahrefs.com/site-audit
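The Subdomains rule can be pictured as a hostname suffix check. The following Python sketch is an assumption about how such matching might work, written only to illustrate the examples above; host_of and in_subdomains_scope are hypothetical names, not part of any Ahrefs API.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Extract the lowercase hostname; accepts input with or without a scheme."""
    if not url.lower().startswith(("http://", "https://")):
        url = "http://" + url  # urlsplit only recognises the host after "//"
    return urlsplit(url).hostname or ""


def in_subdomains_scope(url: str, scope: str) -> bool:
    """Assumed rule: the host is the scope host itself or ends with '.' + it."""
    host, base = host_of(url), host_of(scope)
    return host == base or host.endswith("." + base)


scope = "help.ahrefs.com"
for candidate in (
    "help.ahrefs.com/site-audit",
    "staging.help.ahrefs.com/site-audit",
    "ahrefs.com/blog",
):
    print(candidate, in_subdomains_scope(candidate, scope))
```

The leading dot in the suffix check keeps an unrelated host that merely ends with the same letters from being mistaken for a subdomain.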

If you still have any questions about the crawl scope in the Site Audit tool, just use our support chat. We’ll be glad to help you out ;)
