How to define the scope of a crawl in Site Audit using different modes?

Learn how to set the "borders" of your project in Site Audit using the modes selector.

Written by Nick Churick
Updated over a week ago

To create a new project in Site Audit, you need to set the crawl scope first. The scope defines the boundaries within which you want us to crawl your site.

You can use one of the four modes:

Exact URL - to crawl only the specified page

Path - to crawl a specific folder or subfolder of a website

Domain - to crawl a specified domain, excluding its subdomains

Subdomains - to crawl a domain together with all its subdomains, for the most complete crawl

You can also instruct our tool to crawl only URLs that use a given protocol, HTTP or HTTPS.

The default http+https option includes both protocols.

While the protocol selector is pretty self-explanatory, modes might need some specific examples for better understanding.

Exact URL

This mode will give results for the specified URL only.

For example, if you apply this mode to https://www.example.net/cloud/, you will get results for that URL and nothing else.

Path (domain.tld/path/*)

The Path mode includes every URL that begins with the exact path that you enter into the “Domain or path” field.

It allows you to limit the crawl to a given folder or subfolder of your website.

Example:

Entering ahrefs.com/blog in this mode will include all website URLs that begin with ahrefs.com/blog, such as:

  • ahrefs.com/blog/who-links-to-my-site

  • ahrefs.com/blog/who-links-to-my-site/?utm_source=facebook

  • and any other URL in the /blog/ folder

However, www.ahrefs.com/blog will be excluded because it lies outside the defined path.
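
The Path rule can be pictured as a simple prefix check. The snippet below is only an illustrative sketch, not how Site Audit is actually implemented: the function name in_path_scope is hypothetical, and it assumes that "begins with" means a plain string prefix on host plus path, with the protocol and query string ignored.

```python
from urllib.parse import urlsplit


def in_path_scope(url: str, scope: str) -> bool:
    """Hypothetical helper: True if the URL's host + path begins with `scope`.

    Assumption: the scope is written without a protocol, e.g. "ahrefs.com/blog".
    """
    # urlsplit only recognises the host when a scheme or a leading "//" is present.
    parts = urlsplit(url if "://" in url else "//" + url)
    host_and_path = (parts.hostname or "") + parts.path
    return host_and_path.startswith(scope)


scope = "ahrefs.com/blog"
print(in_path_scope("ahrefs.com/blog/who-links-to-my-site", scope))
print(in_path_scope("ahrefs.com/blog/who-links-to-my-site/?utm_source=facebook", scope))
print(in_path_scope("www.ahrefs.com/blog", scope))
```

The first two checks match the examples above and the third does not, because www.ahrefs.com is a different host and therefore a different prefix.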

Domain (domain.tld/*) or (subdomain.domain.tld/*) 

The Domain mode takes the domain (or subdomain) from the path that you enter into the “Domain or path” field and includes all its URLs in the crawl.

It allows you to exclude URLs on subdomains (or sub-subdomains).

Example 1:

Entering ahrefs.com in this mode will include all URLs that begin with ahrefs.com, such as:

  • ahrefs.com itself

  • ahrefs.com/blog

  • ahrefs.com/blog/who-links-to-my-site/

However, www.ahrefs.com/blog will be excluded because it lies outside the defined domain.

Example 2:

Entering help.ahrefs.com in this mode will include all website URLs that begin with help.ahrefs.com, such as:

  • help.ahrefs.com itself

  • help.ahrefs.com/site-audit

  • help.ahrefs.com/site-audit?utm_source=facebook

However, ahrefs.com or ahrefs.com/blog/who-links-to-my-site/ will be excluded because they do not belong to the defined subdomain.
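
As a rough mental model, the Domain mode behaves like an exact match on the host name. The sketch below is an assumption-based illustration rather than the tool's real logic. The names host_of and in_domain_scope are made up for this example.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Hypothetical helper: extract the lowercase host from a URL, with or without a protocol."""
    return urlsplit(url if "://" in url else "//" + url).hostname or ""


def in_domain_scope(url: str, domain: str) -> bool:
    """Hypothetical helper: True only if the URL's host is exactly `domain`."""
    return host_of(url) == domain.lower()


# Example 1: scope is ahrefs.com
print(in_domain_scope("ahrefs.com/blog/who-links-to-my-site/", "ahrefs.com"))
print(in_domain_scope("www.ahrefs.com/blog", "ahrefs.com"))

# Example 2: scope is help.ahrefs.com
print(in_domain_scope("help.ahrefs.com/site-audit?utm_source=facebook", "help.ahrefs.com"))
print(in_domain_scope("ahrefs.com/blog/who-links-to-my-site/", "help.ahrefs.com"))
```

In each example the first URL is in scope and the second is not, because its host differs from the one entered.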

Subdomains (*.domain.tld/*) or (*.subdomain.domain.tld/*)

The Subdomains mode provides the most complete crawl of your site.

The Subdomains mode uses the domain (or subdomain) from the path that you enter into the “Domain or path” field and includes all URLs on it and on all its subdomains (or sub-subdomains) in the crawl.

Example 1:

Entering ahrefs.com in this mode will include all URLs on ahrefs.com and on all its subdomains, such as www.ahrefs.com and help.ahrefs.com.

Example 2:

Entering help.ahrefs.com in this mode will include all URLs on help.ahrefs.com and on all its sub-subdomains, such as:

  • help.ahrefs.com

  • help.ahrefs.com/site-audit

  • staging.help.ahrefs.com/site-audit
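
The Subdomains rule can be thought of as "the host is the entered domain, or ends with a dot followed by that domain". The following is a minimal sketch under that assumption. It is not the actual Site Audit implementation, and in_subdomains_scope is a hypothetical name.

```python
from urllib.parse import urlsplit


def in_subdomains_scope(url: str, domain: str) -> bool:
    """Hypothetical helper: True if the URL's host is `domain` or any subdomain of it."""
    host = urlsplit(url if "://" in url else "//" + url).hostname or ""
    domain = domain.lower()
    # The leading dot keeps look-alike hosts such as notahrefs.com out of scope.
    return host == domain or host.endswith("." + domain)


print(in_subdomains_scope("help.ahrefs.com/site-audit", "help.ahrefs.com"))
print(in_subdomains_scope("staging.help.ahrefs.com/site-audit", "help.ahrefs.com"))
print(in_subdomains_scope("ahrefs.com/blog", "help.ahrefs.com"))
```

The first two URLs fall inside the help.ahrefs.com scope. The third does not, because ahrefs.com sits above the entered subdomain.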

If you still have any questions about the scope of the crawl in the Site Audit tool, just use our support chat. We’ll be glad to help you out ;)
