In simplest terms, a web crawler (or spider) is a computer program that browses the web methodically, one page at a time.
It starts by indexing an initial list of URLs (the so-called “seeds”).
As it finds hyperlinks on those seed pages, it adds them to the list of URLs to index next. The crawler then visits those new pages and repeats the whole process.
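The loop described above is essentially a breadth-first traversal: take a URL from a queue, fetch it, extract its links, and enqueue any link not seen before. Here is a minimal sketch in Python; the `crawl` and `fetch` names and the tiny in-memory “web” are illustrative assumptions, not how any particular crawler is implemented. A real crawler would fetch pages over HTTP instead of from a dictionary.

```python
from collections import deque
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href values of <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seeds, fetch, limit=100):
    """Breadth-first crawl: visit the seeds, then every new link found."""
    queue = deque(seeds)
    visited = set()
    while queue and len(visited) < limit:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        html = fetch(url)          # in a real crawler: an HTTP GET
        if html is None:
            continue               # page missing or not fetchable
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in visited:
                queue.append(link)
    return visited

# Tiny in-memory "web" standing in for real HTTP responses (hypothetical data).
PAGES = {
    "http://example.com/":  '<a href="http://example.com/a">A</a>',
    "http://example.com/a": '<a href="http://example.com/">home</a>',
}

crawled = crawl(["http://example.com/"], PAGES.get)
```

The `visited` set is what keeps the crawl from looping forever when pages link back to each other, and the `limit` parameter caps how much of the web a single run will explore.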
Our AhrefsBot crawls the web while strictly following the rules set out in each site's robots.txt. More details here: https://ahrefs.com/robot
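A crawler that respects robots.txt checks each URL against the site's rules before fetching it. A small sketch of that check, using Python's standard `urllib.robotparser` module; the robots.txt content below is a made-up example, not Ahrefs' actual policy or implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a crawler would download this
# from http://<site>/robots.txt before fetching any other page.
robots_txt = """\
User-agent: AhrefsBot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A polite crawler asks before every fetch.
allowed = parser.can_fetch("AhrefsBot", "http://example.com/public/page")
blocked = parser.can_fetch("AhrefsBot", "http://example.com/private/page")
```

In the crawl loop sketched earlier, this check would sit right before the fetch step: URLs that `can_fetch` rejects are skipped instead of downloaded.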