Our system is designed to identify spammy subdomains early on, so they never make it into our database.
For example, if a website has a huge number of subdomains, we download only the top of the list and stop as soon as the quality drops below a certain threshold.
So basically, we don’t need to clean spammy networks out of our database, because we never index them in the first place.
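
To make the early cutoff concrete, here is a minimal Python sketch of the idea. It assumes the subdomains arrive already ranked best-first, each with a numeric quality score. The function name, the example scores, and the 0.5 threshold are all hypothetical, chosen only for illustration, and are not the actual crawler logic.

```python
from typing import Iterable, List, Tuple


def select_subdomains(
    ranked: Iterable[Tuple[str, float]],
    quality_threshold: float,
) -> List[str]:
    """Return subdomains from a best-first list, stopping at the first
    one whose (hypothetical) quality score falls below the threshold."""
    accepted: List[str] = []
    for name, score in ranked:
        if score < quality_threshold:
            # Quality has dropped too far: stop reading the list entirely,
            # so nothing below this point is ever downloaded or indexed.
            break
        accepted.append(name)
    return accepted


# Illustrative data only: scores and threshold are made up.
ranked = [
    ("blog.example.com", 0.9),
    ("shop.example.com", 0.7),
    ("x1.example.com", 0.2),
    ("x2.example.com", 0.1),
]
print(select_subdomains(ranked, 0.5))
# prints ['blog.example.com', 'shop.example.com']
```

The important design choice is the `break`: the loop does not filter the whole list, it stops reading it, which is why the low-quality tail never has to be cleaned up later.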