Google updated its Search Central documentation on verifying Googlebot and added documentation about user-triggered bot visits. That information was missing from the previous Googlebot documentation, which caused confusion for many years and led some publishers to block the IP ranges of legitimate visits.
Newly updated bot documentation
Google added new documentation categorizing the three different types of bots publishers should expect.
These are the three categories of Google Bots:
- Googlebot – search crawler
- Special-case crawlers
- User-triggered fetchers (GoogleUserContent)
The last of these, GoogleUserContent, has long confused publishers because Google didn’t have documentation about it.
This is what Google says about GoogleUserContent:
“User-triggered fetchers
Tools and product features where the end user triggers a fetch.
For example, Google Site Verifier acts on a user’s request.
Since the fetch was requested by a user, these fetchers ignore the robots.txt rules.”
The documentation states that the reverse DNS mask for these fetchers shows a hostname in the googleusercontent.com domain.
In the past, some in the SEO community told me that bot activity from IP addresses associated with GoogleUserContent.com was triggered when a user viewed a website through a translation feature that used to be included in the search results. That feature no longer exists in Google’s SERPs.
I don’t know if that’s true or not. At the time, it was enough to know that it was a visit from Google, triggered by users.
Google’s new documentation explains that bot activity from IP addresses associated with GoogleUserContent.com can be triggered by the Google Site Verifier tool.
But Google doesn’t say what else could trigger a bot from GoogleUserContent.com’s IP addresses.
The other change in the documentation is that the verification instructions now list googleusercontent.com among the domain names that a verified IP address may resolve to.
This is the new text:
“Make sure the domain name is either googlebot.com, google.com, or googleusercontent.com.”
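As a rough illustration of the check that sentence describes, the sketch below reverse-resolves an IP address and tests whether the resulting hostname falls under one of the three domains named in the quote. The function names are hypothetical, and the forward lookup that confirms the hostname maps back to the same IP address is an assumption of this sketch, not something quoted above.

```python
import ipaddress
import socket

# The three domains named in the quoted documentation line.
ALLOWED_DOMAINS = ("googlebot.com", "google.com", "googleusercontent.com")


def hostname_is_google(hostname: str) -> bool:
    """True if hostname equals an allowed domain or is a subdomain of one."""
    host = hostname.rstrip(".").lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def verify_google_ip(ip: str) -> bool:
    """Reverse-resolve ip, check its domain, then forward-confirm the hostname.

    The forward confirmation is an assumption of this sketch: it guards
    against a reverse DNS record that merely claims a Google hostname.
    """
    try:
        target = ipaddress.ip_address(ip)
        hostname, _aliases, _addrs = socket.gethostbyaddr(ip)
    except (ValueError, OSError):
        return False  # malformed address or no reverse DNS record
    if not hostname_is_google(hostname):
        return False
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return False  # hostname does not resolve forward
    forward = {ipaddress.ip_address(info[4][0]) for info in infos}
    return target in forward
```

Matching on a leading dot matters here: a plain substring or suffix test would also accept a lookalike hostname such as evilgooglebot.com.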
The following text is also new and expands on what the old page said:
“Alternatively, you can identify Googlebot by IP address by matching the crawler’s IP address to Google’s lists of crawler and fetcher IP ranges:
- Special crawlers like AdsBot”
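Matching an IP address against a list of ranges could look something like the sketch below. The ranges are placeholders taken from the reserved documentation address blocks, not Google's real ranges, and the function names are hypothetical. The sketch makes no assumption about the format of Google's published lists and simply starts from CIDR strings.

```python
import ipaddress

# Placeholder CIDR ranges from reserved documentation blocks.
# These are NOT Google's ranges; substitute the published lists.
SAMPLE_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def parse_ranges(cidrs):
    """Turn CIDR strings into network objects once, up front."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def ip_in_ranges(ip, networks):
    """True if ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    # Only compare like with like: IPv4 against IPv4, IPv6 against IPv6.
    return any(addr.version == net.version and addr in net for net in networks)


networks = parse_ranges(SAMPLE_RANGES)
print(ip_in_ranges("192.0.2.15", networks))    # inside the first sample range
print(ip_in_ranges("198.51.100.1", networks))  # outside both sample ranges
```

A range check like this avoids DNS lookups on every request, though it only stays accurate if the range lists are refreshed when the published lists change.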
Documentation for identifying Google bots
The new documentation finally has something about bots using IP addresses associated with GoogleUserContent.
Search engine marketers were confused by these IP addresses and assumed these bots were spam.
A 2020 Google Search Console Help discussion shows how confused people were about GoogleUserContent-related activity.
Many in that discussion correctly concluded that the activity wasn’t Googlebot, but then incorrectly concluded that it was a fake bot pretending to be Google.
A user posted:
“The behavior I’m seeing from these addresses is very close (if not identical) to legitimate Googlebot behavior and affects several of our websites.
…If not, then this seems to indicate that there is widespread malicious bot activity by someone trying very hard to look like Google on our websites, which is worrying.”
After several replies, the person who started the discussion concluded that the GoogleUserContent activity was spam:
“…The Googlebots in question mimic the official user-agents, but as it stands, the evidence seems to indicate that they are fake.
I’ll block them for now.”
Now we know that bot activity from IP addresses associated with GoogleUserContent does not come from spam or hacking bots.
They really are from Google. Publishers currently blocking IP addresses associated with GoogleUserContent should probably unblock them.
The current list of user-triggered fetcher IP addresses is available here.
Read the updated documentation from Google:
Verifying Googlebot and other Google crawlers
Featured image from Shutterstock/Asier Romero