Registry

Below is the current registry of known AI crawler operators, with each operator's verification method and publication status. Each row falls into one of three tiers of certainty: operators whose ranges are embedded and can be verified here ("yes"), operators with published endpoints that are verified in production ("in prod"), and operators with no published method ("no").

| Operator | Purpose | UA token | Publishes IP ranges | Verifiable here |
| --- | --- | --- | --- | --- |
| GPTBot (OpenAI) | AI training crawler | GPTBot | published | yes |
| ChatGPT-User (OpenAI) | On-demand fetch (user prompt) | ChatGPT-User | published | in prod |
| OAI-SearchBot (OpenAI) | Search index | OAI-SearchBot | published | yes |
| PerplexityBot (Perplexity) | Search index | PerplexityBot | published | yes |
| Claude-User (Anthropic) | On-demand fetch (web search/fetch) | Claude-User | published | yes |
| ClaudeBot (Anthropic) | AI training crawler | ClaudeBot | none | no |
| Claude-SearchBot (Anthropic) | Search index (new 2026) | Claude-SearchBot | none | no |
| Googlebot (Google) | Search index | Googlebot | published | in prod |
| Google-Extended (Google) | AI training token (Gemini) | Google-Extended | published | in prod |
| Bingbot (Microsoft) | Search index (powers Copilot) | bingbot | published | yes |
| Applebot (Apple) | Search index / Apple Intelligence | Applebot | published | yes |
| Bytespider (ByteDance) | AI training crawler | Bytespider | none | no |
| Meta-ExternalAgent (Meta) | AI training crawler | Meta-ExternalAgent | none | no |
| CCBot (Common Crawl) | Open web corpus | CCBot | DNS only | in prod |
| Amazonbot (Amazon) | Search / Alexa / AI | Amazonbot | DNS only | in prod |
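
For readers who want to work with the registry programmatically, the rows above could be modeled roughly as follows. This is only a sketch: the names `Crawler`, `Tier`, `match_user_agent`, and `by_certainty` are hypothetical and not part of any published API, and only four rows are copied in for brevity. Matching a UA token identifies a claim, not a verified identity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Tier(Enum):
    # Values mirror the "Verifiable here" column of the table.
    EMBEDDED = "yes"        # ranges embedded, verifiable here
    PRODUCTION = "in prod"  # endpoint published, pulled live in production
    UNVERIFIABLE = "no"     # no published method


@dataclass(frozen=True)
class Crawler:
    name: str
    operator: str
    ua_token: str
    publishes: str   # "published", "DNS only", or "none"
    verifiable: str  # "yes", "in prod", or "no"

    @property
    def tier(self) -> Tier:
        return Tier(self.verifiable)


# A few rows copied from the table; a full registry would list all of them.
REGISTRY = [
    Crawler("GPTBot", "OpenAI", "GPTBot", "published", "yes"),
    Crawler("ChatGPT-User", "OpenAI", "ChatGPT-User", "published", "in prod"),
    Crawler("ClaudeBot", "Anthropic", "ClaudeBot", "none", "no"),
    Crawler("CCBot", "Common Crawl", "CCBot", "DNS only", "in prod"),
]


def match_user_agent(user_agent: str) -> Optional[Crawler]:
    """Return the entry whose UA token appears in the header, if any.

    A match only means the request *claims* to be that crawler.
    """
    ua = user_agent.lower()
    for crawler in REGISTRY:
        if crawler.ua_token.lower() in ua:
            return crawler
    return None


def by_certainty(crawlers: List[Crawler]) -> List[Crawler]:
    """Order entries by tier: embedded first, unverifiable last."""
    order = list(Tier)
    return sorted(crawlers, key=lambda c: order.index(c.tier))
```

Calling `match_user_agent` on a request's User-Agent header returns the matching entry (or `None`), and its `tier` property tells you which kind of verification, if any, is available for that claim.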

A crawler's claimed identity is provable only if its operator publishes its own IP ranges or reverse DNS records. "In prod" means the operator publishes an endpoint that is pulled live in production rather than embedded here. Operators that publish neither IP ranges nor DNS records cannot be verified by anyone.
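
The two provable cases can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the registry's actual implementation: the function names are hypothetical, the CIDR blocks and hostname suffixes must come from whatever the operator really publishes, and the DNS check uses the common forward-confirmed reverse DNS approach (reverse lookup, suffix check, forward lookup back to the same IP), which is assumed here rather than taken from the text.

```python
import ipaddress
import socket
from typing import Callable, Optional, Sequence


def ip_in_ranges(ip: str, cidrs: Sequence[str]) -> bool:
    """True if ip falls inside any of the operator's published CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    # Membership is always False when the IP versions differ.
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


def verify_by_dns(
    ip: str,
    allowed_suffixes: Sequence[str],
    reverse: Optional[Callable[[str], str]] = None,
    forward: Optional[Callable[[str], Sequence[str]]] = None,
) -> bool:
    """Forward-confirmed reverse DNS check.

    The resolvers are injectable so the logic can be exercised offline.
    The default forward resolver (gethostbyname_ex) is IPv4 only.
    """
    reverse = reverse or (lambda a: socket.gethostbyaddr(a)[0])
    forward = forward or (lambda h: socket.gethostbyname_ex(h)[2])
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not any(host == s or host.endswith("." + s) for s in allowed_suffixes):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False


def verify_claim(
    ip: str,
    cidrs: Optional[Sequence[str]] = None,
    dns_suffixes: Optional[Sequence[str]] = None,
    **resolvers,
) -> Optional[bool]:
    """True or False when a published method exists; None when none does."""
    if not cidrs and not dns_suffixes:
        return None  # nothing published: the claim cannot be verified
    if cidrs and ip_in_ranges(ip, cidrs):
        return True
    if dns_suffixes and verify_by_dns(ip, dns_suffixes, **resolvers):
        return True
    return False
```

Returning `None` rather than `False` for operators with no published method keeps "unverifiable" distinct from "verified as not matching", which mirrors the "no" tier in the table.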