Below is the current registry of known AI crawler operators, their verification methods, and publication status. The "Verifiable here" column gives the level of certainty for each entry: "yes" means the operator's ranges are embedded and can be verified here, "in prod" means the operator publishes an endpoint that is verified in production, and "no" means the operator publishes no verification method.

| Crawler (operator) | Purpose | UA token | Publishes IP ranges | Verifiable here |
|---|---|---|---|---|
| GPTBot (OpenAI) | AI training crawler | GPTBot | published | yes |
| ChatGPT-User (OpenAI) | On-demand fetch (user prompt) | ChatGPT-User | published | in prod |
| OAI-SearchBot (OpenAI) | Search index | OAI-SearchBot | published | yes |
| PerplexityBot (Perplexity) | Search index | PerplexityBot | published | yes |
| Claude-User (Anthropic) | On-demand fetch (web search/fetch) | Claude-User | published | yes |
| ClaudeBot (Anthropic) | AI training crawler | ClaudeBot | none | no |
| Claude-SearchBot (Anthropic) | Search index (new 2026) | Claude-SearchBot | none | no |
| Googlebot (Google) | Search index | Googlebot | published | in prod |
| Google-Extended (Google) | AI training token (Gemini) | Google-Extended | published | in prod |
| Bingbot (Microsoft) | Search index (powers Copilot) | bingbot | published | yes |
| Applebot (Apple) | Search index / Apple Intelligence | Applebot | published | yes |
| Bytespider (ByteDance) | AI training crawler | Bytespider | none | no |
| Meta-ExternalAgent (Meta) | AI training crawler | Meta-ExternalAgent | none | no |
| CCBot (Common Crawl) | Open web corpus | CCBot | DNS only | in prod |
| Amazonbot (Amazon) | Search / Alexa / AI | Amazonbot | DNS only | in prod |

A crawler's identity claim is provable only if its operator publishes its own IP ranges or reverse DNS records. "In prod" means the operator's endpoint is published and pulled live in production. Operators that publish neither IP ranges nor DNS records cannot be verified by anyone.
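
The range-based check described above could be sketched roughly as follows. This is a minimal illustration, not the registry's actual implementation: the names `EMBEDDED_RANGES`, `UNVERIFIABLE_TOKENS`, `find_token`, and `classify` are hypothetical, and the CIDR blocks are RFC 5737 documentation placeholders rather than any operator's real ranges. Operators verified by DNS or by a live endpoint are outside the scope of this sketch.

```python
import ipaddress
from typing import Optional

# UA tokens whose ranges are embedded ("Verifiable here" = yes in the table).
# ASSUMPTION: these CIDR blocks are RFC 5737 placeholders, NOT real operator ranges.
EMBEDDED_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "PerplexityBot": ["198.51.100.0/24"],
}

# UA tokens from the table with no published ranges or DNS records.
UNVERIFIABLE_TOKENS = {
    "ClaudeBot",
    "Claude-SearchBot",
    "Bytespider",
    "Meta-ExternalAgent",
}


def find_token(user_agent: str) -> Optional[str]:
    """Return the first known UA token found in the header (case-insensitive)."""
    ua = user_agent.lower()
    for token in list(EMBEDDED_RANGES) + sorted(UNVERIFIABLE_TOKENS):
        if token.lower() in ua:
            return token
    return None


def classify(user_agent: str, remote_ip: str) -> str:
    """Classify a request as 'verified', 'failed', 'unverifiable', or 'unknown'."""
    token = find_token(user_agent)
    if token is None:
        return "unknown"        # the header claims no crawler in this sketch
    if token in UNVERIFIABLE_TOKENS:
        return "unverifiable"   # nothing published to check the claim against
    ip = ipaddress.ip_address(remote_ip)
    networks = (ipaddress.ip_network(cidr) for cidr in EMBEDDED_RANGES[token])
    if any(ip in network for network in networks):
        return "verified"       # claimed token and source IP agree
    return "failed"             # token claimed, but IP is outside the ranges


if __name__ == "__main__":
    print(classify("Mozilla/5.0 (compatible; GPTBot/1.1)", "192.0.2.10"))
    print(classify("Mozilla/5.0 (compatible; GPTBot/1.1)", "203.0.113.5"))
    print(classify("ClaudeBot/1.0", "192.0.2.10"))
```

The sketch keeps "failed" (the operator publishes ranges and the IP falls outside them) separate from "unverifiable" (there is nothing to check against), since the two cases call for different handling: the first is evidence of a false claim, the second is only an absence of evidence.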