<aside>
Your content might be excellent, but if AI systems can't access, crawl, and understand it, you won't get cited. This section covers the technical foundations that enable AI visibility.
</aside>
<aside>
Most sites have at least one issue here - and fixing these often delivers quick wins.
</aside>
Your robots.txt file controls which bots can access your site. Many sites are unknowingly blocking AI crawlers - or haven't explicitly allowed them.
Each AI platform uses different crawlers:
| Crawler | Platform | Purpose |
|---|---|---|
| GPTBot | OpenAI | Training data and ChatGPT browsing |
| ChatGPT-User | OpenAI | Real-time browsing when users ask ChatGPT to search |
| ClaudeBot | Anthropic | Claude's web access |
| PerplexityBot | Perplexity | Real-time search and citation |
| Google-Extended | Google | Controls use of content for AI training (separate from Googlebot) |
| Googlebot | Google | Standard search indexing (also feeds AI Overviews) |
| Bingbot | Microsoft | Bing search and Copilot |
<aside>
Important distinction: blocking Google-Extended only opts your content out of Google's AI training - it does not affect your standard Google Search indexing. Blocking Googlebot removes you from Google Search entirely, including AI Overviews.
</aside>
Checking your robots.txt
<aside>
Visit: yourdomain.com/robots.txt
</aside>
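If you'd rather check programmatically, here's a minimal sketch using Python's standard `urllib.robotparser` to audit a robots.txt file against the crawlers in the table above. The sample file below is hypothetical - swap in the contents of your own robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The AI crawlers from the table above.
AI_CRAWLERS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "Googlebot", "Bingbot",
]

def audit_robots(robots_txt: str) -> dict:
    """Return {crawler: allowed_to_fetch_root} for each AI crawler."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # can_fetch() applies the most specific matching User-agent group,
    # falling back to the "*" group if no named group matches.
    return {bot: parser.can_fetch(bot, "/") for bot in AI_CRAWLERS}

# Hypothetical robots.txt that blocks GPTBot but allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

for bot, allowed in audit_robots(sample).items():
    print(f"{bot}: {'allowed' if allowed else 'BLOCKED'}")
```

Run it against your live file by fetching yourdomain.com/robots.txt and passing the response body to `audit_robots`.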
Common problems
Problem 1: Blanket block
```
User-agent: *
Disallow: /
```
This blocks everything from everyone - including AI crawlers.
Problem 2: Blocking AI crawlers specifically
```
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
```
Some sites added these blocks during AI hype/fear cycles. If you want AI visibility, remove them.
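After removing the blocks, a minimal robots.txt sketch that explicitly allows AI crawlers looks like this - extend the same pattern for each crawler in the table above, and tighten the paths if you want to restrict access to specific sections:

```
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Allow: /
```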
Problem 3: No explicit AI crawler rules