<aside>

Your content might be excellent, but if AI systems can't access, crawl, and understand it, you won't get cited. This section covers the technical foundations that enable AI visibility.

</aside>

<aside>

Most sites have at least one issue here - and fixing these often delivers quick wins.

</aside>


Part 1: AI Crawler Access (robots.txt)

Your robots.txt file controls which bots can access your site. Many sites are unknowingly blocking AI crawlers - or haven't explicitly allowed them.

Understanding AI crawlers

Each AI platform uses different crawlers:

| Crawler | Platform | Purpose |
| --- | --- | --- |
| GPTBot | OpenAI | Training data and ChatGPT browsing |
| ChatGPT-User | OpenAI | Real-time browsing when users ask ChatGPT to search |
| ClaudeBot | Anthropic | Claude's web access |
| PerplexityBot | Perplexity | Real-time search and citation |
| Google-Extended | Google | AI training (separate from Googlebot) |
| Googlebot | Google | Standard search indexing (also feeds AI Overviews) |
| Bingbot | Microsoft | Bing search and Copilot |
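You can check how a given robots.txt treats each of these user agents programmatically with Python's standard-library `urllib.robotparser`. This is a sketch against a hypothetical robots.txt that blocks only GPTBot; substitute your own file's contents:

```python
from urllib import robotparser

# Hypothetical robots.txt: blocks GPTBot, allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow:
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Check each crawler from the table against a sample URL.
for bot in ("GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot",
            "Googlebot", "Bingbot"):
    allowed = rp.can_fetch(bot, "https://example.com/some-article")
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Running this against your real robots.txt (fetched from yourdomain.com/robots.txt) quickly reveals which AI crawlers you are blocking, intentionally or not.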

<aside>

Important distinction: Googlebot and Google-Extended are separate controls. Blocking Google-Extended only opts you out of Google's AI training; blocking Googlebot removes you from standard search results and from the AI Overviews they feed.

</aside>

Check your current robots.txt

<aside>

Visit: yourdomain.com/robots.txt

</aside>

Common problems

Problem 1: Blanket block

```
User-agent: *
Disallow: /
```

This blocks everything from everyone - including AI crawlers.
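A minimal fix, sketched here on the assumption that you want all well-behaved crawlers in, is an empty Disallow rule, which blocks nothing:

```
User-agent: *
Disallow:
```

An empty `Disallow:` value is the standard way to allow everything; removing the robots.txt file entirely has the same effect.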

Problem 2: Blocking AI crawlers specifically

```
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
```

Some sites added these blocks during earlier waves of AI hype and fear. If you want AI visibility, remove them.
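If you would rather be explicit than simply delete the blocks, one option (using the crawler names from the table above) is to flip those same entries to allows:

```
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /
```

`Allow: /` is recognized by the major crawlers, though a crawler with no matching rule defaults to allowed anyway; the explicit entries mainly document your intent.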

Problem 3: No explicit AI crawler rules