# How to Improve Brand Visibility in AI Search Engines

Improving brand visibility in AI search engines breaks into three layers. Layer 1 (technical infrastructure) is a 1-hour job. Layer 2 (content design) is a multi-week job. Layer 3 (off-site source-graph) is a multi-quarter job that compounds. Three of the most common tactics (FAQPage schema, generic "increase content length", and JSON-LD as a universal AI signal) do not appear in the validated set.

## The framework — three things to optimise, in order

Generative AI engines do not work like Google's blue-link results. The clearest published model is Zhang Kai & Yao Jingang 2026 (arXiv:2604.25707v1), which separates the process into **citation selection** (does the engine retrieve your page?) and **citation absorption** (does the page's language actually shape the answer?). User-visible mention is a third stage downstream of both.

Tw93's 2026 ChatGPT instrumentation showed the engine retrieves ~100 pages per query, but only ~15% surface in the answer. Selection, absorption, and mention are three different metrics, and each calls for different optimisation tactics.

1. **Layer 1 — Technical infrastructure** (hours)
2. **Layer 2 — Content design** (weeks)
3. **Layer 3 — Off-site source-graph** (quarters, compounding)

## Layer 1 — Technical infrastructure (1 hour)

The selection floor. [Vercel's 2025 crawler study](https://vercel.com/blog/the-rise-of-the-ai-crawler) confirmed that GPTBot, ClaudeBot and PerplexityBot fetch raw HTML and do not execute JavaScript.

**1. Granular robots.txt.** Modern AI engines run multiple user agents. Add explicit Allow rules for OAI-SearchBot, Claude-SearchBot, PerplexityBot and ChatGPT-User. See [AI crawler readiness](https://www.eastbound.ai/ai-crawler-readiness/).
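A minimal sketch of what granular rules look like (the blocked path and default policy are placeholders; adapt to your own site):

```
# Explicitly allow the major AI search crawlers
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Everything else falls through to the default policy
User-agent: *
Disallow: /admin/
```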

**2. llms.txt and llms-full.txt.** A positive index: a curated map of the pages you most want engines to read. See [llms.txt vs robots.txt](https://www.eastbound.ai/llms-txt-vs-robots-txt/) and [Markdown alternates guide](https://www.eastbound.ai/markdown-alternates-guide/).
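An illustrative llms.txt following the published convention (an H1 site name, a blockquote summary, then sections of links; all names and URLs below are placeholders):

```
# Example Brand
> One-line summary of what the site covers and who it is for.

## Docs
- [Product overview](https://example.com/overview.md): what the product does
- [Pricing](https://example.com/pricing.md): plans and tiers

## Guides
- [Getting started](https://example.com/getting-started.md): first-run walkthrough
```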

**3. JavaScript-only rendering kills you.** Server-side render anything you want cited. Diagnostic: `curl` your URL; if you don't see body copy in the response, neither does the engine.
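The `curl` check can be automated. A minimal sketch, with a helper name and sample HTML that are ours for illustration: save the output of `curl -s -A "GPTBot" <url>` and test whether your key copy exists outside `<script>` blocks (on a JS-only page, the copy lives only inside the script bundle).

```python
import re

def body_copy_visible(raw_html: str, phrase: str) -> bool:
    """Check whether `phrase` appears in the raw HTML a non-JS crawler fetches,
    ignoring <script> blocks, where JS-rendered copy hides as string literals."""
    without_scripts = re.sub(r"<script\b.*?</script>", "", raw_html,
                             flags=re.DOTALL | re.IGNORECASE)
    return phrase.lower() in without_scripts.lower()

# A server-side-rendered page: the copy is in the HTML body
ssr_html = "<html><body><p>Our flagship product ships worldwide.</p></body></html>"

# A JS-only page: the copy exists only inside a script bundle
spa_html = ('<html><body><div id="root"></div>'
            '<script>render("Our flagship product ships worldwide.")</script>'
            '</body></html>')

print(body_copy_visible(ssr_html, "flagship product"))  # True
print(body_copy_visible(spa_html, "flagship product"))  # False
```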

**4. IndexNow + sitemap + canonical tags.** [IndexNow setup guide](https://www.eastbound.ai/indexnow-setup-guide/).
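Pinging IndexNow on publish can be a few lines. A sketch assuming the shared `api.indexnow.org` endpoint and the standard JSON body; the host, key, and URL below are placeholders:

```python
import json
import urllib.request

def indexnow_payload(host: str, key: str, urls: list[str]) -> dict:
    """Build the JSON body the IndexNow endpoint expects."""
    return {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": urls,
    }

def ping_indexnow(payload: dict) -> int:
    """POST the payload to the shared IndexNow endpoint; returns HTTP status."""
    req = urllib.request.Request(
        "https://api.indexnow.org/indexnow",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

payload = indexnow_payload("example.com", "your-indexnow-key",
                           ["https://example.com/new-post"])
# ping_indexnow(payload)  # uncomment with your real host and key file in place
```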

**5. Avoid the seven blocking mistakes** — [AI crawler blocking mistakes](https://www.eastbound.ai/ai-crawler-blocking-mistakes/).

> **Layer 1 deliverable.** Pages return 200 to all major AI bot user-agents, render full HTML body without JavaScript, ship llms.txt + Markdown alternates, ping IndexNow on every publish.

## Layer 2 — Content design (multi-week)

[Aggarwal et al., KDD 2024](https://arxiv.org/abs/2311.09735) ran a 10,000-query benchmark across nine tactics. Three produced statistically reliable lifts:

| Tactic | Citation lift |
|---|---|
| Adding authoritative citations | +115% |
| Adding direct quotes from credible sources | +43% |
| Adding relevant statistics with named sources | +33% |

Notably absent from the validated set: FAQ format, FAQPage schema, and generic "increase content length."

**1. Length sweet spot: 1,000–3,000 words** with 10+ headings. Low-cited pages average 170 words; high-cited pages average ~2,000.

**2. Specificity beats fluency.** Real numbers, dated comparisons, named entities, clear definitions. Specific pages are cited 50%+ more often than vague ones.

**3. Encyclopedia-style explainers earn roughly 3× more citations than news content** in published samples.

**4. What does NOT work:**

- **FAQPage schema.** SE Ranking 129K-domain study (Search Engine Journal, 2025): FAQ-schema pages averaged 3.6 ChatGPT citations vs 4.2 without. Williams-Cook 2026 controlled test: no extraction advantage over visible Q&A. Skip it.
- **JSON-LD as universal AI signal.** ChatGPT and Perplexity tokenise JSON-LD as plain text. Bing/Copilot is the exception (Microsoft's Fabrice Canel publicly confirmed at SMX Munich, March 2025). Keep schema for that bonus only.
- **Padding to "increase content length."** Lowers signal-to-noise.
- **User-Agent sniffing for bots.** Serving bots different content than users is cloaking, and it gets penalised.

## Layer 3 — Off-site source-graph (multi-quarter)

The compounding moat. The single highest-leverage finding:

> **Brands cited by third parties are referenced ~6.5× more often than brands cited only on their own domain.**

**Western priorities:**

- **Wikipedia** — 21% of DeepSeek responses across our luxury-handbag panel; highest-mentioned Western source. See [Wikipedia AI visibility](https://www.eastbound.ai/insights/wikipedia-ai-visibility/).
- **Reddit, Hacker News, GitHub** — community trust signals.
- **YouTube** — 20% of DeepSeek responses despite the geo-block. See [YouTube AI visibility](https://www.eastbound.ai/insights/youtube-ai-visibility/).
- **Vertical industry pubs** — Search Engine Land, SEJ, Ahrefs blog, Semrush blog cite each other constantly.

**Chinese priorities (different ecosystem entirely):**

- **百度百科** — Mainland encyclopedia anchor.
- **知乎** — long-form Q&A. See [Zhihu AI visibility](https://www.eastbound.ai/insights/zhihu-ai-visibility/).
- **小红书 (Xiaohongshu)** — lifestyle / B2C. See [Xiaohongshu AI visibility](https://www.eastbound.ai/insights/xiaohongshu-ai-visibility/).
- **SMZDM (什么值得买)** — commerce aggregator; high citation rate at aspirational tiers, low at ultra-luxury. See [SMZDM AI visibility](https://www.eastbound.ai/insights/smzdm-ai-visibility/).
- **WeChat 公众号 + 36氪 / 虎嗅** — Mainland tier-2 vertical media.

For the long version see [Traditional SEO Won't Get You Into Chinese AI Answers](https://www.eastbound.ai/blog/off-site-substrate.html).

## How to measure progress

Track three metrics, each measured separately:

1. **Selection rate** — % of relevant prompts pulling your domain into the engine's source pool.
2. **Absorption rate** — % of selections producing extractable content in the answer.
3. **Mention rate** — % of relevant prompts producing a user-visible brand mention.
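The three rates above can be computed directly from a prompt-panel log. A sketch under an assumed record format (one `(selected, absorbed, mentioned)` boolean triple per prompt; the helper and panel are illustrative, not Eastbound's actual tooling):

```python
def visibility_rates(panel: list[tuple[bool, bool, bool]]) -> dict:
    """Selection rate over all prompts, absorption rate over selected
    prompts only, mention rate over all prompts."""
    n = len(panel)
    selected = sum(1 for s, a, m in panel if s)
    absorbed = sum(1 for s, a, m in panel if s and a)
    mentioned = sum(1 for s, a, m in panel if m)
    return {
        "selection_rate": selected / n,
        "absorption_rate": absorbed / selected if selected else 0.0,
        "mention_rate": mentioned / n,
    }

panel = [
    (True, True, True),     # retrieved, shaped the answer, brand named
    (True, False, False),   # retrieved but nothing extracted
    (False, False, False),  # never retrieved
    (True, True, False),    # absorbed, but brand not named to the user
]
rates = visibility_rates(panel)
# selection 3/4, absorption 2/3 (of selections), mention 1/4
```

Note the denominators differ: absorption is conditional on selection, which is what keeps the two layers separately diagnosable.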

Eastbound's [methodology](https://www.eastbound.ai/methodology/) measures all three. The free [AI visibility audit](https://www.eastbound.ai/ai-visibility-audit/) runs your domain through DeepSeek, Qwen, Doubao on a Mainland-CN consumer prompt panel.

## Realistic timeline

| Layer | Time | Compounding? | Single biggest blocker |
|---|---|---|---|
| 1. Technical infrastructure | 1 hour to 1 day | No — set-and-forget | JavaScript-only rendering |
| 2. Content design | 2–8 weeks per page | Per-page; cross-page for evergreens | Editorial discipline |
| 3. Off-site source-graph | 1–3 quarters | Strongly | Relationship-driven, can't be bought |

Most brands rush Layer 2, skip Layer 3, and never circle back to Layer 1. The opposite order is correct.

## China is a separate execution problem

The framework above is engine-agnostic; the source-graph (Layer 3) is not. In our 540-call panel (May 2026), the top-15 cited-source overlap (Jaccard) between any two Chinese engines was 0.20–0.30. A source-graph plan built for ChatGPT cannot be ported to DeepSeek. See [China AI visibility for global brands](https://www.eastbound.ai/china-ai-visibility/).

## Related reading

- [Eastbound's measurement methodology](https://www.eastbound.ai/methodology/)
- [What is generative engine optimization?](https://www.eastbound.ai/what-is-generative-engine-optimization/)
- [AI crawler readiness](https://www.eastbound.ai/ai-crawler-readiness/)
- [llms.txt vs robots.txt](https://www.eastbound.ai/llms-txt-vs-robots-txt/)
- [Markdown alternates guide](https://www.eastbound.ai/markdown-alternates-guide/)
- [Traditional SEO won't get you into Chinese AI answers](https://www.eastbound.ai/blog/off-site-substrate.html)
- [DeepSeek vs Qwen vs Doubao: source-mix study](https://www.eastbound.ai/blog/three-chinese-ais.html)

---

Run a free audit at https://www.eastbound.ai/ai-visibility-audit/.
