The on-site checklist that wins on Google barely predicts brand recall in Mainland AI. We audited the brands DeepSeek, Qwen, and Doubao actually surface and found that the highest-recall ones often have broken sites, near-empty structured data, and no Wikipedia entry. Their visibility lives off-site, in community discourse, KOL coverage, and e-commerce depth. Off-site SEO for AI search is the missing half of most GEO (generative engine optimization) playbooks.
What "off-site substrate" means
Off-site substrate is the body of third-party evidence — Zhihu threads, KOL posts, Tmall reviews, Wikipedia entries — that an AI engine reaches for when answering a brand-recommendation prompt. It sits outside the brand's own website. For Mainland-Chinese consumer queries, it's where most of the brand-recall signal lives.
Schema density doesn't predict AI mention rate
If you plotted every brand we audited with on-site schema density on the x-axis and AI mention rate on the y-axis, you wouldn't get a positive slope. You'd get a near-flat scatter: the highest-density site in our audit is among the lowest-mentioned, and the lowest-density sites are among the most-mentioned.
The winning brands averaged roughly half the schema markup of the controls we picked as the most aggressively schema-tagged sites in the same niches. Heaviest-schema control: eleven distinct schema types, seven sameAs links, mention rate 20%. Most-mentioned brands in the panel: zero or one schema types, zero sameAs, mention rates above 80%.
The brands at the extremes make it sharper. Diptyque and Dr.G, two of the highest-mentioned brands in our panel, had 0–1 schema types and 0 sameAs links. Tizo2, our heaviest-schema control (11 types, 7 sameAs), was mentioned in only 20% of its niche's answers. Gateron, mentioned in 92%, has one schema type (Organization) and zero sameAs links.
On-site schema density is necessary baseline work, but it's not sufficient on its own to drive AI brand recall. Several of the most-mentioned brands have a near-empty on-site signal, meaning something else off-site is doing the lifting. Our control sample is small enough that we can't claim schema is useless — we can claim it isn't enough by itself. See methodology.
Why brands with thin websites still surface in Chinese AI
The cleanest way to make the contradiction concrete is to look at three brands whose on-site footprint reads as a near-empty checklist. All three are mentioned in 60–92% of responses across the AIs we tested.
Kailh (凯华): the brand on a small-business website builder
Kailh is a Guangdong electronics manufacturer (parent: Dongguan Kaihua Electronics Co.). Its homepage at http://www.kailh.com/ returns 200 OK, ~37KB of Chinese-language B2B content, built on a small-business website-builder template.
On-site signal: no structured data, no sameAs links, no Open Graph metadata, no canonical link, no hreflang. The HTTPS configuration is broken; the server presents an expired certificate for the wrong domain. Plain HTTP works fine.
This brand is mentioned in 92% of answers for the keyboard niche. The recall isn't anchored on the website — it's anchored on community discourse: long-form Zhihu switch-comparison threads, Bilibili reviewer videos, Tmall product-detail pages with hundreds of reviews per SKU.
Substrate: community forum + Tmall · structured data: 0 · sameAs: 0 · HTTPS: broken · Mention rate: 92%
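The on-site signals listed for Kailh (structured data, sameAs links, Open Graph metadata, canonical link, hreflang) can be checked mechanically. The sketch below is a minimal illustration using only the Python standard library, not the tooling used for this audit. It assumes the page HTML has already been fetched, and the names `OnSiteSignalParser` and `audit_html` are hypothetical. It does not cover the HTTPS certificate check.

```python
import json
from html.parser import HTMLParser


class OnSiteSignalParser(HTMLParser):
    """Collects basic on-site signals from one HTML document."""

    def __init__(self):
        super().__init__()
        self.schema_types = set()
        self.same_as = set()
        self.open_graph = 0
        self.has_canonical = False
        self.hreflang = 0
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attr = {key: (value or "") for key, value in attrs}
        if tag == "script" and attr.get("type", "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []
        elif tag == "meta" and attr.get("property", "").startswith("og:"):
            self.open_graph += 1
        elif tag == "link":
            rel = attr.get("rel", "").lower().split()
            if "canonical" in rel:
                self.has_canonical = True
            if "alternate" in rel and attr.get("hreflang"):
                self.hreflang += 1

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self._walk(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # assumption: malformed JSON-LD is treated as absent

    def _walk(self, node):
        # Recursively collect every @type and sameAs value in the JSON-LD tree.
        if isinstance(node, dict):
            types = node.get("@type", [])
            self.schema_types.update([types] if isinstance(types, str) else types)
            links = node.get("sameAs", [])
            self.same_as.update([links] if isinstance(links, str) else links)
            for value in node.values():
                self._walk(value)
        elif isinstance(node, list):
            for item in node:
                self._walk(item)


def audit_html(html):
    """Return a small report of on-site signals for one page."""
    parser = OnSiteSignalParser()
    parser.feed(html)
    parser.close()
    return {
        "schema_types": sorted(parser.schema_types),
        "same_as": len(parser.same_as),
        "open_graph_tags": parser.open_graph,
        "canonical": parser.has_canonical,
        "hreflang_links": parser.hreflang,
    }


# Illustrative page, not a real brand site.
SAMPLE = """<html><head>
<link rel="canonical" href="https://example.com/">
<meta property="og:title" content="Example">
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example"}
</script>
</head><body></body></html>"""

report = audit_html(SAMPLE)
print(report)
```

A page like the Kailh homepage described above would come back with an empty `schema_types` list, zero counts, and `canonical` set to `False`.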
Gateron (佳达隆): a working site with a near-empty schema profile
Gateron is the second 92%-mention keyboard brand, and the more interesting on-site case. Its homepage at https://www.gateron.com/ returns 200 OK with ~595KB of well-structured HTML, full Open Graph and Twitter Card metadata, a clean canonical link, and 6,500+ words of product copy. By every "is this a real brand site?" check, Gateron passes.
On-site GEO signal: one schema type (Organization) and zero sameAs links. That is well below our winners' averages of 2.92 schema types and 1.23 sameAs links, and an order of magnitude below Tizo2 (11 schema types, 7 sameAs, 20% mention rate).
Gateron and Tizo2 are the cleanest A/B in the entire study. One has 11× the schema types and seven sameAs links to the other's zero. The other has 4.6× the AI mention rate.
Substrate: community forum + Tmall · schema: 1 type · sameAs: 0 · Mention rate: 92%
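The Gateron / Tizo2 comparison is simple arithmetic on the figures quoted above. A minimal sketch follows; the helper `ratio` is hypothetical, and the only inputs are the numbers stated in this section.

```python
# Figures quoted in this section for the two brands.
gateron = {"schema_types": 1, "same_as": 0, "mention_pct": 92}
tizo2 = {"schema_types": 11, "same_as": 7, "mention_pct": 20}


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return None if denominator == 0 else numerator / denominator


schema_ratio = ratio(tizo2["schema_types"], gateron["schema_types"])
same_as_ratio = ratio(tizo2["same_as"], gateron["same_as"])  # None: Gateron has zero
same_as_gap = tizo2["same_as"] - gateron["same_as"]
mention_ratio = ratio(gateron["mention_pct"], tizo2["mention_pct"])

print(f"schema types: {schema_ratio:.0f}x in Tizo2's favour")
print(f"sameAs links: +{same_as_gap} for Tizo2 (ratio undefined against zero)")
print(f"mention rate: {mention_ratio:.1f}x in Gateron's favour")
```

Because Gateron has zero sameAs links, that gap is better reported as a difference than as a multiple.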
Naturehike (挪客): thin schema, no encyclopedic anchor, 64% recall
Naturehike is a Mainland indie outdoor brand. Its homepage returns 200 OK with 911KB of content. On-site GEO signal: almost no structured data. Off-site: no English Wikipedia, no Chinese Wikipedia, no Wikidata record.
Mentioned in 64% of DeepSeek answers for the indie-camping niche, ahead of multiple international brands with full encyclopedic and schema presence. The substrate carrying it: Tmall + Xiaohongshu + Zhihu + Bilibili (long-form community reviews, KOL camping vlogs, product-detail-page review depth). None of that lives on naturehike.com.
Substrate: community discourse + e-commerce + KOL · structured data: ~0 · sameAs: 0 · Wikipedia: none · Mention rate: 64%
None of these brands are running an AI-visibility playbook. None ran a schema-markup project. Their AI mention rates are a side-effect of the substrate they sit in — community discourse, KOL coverage, e-commerce review depth — exactly the kind of third-party evidence surface these engines repeatedly name when answering recommendation prompts.
The unit of recall is the flagship-product phrase, not the brand name
The most counter-intuitive finding in the panel: brands at high recall don't surface as a bare brand name. They surface alongside a small set of named, repeatable product phrases.
| Brand | Niche | Flagship phrase | Recurrence (of 125) |
|---|---|---|---|
| Kailh (凯华) | Mech keyboards | Kailh Box Red Switch (Kailh Box红軸) | 14 |
| Gateron (佳达隆) | Mech keyboards | "Cost-Performance King" Switch (性價比之王 軸體) | 11 |
| TTC | Mech keyboards | TTC Gold Powder Switch (金粉軸) | 10 |
| Winona (薇諾娜) | Mineral sunscreen | Winona Clear Sunscreen (清透防曬乳) | 6 |
The interpretation: flagship-product names act as repeatable evidence anchors in community discourse. Gateron surfaces not because its site is heavily marked up, but because community discourse repeatedly pairs Gateron with the "Cost-Performance King" switch. The named flagship is the unit of recall, not the bare brand name. Optimization strategies built around bare brand-name SEO miss this anchor entirely.
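One way to count this kind of co-occurrence is sketched below. It assumes plain case-insensitive substring matching against a list of aliases, which may differ from how the recurrence column above was produced. The function name and the response snippets are hypothetical, written for illustration only.

```python
def phrase_recurrence(responses, brand_aliases, phrase_aliases):
    """Count responses that mention the brand, and those that also name the phrase."""
    brand_hits = 0
    joint_hits = 0
    for text in responses:
        lowered = text.lower()
        if any(alias.lower() in lowered for alias in brand_aliases):
            brand_hits += 1
            if any(alias.lower() in lowered for alias in phrase_aliases):
                joint_hits += 1
    return brand_hits, joint_hits


# Hypothetical response snippets, not taken from the study data.
responses = [
    "Kailh Box Red Switch is a common pick for linear builds.",
    "凯华的 Box红軸 手感顺滑。",
    "Kailh also makes low-profile switches.",
    "TTC 金粉軸 is another option.",
]

brand_hits, joint_hits = phrase_recurrence(
    responses,
    brand_aliases=["Kailh", "凯华"],
    phrase_aliases=["Box Red Switch", "Box红軸"],
)
print(brand_hits, joint_hits)
```

Listing both the Latin and Chinese spellings as aliases matters here, since the same flagship can appear under either form in an answer.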
The off-site substrate that matters is niche-conditional
If on-site density isn't the lift, what is? The audit data points to two off-site substrates that are present in the high-recall brands — but which one matters depends entirely on the niche.
International-corpus niches need encyclopedic anchors
In single-malt whisky and niche perfume, encyclopedic presence dominates, because the brand discourse genuinely lives in English-language reference works. Macallan (麦卡伦) and Ardbeg (阿貝) carry full English + Chinese Wikipedia plus Wikidata. Glenfiddich (格兰菲迪) has English Wikipedia and Wikidata but no Chinese entry.
Do you need a Wikipedia page for AI to recommend your brand? In international-corpus niches, yes — it's one of the strongest signals we measured. In Mainland-saturated niches, no — the brands winning at high recall here have no encyclopedic presence at all.
Mainland-saturated niches need community-discourse density
In mech keyboards, craft beer, indie camping, handmade leather — the substrate that lifts is community-discourse density. Zhihu long-form, Xiaohongshu KOL imprint, Bilibili review depth, Tmall + JD product-catalogue depth. The brands surfacing at high mention rates here routinely have no encyclopedic presence at all. Strategies built around Wikipedia-first miss the substrate entirely.
Niche type beats AI choice for predicting Mainland-brand share
The two niche panels we tested let us check whether the same kinds of niches produce the same Mainland-vs-international brand mix across all three AIs. The replication was unusually clean.
| Category type | Panel 1 niche | Mainland brands in top-8 (DS / Qw / Db) | Panel 2 niche | Mainland brands in top-8 (DS / Qw / Db) |
|---|---|---|---|---|
| International-corpus | Niche perfume | 0 / 0 / 0 | Single-malt whisky | 0 / 0 / 1 |
| Mainland-saturated | Mech keyboard switches | 5 / 5 / 4 | Craft beer | 7 / 6 / 5 |
| Beauty (intl + 1–5 CN) | Mineral sunscreen | 2 / 1 / 4 | Vegan skincare | 2 / 1 / 5 |
| Indie / craft | Ceramic dinnerware | 7 / 7 / 8 | Handmade leather | 3 / 4 / 2 |
In 3 of 4 niche types, the Mainland-vs-international share replicated within roughly ±5pp across both panels. International-corpus niches returned 0–1 Mainland brands across all three AIs. Mainland-saturated niches returned 4–7. The Doubao Mainland-content tilt held: about +10pp on top of the category baseline.
Practical reading: niche type captured more of the variance in Mainland-brand share than AI choice did. A brand in an international-corpus niche didn't move into a Mainland-substrate result by switching AIs. A brand in a Mainland-saturated niche was visible across all three AIs as long as its community substrate was there.
What this means for your China AI visibility strategy
Three takeaways follow.
Schema markup and on-site SEO are necessary baseline work for Chinese AI visibility. Keep doing them. The finding above is that schema density isn't sufficient — not that it's useless. A clean on-site footprint still does work for crawler legibility, retrieval, and absorption-readiness on Bing/Copilot. The question is what to ship alongside it.
For Mainland-saturated niches, the off-site investment is in community substrate. Sustained KOL programmes on Xiaohongshu and Bilibili, long-form Zhihu presence written by named experts, depth in Tmall + JD product-detail review surfaces. Wikipedia is a bonus, not a precondition. The brands winning here have no encyclopedic presence at all.
For international-corpus niches, the priority order is the opposite. Verified Wikipedia / Wikidata entity pages first; KOL programmes second; community-forum work has limited additional lift here, because the AI isn't reading that substrate for this niche.
Each row below is a highest-priority intervention hypothesis to test, not a guaranteed-causal recommendation. Pair any of them with a before/after measurement at T+12 weeks.
| If your brand is in… | Evidence surface the AI names | Intervention hypothesis to test |
|---|---|---|
| International-corpus niche (whisky, niche perfume, luxury watches) | Wikipedia, Wikidata, global English-language reference content | Verified entity pages first. KOL programmes a distant second. Community-forum work has limited additional lift here. |
| Mainland-saturated niche (mech keyboards, craft beer, indie camping, handmade leather) | Zhihu long-form, Xiaohongshu KOL, Bilibili reviews, Tmall + JD product-catalogue depth | Sustained KOL programme + product-catalogue content density. Wikipedia is a bonus, not the highest-priority target. |
| Beauty / lifestyle (sunscreen, skincare, fragrance) | Both substrates: encyclopedic for ingredient credibility, community for product credibility | Pursue both, weighted toward Xiaohongshu KOL on the lifestyle end. |
| Indie / craft (ceramics, leather, indie dinnerware) | Variable; depends on whether the niche has CN-domestic depth | Read the substrate before committing budget. CN-saturated craft niches behave like Mainland-saturated; thin craft niches behave more like international-corpus. |
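The before/after measurement can be as simple as re-running the same prompt panel and comparing mention rates. The sketch below assumes substring matching on brand aliases and equal-sized runs; the function names, the brand "ExampleBrand", and the responses are all hypothetical. A positive change is an observation to investigate, not proof that the intervention caused it.

```python
def mention_rate(responses, aliases):
    """Share of responses naming the brand under any alias (0.0 when empty)."""
    if not responses:
        return 0.0
    hits = sum(
        1
        for text in responses
        if any(alias.lower() in text.lower() for alias in aliases)
    )
    return hits / len(responses)


def lift_in_points(baseline, followup, aliases):
    """Percentage-point change in mention rate between two runs of one panel."""
    before = mention_rate(baseline, aliases)
    after = mention_rate(followup, aliases)
    return round((after - before) * 100, 1)


# Hypothetical answers collected at T0 and at T+12 weeks.
aliases = ["ExampleBrand"]
baseline = [
    "Try BrandA or BrandB.",
    "ExampleBrand is worth a look.",
    "BrandA is the usual pick.",
    "BrandC and BrandB lead here.",
]
followup = [
    "ExampleBrand and BrandA are both solid.",
    "Most reviewers suggest ExampleBrand.",
    "BrandB remains popular.",
    "ExampleBrand comes up often.",
]

print(lift_in_points(baseline, followup, aliases))
```

Keeping prompts, repetitions, and engine settings identical between the two runs is what makes the difference interpretable at all.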
A brand audit that scores a checklist green and ignores the substrate misses the half of the work the engine is actually reading.
Where Eastbound comes in
We see brand teams discover this gap when their internal audits read green and a Mandarin tester returns competitor names instead of theirs. The fix isn't a single content sprint — it is a re-mapping of what the engine reads, what it surfaces, and where the brand sits in the answer set.
That mapping is the kind of work Eastbound does. If your team is sitting on the green-audit / red-engine gap, run the free China AI visibility audit on your domain or book an intro call.
Methodology
- Sample. 750 brand-recommendation calls (25 prompts × 5 reps × 3 engines × 2 niche panels), across 10 Mainland-Chinese consumer niches: mechanical keyboard switches, mineral sunscreen, niche perfume, ceramic dinnerware, natural wine, craft beer, indie camping, vegan skincare, single-malt whisky, handmade leather.
- Engines. DeepSeek (deepseek-chat); Qwen (qwen-plus, DashScope international); Doubao (seed-2-0-lite-260228, BytePlus ModelArk international, Lite tier). All API-mode, default decoding. Chat-with-Search browsing surface is a separate study.
- On-site audit. 20 winning brands + 5 controls attempted in Panel 1; 13 winners + 2 controls fetched cleanly. Counter-examples are strong (Kailh, Gateron, Naturehike all hit 60%+ mention rates with near-empty schema), but the control sample is too small to claim schema is useless. We can claim it isn't sufficient.
- What we measured. Brand mention rate per (prompt × rep) cell; co-occurring product-phrase frequency; on-site schema / sameAs / fetchability; off-site Wikipedia / Wikidata presence. This is descriptive measurement of LLM behavior; it is not a causal claim about what publishing on a substrate would do to mention rate.
- What we did not measure. Sales, conversion, attributable revenue. ChatGPT / Claude / Gemini / Perplexity / ERNIE / Yuanbao were not in this panel. Chat-with-Search-ON browsing surfaces are a separate study.
- Reliability. Structural pattern (international-corpus → encyclopedic; Mainland-saturated → community substrate) replicated across two non-overlapping niche panels. Doubao long-tail source-ranking is less stable than top-5 (top-5 κ = 1.00; top-15 κ = 0.46) — treat long-tail Doubao findings with that caveat.
Per-record JSONs at ~/Documents/Claude/GEO/research/2026-04-29-deepseek-cn-niches/data/site_audits/ and adjacent panels. Replication script and prompt panels available on request.
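The sample arithmetic in the Methodology can be checked by enumerating the call grid. This is a sketch of the stated design only; the constant names are hypothetical and it does not reproduce the replication script.

```python
from itertools import product

PROMPTS_PER_PANEL = 25
REPS = 5
ENGINES = ("DeepSeek", "Qwen", "Doubao")
PANELS = ("Panel 1", "Panel 2")

# One tuple per brand-recommendation call: (panel, engine, prompt index, rep index).
calls = list(product(PANELS, ENGINES, range(PROMPTS_PER_PANEL), range(REPS)))

# Calls made by one engine on one panel.
per_engine_panel = PROMPTS_PER_PANEL * REPS

print(len(calls))        # 750
print(per_engine_panel)  # 125
```

The 125 figure matches the "of 125" denominator in the flagship-phrase table, though the article does not state that the two are the same quantity.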