The easy story to write about Chinese AI brand defense: "Doubao is unreliable. Doubao swaps brands 39% of the time when challenged. DeepSeek and Qwen barely move." That story is technically correct and strategically useless. It buries a more important finding underneath it. Split the same data by category, and the engine-spread story collapses.
What "brand defense against AI competitor challenge" means
Brand defense against AI competitor challenge is the question of whether — when a buyer asks an AI engine for a recommendation and then counters with a named competitor ("isn't [Brand B] better than [Brand A]?") — the engine holds its original recommendation or switches to the competitor. The share of challenges where the engine switches is the brand's exposure. Eastbound's 108-event panel measures this across DeepSeek, Qwen, and Doubao for four luxury categories.
| Category | DeepSeek switch | Qwen switch | Doubao switch | Spread |
|---|---|---|---|---|
| Premium European luxury cars (¥600K–1.2M) | 0% | 0% | 89% | 89pp |
| Men's mechanical watches (¥30K–200K) | 0% | 0% | 44% | 44pp |
| International hotel loyalty programs | 22% | 11% | 22% | 11pp |
| High-end anti-aging skincare (¥1500+) | 0% | 0% | 0% | 0pp |
Doubao moved freely on cars (89% switch) and held tight on skincare (0%). Same engine. Same prompt structure. Same Mandarin question stem ("isn't [Brand B] better than [Brand A]?"). What changed wasn't the engine. What changed was whether the engines' response pattern treated the recommended brand as occupying a distinct narrative slot. That's the finding worth leading with.
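The headline figures can be re-derived from the category table. The sketch below is illustrative only: the dictionary simply transcribes the published percentages, the function names are ours rather than part of any Eastbound tooling, and the per-engine average is an unweighted mean across the four categories (an assumption about how the 39% figure was formed).

```python
# Switch rates (%) transcribed from the category table above.
SWITCH_RATES = {
    "Premium European luxury cars": {"DeepSeek": 0, "Qwen": 0, "Doubao": 89},
    "Men's mechanical watches": {"DeepSeek": 0, "Qwen": 0, "Doubao": 44},
    "International hotel loyalty programs": {"DeepSeek": 22, "Qwen": 11, "Doubao": 22},
    "High-end anti-aging skincare": {"DeepSeek": 0, "Qwen": 0, "Doubao": 0},
}


def engine_average(rates, engine):
    """Unweighted mean switch rate (%) for one engine across all categories."""
    values = [by_engine[engine] for by_engine in rates.values()]
    return sum(values) / len(values)


def category_spread(rates):
    """Max minus min switch rate across engines, per category (percentage points)."""
    return {cat: max(r.values()) - min(r.values()) for cat, r in rates.items()}


if __name__ == "__main__":
    for engine in ("DeepSeek", "Qwen", "Doubao"):
        print(f"{engine}: {engine_average(SWITCH_RATES, engine):.2f}% average")
    for cat, spread in category_spread(SWITCH_RATES).items():
        print(f"{cat}: {spread}pp spread")
```

Read by engine, Doubao averages 38.75%, which rounds to the 39% in the easy story. Read by category, the spread runs from 89pp down to 0pp, which is the split the rest of this piece is about.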
Three structural properties of brands AI doesn't swap out
We pulled the per-brand risk-surface cards for the skincare panel and looked at what made them collectively unchallengeable. Three structural properties showed up across all five brands.
1. A shared category triad of positive anchors
Every skincare brand we tested had positive claims clustered in the same three slots: craft / heritage, outcome / efficacy, and experience / ritual.
| Brand | Heritage anchor | Efficacy anchor | Experience anchor |
|---|---|---|---|
| La Mer | Miracle Broth (神奇活性精萃) | Barrier repair, hydration | Sensory ritual, emotional value |
| La Prairie | Swiss medical / clinical lab | Clinically-validated anti-aging | Sensory luxury, ritual feel |
| Sisley | Plant-derived, pure formulation | Sensitive-skin tolerance, fine-line repair | Aroma / texture refinement |
| Helena Rubinstein | Core science / proprietary tech | Damaged-skin recovery, anti-aging | Premium application ritual |
| Guerlain | French heritage, traditional craft | Clinically-tested firmness, radiance | French-elegant identity |
The triad is shared. The slot each brand owns inside the triad is distinct. Challenge with "isn't La Prairie better than La Mer?" and AI's response pattern has no clean place to swap to — the strengths don't overlap. Compare luxury cars: BMW's positives (driving pleasure, engine reliability, prestige) and Mercedes's (ride comfort, prestige, residual value) share multiple slots, and the differentiated slots sit on the same axis. Challenge with "isn't Mercedes better — for comfort?" and the engines have somewhere to go.
2. Negatives that are category-wide, not brand-specific
Across all five skincare brands, the top three durable negatives were the same: cost / value-for-money (性价比), fit (skin-type concerns: oiliness, sensitivity, suitability), and efficacy questions (functional claims relative to evidence). Five brands, three shared liability slots. The challenger inherits the same liabilities. Ask "Isn't La Prairie better?" and the answer is that La Prairie carries the same cost, fit, and efficacy critiques. The challenge has nothing to grip.
In luxury cars, BMW's durable negatives include brand-specific concerns (electronic-system failures, suspension hardness, residual-value spread) that the engines do not surface as strongly for Mercedes. The challenger has a clean angle.
3. Anchors tied to specific, irreducible attributes
"Drives well" is rebuttable. "Made with neroli oil and signature Miracle Broth (神奇活性精萃) since 1965" is not — it's a fact, not a positioning claim. The brands the engines couldn't trade away had positives anchored to specific named ingredients, year-of-founding, country-of-lab, signature processes. Generic positioning ("luxury", "premium") collapses under challenge; specific positioning does not.
These three properties interact. A specific anchor (3) inside a shared category triad (1), where the negatives are common to all rivals (2): that is the structural shape of an un-switchable brand. It is also the shape AI's response pattern has the easiest time treating as distinct.
AI's self-attribution names your defensive content gap
The source-attribution module (B3, 360 calls) asked the engines, for every clustered narrative, where the judgment came from. The asymmetry between the negative and positive source mix is the operational disclosure of the study.
| Source type | Negative (% of citations) | Positive (% of citations) | Δ (pp) |
|---|---|---|---|
| KOL (bloggers, influencers) | 18.7% | 13.3% | +5.3 |
| Community (Xiaohongshu, Zhihu, Dianping) | 16.3% | 13.6% | +2.7 |
| Review (professional) | 12.2% | 15.2% | −3.0 |
| News media | 12.1% | 12.4% | −0.3 |
| Forum (enthusiast / owner clubs) | 9.6% | 7.6% | +2.0 |
| Marketplace (e-commerce, secondary) | 7.9% | 6.1% | +1.8 |
| Official (brand-controlled) | 7.5% | 16.7% | −9.2 |
| Certification / third-party rating | 7.2% | 7.5% | −0.3 |
Read the table backwards. The engines, in self-attribution, name CN community / KOL / forum surfaces as the source of the negative framing — Zhihu long-form replies, Xiaohongshu owner reviews, owner-club forum threads, vertical-enthusiast communities. These are precisely the surfaces where most foreign incumbents have no curated presence at all. Their content investment lives in the positive stack: brand .com, English-language professional reviews, international certifications — the same surfaces the engines already cite for positive framing.
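One way to read the table backwards is to rank source types by how much more often they are cited for negative than for positive narratives. The sketch below is a minimal illustration: the values are transcribed from the table, and the function names are hypothetical. Because it works from the rounded published percentages, a recomputed delta can differ from the published one by 0.1pp (KOL comes out at +5.4 rather than +5.3, presumably because the published delta was taken from unrounded shares).

```python
# (negative %, positive %) per source type, transcribed from the table above.
SOURCE_MIX = {
    "KOL": (18.7, 13.3),
    "Community": (16.3, 13.6),
    "Review": (12.2, 15.2),
    "News media": (12.1, 12.4),
    "Forum": (9.6, 7.6),
    "Marketplace": (7.9, 6.1),
    "Official": (7.5, 16.7),
    "Certification": (7.2, 7.5),
}


def negative_skew(mix):
    """Return (source, delta_pp) pairs, most negative-skewed first."""
    deltas = {src: round(neg - pos, 1) for src, (neg, pos) in mix.items()}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)


def defensive_targets(mix):
    """Source types cited more often for negative than for positive narratives."""
    return [src for src, delta in negative_skew(mix) if delta > 0]


if __name__ == "__main__":
    for source, delta in negative_skew(SOURCE_MIX):
        print(f"{source:<14} {delta:+.1f}pp")
    print("Defensive targets:", ", ".join(defensive_targets(SOURCE_MIX)))
```

On the published figures, the negative-skewed surfaces are KOL, community, forum, and marketplace, while official brand-controlled sources sit at the opposite end of the ranking.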
Defensive content has to land where the negatives self-attribute
The directly actionable disclosure: investing only in the positive stack reinforces what AI's response pattern already says nicely about you. It does not dilute what it says critically. Defensive content has to land where the negative framing self-attributes, or it doesn't move the negative narrative.
Audit the positioning triad first
Before any source-seeding work, the brand needs a publishable triad of positive anchors. In practice that means one pillar page on the brand .com for each slot in the category triad (1,000–2,500 words, encyclopedia tone, specific names and dates). Without it, downstream Tier-3 community work attracts spam-classifier risk, not authority transfer. The skincare panel's 0% switch rate is, structurally, what well-anchored Tier-1 looks like in the engines' response pattern.
Anchor on Baidu Baike and Wikipedia zh-CN
Baidu Baike (百度百科), Wikipedia Chinese (维基百科 zh-CN), Wikidata zh-CN. These are the highest-trust signals short of academic citation, and they survive across LLM training cycles. When the engines have a Baike entry to ground a recommendation on, "isn't [Brand B] better?" has a counter-anchor to retrieve. This is the single most under-invested layer for foreign brands in mainland-CN AI.
Saturate the negative-source surface
Defensive proof-points have to land where the engines self-attribute their negatives: Zhihu long-form expert answers from named professionals (not anonymous brand voice), Xiaohongshu owner-experience posts with verifiable details, owner-forum / vertical-community threads with substantiation. The deliverable is not "make AI say nice things." It is to seed verifiable counter-evidence in the surface AI's self-attribution already names — so the next time that surface is sampled, the cluster is contested.
Convert brand-specific negatives into category-shared ones
The skincare brands' shared liability ("expensive, may not suit your skin, efficacy is debated") is what makes them collectively unchallengeable. Where the brand has a brand-specific durable negative, the goal is not to deny it. The goal is to publish counter-evidence at the source-stack tier where the negative concentrates AND demonstrate the issue is a category-wide engineering trade-off (e.g., premium European cars run more electronics than Japanese rivals for a quantifiable performance reason). Brand-specific liabilities become category-shared facts.
Doubao is the leading indicator — re-measure quarterly per engine
Doubao moves first when narrative weakens, and consolidates first when narrative strengthens. Quarterly switch-rate measurement on the brand's own competitive set, per engine, is the closing-the-loop metric. Qwen and DeepSeek are the trailing indicators — when they start defending the brand under challenge, the work has compounded.
Two anonymized comparisons from the dataset show what the property differences look like in practice.
Brand A — high-end skincare. Negative source mix: 47% CN, 28% global, 25% US-EU. Negatives clustered in cost, fit, efficacy — the category-shared triad. Positive anchors: signature ingredient (named, dated), barrier-repair efficacy (clinical claim), sensory ritual (experience). Switch rate across all three engines: 0%.
Brand B — premium European luxury car. Negative source mix: 79% CN, 14% US-EU, 7% global. Negatives clustered in maintenance cost, electronic reliability, suspension comfort — partly category-shared, partly brand-specific. Positive anchors: driving experience (rebuttable), engine heritage (specific), prestige (generic, shared with rivals). Switch rate on Doubao: 22%; on DeepSeek and Qwen: 0%.
Same study, same prompts, same protocol. Brand A's negative-source mix is dispersed across regions; Brand B's concentrates where the brand has lower presence. Brand A's positives are anchored to specific irreducible attributes; Brand B's include a generic shared one. Brand A's negatives are category-wide; Brand B's include brand-specific items. Three property differences map to a 22pp switch-rate gap on the engine that matters most. None of the differences are about which engine the buyer used.
Where Eastbound comes in
Per-brand risk-surface cards aggregate the durable negatives across all three engines, the engine-specific defense behavior, and the source-stack feeding both narratives. From the card: a 90-day source-seeding plan — Tier-1 anchor pages on the brand site, Tier-3.5 Baike / Wikipedia work, Tier-3 Zhihu / Xiaohongshu / forum content placed where the negatives concentrate. Quarterly re-measurement on the same competitive set tracks switch-rate change.
If your team needs that mapping for your category, run the free China AI visibility audit on your domain or book an intro call.
Methodology
- Sample. 1,332 raw responses across DeepSeek, Qwen, Doubao. Four categories — international hotel loyalty programs; premium European luxury sedans / SUVs (¥600K–1.2M); men's mechanical watches (¥30K–200K); high-end anti-aging skincare (¥1500+). Top-5 brands per category drawn from in-study pre-survey. B1 risk-surface module: 720 calls. B2 brand-defense probe: 216 calls (multi-turn). B3 source-attribution module: 360 calls.
- Engines. DeepSeek (deepseek-chat auto-routes to deepseek-v4-flash); Qwen-Plus on DashScope international (dashscope-intl.aliyuncs.com); Doubao seed-2-0-pro-260328 on BytePlus ModelArk international (ark.ap-southeast.bytepluses.com). Web search off, default API state. All within a 24-hour run window per module; model IDs verified at module open and close, with no drift detected.
- Switch rate (research term: pivot rate). Share of multi-turn responses where the engine switches from the originally-recommended brand to a named competitor under direct challenge. Coding via DeepSeek as a structured-extraction LLM with a fixed taxonomy locked before any extraction; pivot/hedge/defense definitions locked before B2 ran.
- What we measured. Engine response behavior under competitor challenge; source-attribution mix for negative vs positive narratives. This is descriptive measurement; it is not a causal claim about training-corpus weighting, retrieval, or human consumer behavior.
- What we did not measure. Sales, conversion, attributable revenue. Surfacing on AI is a step into a buyer's consideration set, not a purchase. ChatGPT / Claude / Gemini / Perplexity / ERNIE / Yuanbao — not in this panel. Chat-with-Search-ON browsing surfaces — separate study.
- Reliability. Cross-rep agreement on the four-class pivot/hedge/defense code was strong on DeepSeek and Qwen and noisier on Doubao — partly because Doubao's reasoning model produced longer, more discursive turn-two responses, which makes the dominant-signal coding rule less stable. Doubao long-tail consistency: top-5 κ = 1.00, top-15 κ = 0.46. We disclose both.
- Brand anonymization. Brands anonymized for public publication; named-brand risk-surface cards delivered privately to engagement clients only.
Per-record JSONs and replication scripts available on request.
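
The switch-rate definition in the methodology can be sketched as a small function over coded turn-two responses. This is an illustration, not the study's replication script: the label strings follow the pivot/hedge/defense vocabulary above, the function name is hypothetical, and the nine-challenge cell is an assumption (108 events split evenly over 3 engines and 4 categories would give 9 per cell, which the article does not state explicitly).

```python
from collections import Counter


def switch_rate(codes):
    """Share of coded turn-two responses labelled 'pivot'."""
    if not codes:
        raise ValueError("no coded responses")
    return Counter(codes)["pivot"] / len(codes)


if __name__ == "__main__":
    # Hypothetical engine-by-category cell: 9 challenges, 8 coded as pivot.
    example_cell = ["pivot"] * 8 + ["defense"]
    print(f"{switch_rate(example_cell):.0%}")
```

Under that assumption, every cell rate is a multiple of one ninth, which is consistent with the published values of 89%, 44%, 22%, and 11%.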