DeepSeek vs Qwen vs Doubao: three engines, three source ecosystems.
The three Chinese answer engines US and UK brands need to monitor — DeepSeek, Qwen and Doubao — do not surface the same evidence when answering Mainland-Chinese consumer questions. Below is a measured comparison across a 540-call panel: source-mix differences, cross-engine overlap, and the reliability of each finding.
The headline finding
Across our 540-call source-influence panel run in May 2026, DeepSeek, Qwen and Doubao cited Mainland-CN sources at materially different rates, drew on different secondary surfaces, and showed different long-tail stability. Cross-engine source overlap (Jaccard top-15) was 0.20–0.30 — they are not interchangeable, and a measurement framework built for one cannot be ported to another without rebuilding the source-substrate model.
For a global brand asking "which Chinese AI engine should we optimise for?", the honest answer is: all three, separately, because the work is different on each. The rest of this page lays out exactly how it differs.
Cross-engine source overlap (top-15 Jaccard)
Source share is the within-engine view. The cross-engine view asks: of the top-15 sources each engine cites, how many are shared? The answer is "not many":
|  | DeepSeek | Qwen | Doubao |
|---|---|---|---|
| DeepSeek | — | 0.30 | 0.20 |
| Qwen | 0.30 | — | 0.25 |
| Doubao | 0.20 | 0.25 | — |
Top-15 source overlap (Jaccard). Lower = more divergence. Across our panel, any two engines share only about a third to a half of their top-15 sources (a Jaccard of 0.20–0.30 corresponds to roughly 5–7 shared sources out of 15).
A 0.20–0.30 Jaccard means roughly half to two-thirds of the top sources differ between any pair of engines. This is the structural reason a generic "AI visibility" audit built for ChatGPT or Gemini cannot be ported to the Chinese engines without rebuilding the source-substrate model. It is also the reason a DeepSeek-only optimisation strategy under-reports what your brand looks like on Qwen and Doubao.
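The overlap statistic above is straightforward to reproduce. A minimal sketch, using hypothetical placeholder source sets rather than our panel data:

```python
# Sketch: pairwise top-15 Jaccard overlap between two engines.
# The source IDs below are hypothetical placeholders, not panel data.

def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

top15 = {
    "deepseek": {f"src_{i}" for i in range(15)},      # sources 0–14
    "qwen":     {f"src_{i}" for i in range(8, 23)},   # sources 8–22; 7 shared
}

print(round(jaccard(top15["deepseek"], top15["qwen"]), 2))  # → 0.3
```

Note that for two 15-item sets sharing k sources, J = k / (30 − k), so the reported range of 0.20–0.30 pins the shared count at 5–7 sources.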
Reliability — top-5 vs top-15
The previous two charts describe a single panel. Reliability asks: do the same prompts produce the same source rankings on a re-run? We re-ran the identical 30-prompt panel one week later and report multiple statistics. The headline:
Top-15 source membership stability (κ)
Cohen's κ for top-15 source-membership agreement across two consecutive re-runs of the identical 30-prompt panel. Top-5 stability was κ=1.00 for all three engines.
Three things to read out of this chart:
- Top-5 source membership is perfectly stable across all three engines (κ_top-5 = 1.00). The most-cited sources are reliable findings.
- Top-15 stability differs materially. DeepSeek (0.89) and Qwen (0.78) hold up well; Doubao (0.46) does not. This is a granular-tag normalisation issue we document explicitly in our reliability report.
- Pearson r and ICC tell the same story at the rate level: source mention rates correlated at r = 0.97–0.99 across all three LLMs (ICC(2,1) = 0.97–0.99) on identical re-runs.
Practical implication: when reading any Doubao-specific source-graph recommendation we make, treat the top-5 sources as actionable and the long tail with appropriate caveats. We disclose this in every Doubao-related readout, because a reliability table that reports only κ_top-5 (where every engine scores 1.00) and hides κ_top-15 is reporting selectively.
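For readers who want to audit the κ figures, the membership-agreement statistic can be sketched as follows. The universe and top-15 sets here are hypothetical, not our panel data: each source observed in either run gets a binary in/out-of-top-15 label per run, and Cohen's κ is computed over those parallel labels.

```python
# Sketch: Cohen's kappa for top-15 membership agreement across two runs.
# Universe and membership sets are hypothetical, not panel data.

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two parallel binary label lists."""
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n  # observed agreement
    p_a1 = sum(labels_a) / n   # marginal rate of "in top-15", run A
    p_b1 = sum(labels_b) / n   # marginal rate of "in top-15", run B
    p_e = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)                # chance agreement
    return (p_o - p_e) / (1 - p_e)

universe = [f"src_{i}" for i in range(25)]        # all sources seen in either run
run1_top15 = {f"src_{i}" for i in range(15)}
run2_top15 = {f"src_{i}" for i in range(2, 17)}   # 13 of 15 retained on re-run

a = [s in run1_top15 for s in universe]
b = [s in run2_top15 for s in universe]
print(round(cohens_kappa(a, b), 2))  # → 0.67
```

Chance-correcting the raw agreement matters here: with only 25 candidate sources, two random top-15 lists would already agree on over half the labels, which is exactly what a raw-agreement figure would hide.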
What this means for your strategy
The comparative view above leads to a small number of practical conclusions for US and UK brands considering Mainland-CN AI visibility work:
Measure all three, separately
Optimising for one engine on the assumption that the others will follow is a known failure mode. The 0.20–0.30 cross-engine Jaccard means a DeepSeek-tuned strategy will miss roughly half to two-thirds of the top sources Doubao actually surfaces. The Eastbound free multi-engine audit reports each engine separately for this reason.
Pick the engine your category actually surfaces on
Some categories surface decisively on one engine. Regulated categories (pharma, medical, financial services) skew Qwen because of its institutional source mix. Aspirational consumer / FMCG / beauty / travel skew Doubao because of its commerce-and-lifestyle lean. Developer-leaning B2B SaaS skews DeepSeek because of its developer-corpus weight on technical questions. We separate consumer-facing and developer-facing prompt pools in our panels because the source-mix patterns differ materially.
Read engine-specific playbooks in order
Each of the three engines has its own optimisation logic. The full playbooks:
- DeepSeek SEO visibility playbook — Western-balanced source mix; encyclopedia presence is the strongest predictor of mention rate in our 5-niche probe.
- Qwen optimization playbook — institutional / professional source bias; runs on DashScope international (NOT BytePlus).
- Doubao optimization playbook — CN-substrate-biased + commerce/lifestyle aggregator lean; runs on BytePlus ModelArk international (NOT DashScope).
For DeepSeek-only rank tracking specifically, see the DeepSeek SEO rank tracker — narrower in scope than the multi-engine audit, faster to run.
Methodology note
The numbers on this page come from our 540-call source-influence panel (30 prompts × 3 LLMs × 3 reps × 2 turns) plus the matched re-run for reliability. Engines were queried via their live API endpoints — DeepSeek (deepseek-chat), Qwen on DashScope international (qwen-plus), Doubao on BytePlus ModelArk international. Model IDs were logged at session start and end; we cannot guarantee identical model snapshots across runs because neither endpoint exposes pinned-version handles.
Source attributions are normalised to canonical platform IDs (e.g., "Xiaohongshu" rather than a specific post). The Jaccard overlap is computed on top-15 source sets per engine; κ_top-5 and κ_top-15 are Cohen's κ for source membership agreement across the two consecutive re-runs.
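The normalisation step above can be sketched as a simple alias table that collapses cited hosts to a canonical platform ID. The alias entries here are illustrative; the real mapping is panel-specific and maintained per engine:

```python
# Sketch: collapsing raw cited hosts to canonical platform IDs.
# Alias table is illustrative, not the production mapping.

CANONICAL = {
    "xiaohongshu.com": "Xiaohongshu",
    "xhslink.com": "Xiaohongshu",        # share-link domain, same platform
    "zhihu.com": "Zhihu",
    "zhuanlan.zhihu.com": "Zhihu",       # column subdomain, same platform
}

def canonicalise(url_host: str) -> str:
    """Map a cited host to its platform ID; unknown hosts pass through."""
    host = url_host.lower().removeprefix("www.")
    return CANONICAL.get(host, host)
```

For example, `canonicalise("www.zhuanlan.zhihu.com")` and `canonicalise("zhihu.com")` both resolve to `"Zhihu"`, so post-level and platform-level citations count toward the same source before Jaccard or κ are computed.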
For the full methodology, see how Eastbound measures China AI visibility.
Run the audit on your URL
The free Eastbound audit reports DeepSeek + Qwen + Doubao on a stratified zh-CN consumer prompt panel and surfaces per-engine selection / absorption / mention scores plus the highest-leverage fixes for your specific URL.
Or read the China AI visibility pillar, the agency services, or our research.