Chinese LLM leaderboards

Composite "best for X" tables ranked from live catalog data, plus per-benchmark top 10s at the bottom. Scores come only from official sources and auditable third-party benchmarks; no in-house evals are used.

Cheapest Chinese chat models

Ranked by lowest blended price per 1M tokens, for models tagged chat or multilingual. Each model is listed at its cheapest hosting provider.

| # | Model | Cheapest / 1M tokens | Via | Context |
|---|---|---|---|---|
| 1 | Doubao Lite 32K (ByteDance) | $0.05 | ByteDance Doubao (Volcengine) | 33K |
| 2 | GLM-4-Air (Zhipu AI) | $0.07 | Zhipu AI | 131K |
| 3 | Yi Lightning (01.AI) | $0.14 | 01.AI | 16K |
| 4 | Doubao Pro 32K (ByteDance) | $0.15 | ByteDance Doubao (Volcengine) | 33K |
| 5 | DeepSeek V3 (DeepSeek AI · open-weight) | $0.48 | DeepSeek | 66K |
| 6 | DeepSeek Chat V3.1 (DeepSeek · open-weight) | $0.48 | DeepSeek | 131K |
| 7 | Llama 3.3 70B Instruct (Meta · open-weight) | $0.88 | Together.ai | 131K |
| 8 | Qwen3 72B Instruct (Alibaba · open-weight) | $0.90 | Together.ai | 131K |
| 9 | Qwen 2.5 Max (Alibaba Cloud) | $2.80 | Alibaba Cloud DashScope | 33K |
| 10 | Llama 3.1 405B Instruct (Meta · open-weight) | $3.00 | Fireworks.ai | 131K |
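
The selection rule for this board (filter by tag, keep each model's cheapest hosting, sort ascending) could be sketched roughly as follows. The `Hosting` record, its field names, and `cheapest_board` are hypothetical, not the catalog's actual schema. The page does not define how the blended price is computed, so the sketch assumes it arrives as a precomputed number per hosting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hosting:
    # Hypothetical record shape; the real catalog schema is not shown on the page.
    model: str
    provider: str
    blended_usd_per_1m: float  # assumed precomputed; the blend formula is not stated
    tags: frozenset


def cheapest_board(hostings, wanted_tags, limit=10):
    """Keep the cheapest hosting per model among models carrying any wanted tag."""
    wanted = frozenset(wanted_tags)
    best = {}
    for h in hostings:
        if not (h.tags & wanted):
            continue  # model is not tagged for this board
        current = best.get(h.model)
        if current is None or h.blended_usd_per_1m < current.blended_usd_per_1m:
            best[h.model] = h
    # Sort by price; break ties by model name so the order is deterministic.
    ranked = sorted(best.values(), key=lambda h: (h.blended_usd_per_1m, h.model))
    return ranked[:limit]
```

Calling `cheapest_board(hostings, {"chat", "multilingual"})` would return at most ten `Hosting` rows, one per model. The tie-break by model name is an arbitrary choice made for this sketch; the page does not say how equal prices are ordered.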

Cheapest Chinese reasoning models

Lowest blended price for reasoning / chain-of-thought / tool-use models. Same pricing methodology as the chat board.

| # | Model | Cheapest / 1M tokens | Via | Context |
|---|---|---|---|---|
| 1 | GLM-4-Air (Zhipu AI) | $0.07 | Zhipu AI | 131K |
| 2 | Yi Lightning (01.AI) | $0.14 | 01.AI | 16K |
| 3 | Doubao Pro 32K (ByteDance) | $0.15 | ByteDance Doubao (Volcengine) | 33K |
| 4 | DeepSeek V3 (DeepSeek AI · open-weight) | $0.48 | DeepSeek | 66K |
| 5 | DeepSeek Chat V3.1 (DeepSeek · open-weight) | $0.48 | DeepSeek | 131K |
| 6 | Llama 3.3 70B Instruct (Meta · open-weight) | $0.88 | Together.ai | 131K |
| 7 | Qwen3 72B Instruct (Alibaba · open-weight) | $0.90 | Together.ai | 131K |
| 8 | DeepSeek R1 (DeepSeek AI · open-weight) | $0.96 | DeepSeek | 66K |
| 9 | Qwen 2.5 Max (Alibaba Cloud) | $2.80 | Alibaba Cloud DashScope | 33K |
| 10 | Llama 3.1 405B Instruct (Meta · open-weight) | $3.00 | Fireworks.ai | 131K |

Longest context windows

Chat / reasoning / multimodal models ranked by context window. Useful for document Q&A, whole-repo code review, or long transcripts. Embedding, image-gen, and video model types are excluded because their 'context' is a different quantity.

| # | Model | Cheapest / 1M tokens | Via | Context |
|---|---|---|---|---|
| 1 | Llama 3.3 70B Instruct (Meta · open-weight) | $0.88 | Together.ai | 131K |
| 2 | Qwen3 72B Instruct (Alibaba · open-weight) | $0.90 | Together.ai | 131K |
| 3 | DeepSeek Chat V3.1 (DeepSeek · open-weight) | $0.48 | DeepSeek | 131K |
| 4 | GLM-4-Air (Zhipu AI) | $0.07 | Zhipu AI | 131K |
| 5 | Llama 3.1 405B Instruct (Meta · open-weight) | $3.00 | Fireworks.ai | 131K |
| 6 | DeepSeek V3 (DeepSeek AI · open-weight) | $0.48 | DeepSeek | 66K |
| 7 | DeepSeek R1 (DeepSeek AI · open-weight) | $0.96 | DeepSeek | 66K |
| 8 | Qwen 2.5 Max (Alibaba Cloud) | $2.80 | Alibaba Cloud DashScope | 33K |
| 9 | Doubao Pro 32K (ByteDance) | $0.15 | ByteDance Doubao (Volcengine) | 33K |
| 10 | Doubao Lite 32K (ByteDance) | $0.05 | ByteDance Doubao (Volcengine) | 33K |
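
A minimal sketch of the context-window ranking described for this board, assuming each model is a plain dict with hypothetical `name`, `type`, and `context_tokens` keys. The type labels in `EXCLUDED_TYPES` are illustrative stand-ins for the catalog's real values. How ties are ordered is not stated on the page; here equal context sizes simply keep their input order.

```python
# Hypothetical type labels; the catalog's actual values are not shown on the page.
EXCLUDED_TYPES = {"embedding", "image-gen", "video"}


def longest_context(models, limit=10):
    """Rank non-excluded models by context window, largest first."""
    eligible = [m for m in models if m["type"] not in EXCLUDED_TYPES]
    # sorted() is stable, so models with equal context keep their input order.
    ranked = sorted(eligible, key=lambda m: m["context_tokens"], reverse=True)
    return ranked[:limit]
```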

Open-weight Chinese models

Models that ship weights you can self-host. Ordered by context window, largest first.

| # | Model | Cheapest / 1M tokens | Via | Context |
|---|---|---|---|---|
| 1 | Llama 3.3 70B Instruct (Meta · open-weight) | $0.88 | Together.ai | 131K |
| 2 | Qwen3 72B Instruct (Alibaba · open-weight) | $0.90 | Together.ai | 131K |
| 3 | DeepSeek Chat V3.1 (DeepSeek · open-weight) | $0.48 | DeepSeek | 131K |
| 4 | Llama 3.1 405B Instruct (Meta · open-weight) | $3.00 | Fireworks.ai | 131K |
| 5 | DeepSeek V3 (DeepSeek AI · open-weight) | $0.48 | DeepSeek | 66K |
| 6 | DeepSeek R1 (DeepSeek AI · open-weight) | $0.96 | DeepSeek | 66K |

Best Chinese coding models

Filter: `code` capability. Ranked by each model's top HumanEval / LiveCodeBench / SWE-bench score. Models with no such score are hidden to avoid a misleading placeholder.

| # | Model | Cheapest / 1M tokens | Via | Context | Score |
|---|---|---|---|---|---|
| 1 | DeepSeek R1 (DeepSeek AI · open-weight) | $0.96 | DeepSeek | 66K | 89.1% (HumanEval) |
| 2 | Llama 3.3 70B Instruct (Meta · open-weight) | $0.88 | Together.ai | 131K | 88.4% (HumanEval) |
| 3 | DeepSeek V3 (DeepSeek AI · open-weight) | $0.48 | DeepSeek | 66K | 82.5% (HumanEval) |
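
One way to read the ranking rule above is: take each model's highest score among the three coding benchmarks, drop models that have none, and sort descending. The sketch below follows that reading. The input shape (a mapping from model name to per-benchmark percentages) and the function name are assumptions, and the input is assumed to be already filtered to the `code` capability.

```python
CODING_BENCHMARKS = ("HumanEval", "LiveCodeBench", "SWE-bench")


def coding_board(models):
    """models: hypothetical mapping of model name -> {benchmark name: score in percent}."""
    rows = []
    for name, scores in models.items():
        available = {
            bench: score
            for bench, score in scores.items()
            if bench in CODING_BENCHMARKS and score is not None
        }
        if not available:
            continue  # hide the model rather than show a placeholder score
        bench, score = max(available.items(), key=lambda item: item[1])
        rows.append((name, bench, score))
    # Highest top score first.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```

Note that under this reading, scores from different benchmarks are compared directly, so a model's rank depends on which benchmark happens to be its best.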

China-only hosted models

Models for which every published hosting is on a provider without an overseas node. Intended for CN-mainland users who don't want cross-border egress.

| # | Model | Cheapest / 1M tokens | Via | Context |
|---|---|---|---|---|
| 1 | Doubao Lite 32K (ByteDance) | $0.05 | ByteDance Doubao (Volcengine) | 33K |
| 2 | GLM-4-Air (Zhipu AI) | $0.07 | Zhipu AI | 131K |
| 3 | Doubao Pro 32K (ByteDance) | $0.15 | ByteDance Doubao (Volcengine) | 33K |
| 4 | DeepSeek Chat V3.1 (DeepSeek · open-weight) | $0.48 | DeepSeek | 131K |
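
The "every published hosting" condition is an all-of check rather than an any-of check: a single hosting on a provider with an overseas node disqualifies the model. A rough sketch, assuming hostings arrive as hypothetical `(model, provider, has_overseas_node)` tuples:

```python
def china_only_models(hostings):
    """Return models whose every hosting is on a provider with no overseas node.

    hostings: iterable of (model, provider, has_overseas_node) tuples,
    a hypothetical shape chosen for this sketch.
    """
    all_domestic = {}
    for model, _provider, has_overseas_node in hostings:
        # A model stays eligible only while all of its hostings are domestic.
        all_domestic[model] = all_domestic.get(model, True) and not has_overseas_node
    return sorted(model for model, ok in all_domestic.items() if ok)
```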

Per-benchmark top 10

Published benchmark scores, sourced from papers or auditable third-party boards.

HumanEval coding

Higher = better
| # | Model | Score | Conditions | Source |
|---|---|---|---|---|
| 1 | DeepSeek R1 (DeepSeek AI) | 0.89 | 0-shot pass@1 | official |
| 2 | Llama 3.3 70B Instruct (Meta) | 0.88 | 0-shot pass@1 | official |
| 3 | DeepSeek V3 (DeepSeek AI) | 0.82 | 0-shot pass@1 | official |
| 4 | Qwen3 72B Instruct (Alibaba) | 0.82 | 0-shot pass@1 | official |

MMLU knowledge

Higher = better
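| # | Model | Score | Conditions | Source |
|---|---|---|---|---|
| 1 | DeepSeek R1 (DeepSeek AI) | 0.90 | 5-shot | official |
| 2 | DeepSeek V3 (DeepSeek AI) | 0.89 | 5-shot | official |
| 3 | Llama 3.3 70B Instruct (Meta) | 0.87 | 5-shot | official |
| 4 | Qwen3 72B Instruct (Alibaba) | 0.86 | 5-shot | official |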