Models with strong MATH and GSM8K scores. In practice these overlap heavily with Reasoning models: Chinese math specialists often ship as a reasoning-mode variant of a general model.

| Model | Creator | Context | Open weight | Tags |
|---|---|---|---|---|
| Qwen 2.5 Max | Alibaba Cloud | 33K | — | chat · reasoning · code |
| DeepSeek Chat V3.1 | DeepSeek | 131K | Yes | chat · reasoning · code |
| DeepSeek R1 | DeepSeek | 66K | Yes | reasoning · math · code |