Models with strong MATH / GSM8K scores. In practice these overlap heavily with Reasoning models: Chinese math specialists often ship as a reasoning-mode variant of a general model.

| Model | Creator | Context | Open weight | Tags |
|---|---|---|---|---|
| Qwen 2.5 Max | Alibaba Cloud | 33K | — | chat · reasoning · code |
| DeepSeek V3.1 | DeepSeek | 33K | Yes | chat · reasoning · code |
| R1 | DeepSeek | 64K | Yes | reasoning · math · code |