These models are tuned for programming: code completion in editors (Cursor, Cline), terminal agents (OpenCode, Aider), long-horizon refactors, and agentic tool use. When choosing one, look for strong HumanEval or SWE-bench scores and a long context window.

| Model | Creator | Context (tokens) | Open weight | Tags |
|---|---|---|---|---|
| Qwen 2.5 Max | Alibaba Cloud | 33K | — | chat · reasoning · code |
| Qwen3.5-122B-A10B | Alibaba Cloud | 262K | Yes | chat · vision · tool_calling |
| Qwen3.5 397B A17B | Alibaba Cloud | 262K | Yes | chat · reasoning · code |
| Qwen3 72B Instruct | Alibaba Cloud | 131K | Yes | chat · reasoning · multilingual |
| Doubao Pro 32K | ByteDance | 33K | — | chat · reasoning · code |
| DeepSeek V3.1 | DeepSeek | 33K | Yes | chat · reasoning · code |
| DeepSeek R1 | DeepSeek | 64K | Yes | reasoning · math · coding |
| DeepSeek V3 | DeepSeek | 66K | Yes | reasoning · coding · tool_use |
| DeepSeek V4 Flash | DeepSeek | 1049K | Yes | chat · code |
| DeepSeek V4 Pro | DeepSeek | 1049K | Yes | chat · reasoning · code |
| Llama 3.1 405B Instruct | Meta | 131K | Yes | chat · reasoning · code |
| Llama 3.3 70B Instruct | Meta | 131K | Yes | chat · coding · reasoning |
| Kimi K2 0711 | Moonshot AI | 131K | Yes | text generation · instruction following · agentic tasks |
| Kimi K2.5 | Moonshot AI | 262K | Yes | chat · reasoning · code |
| Kimi K2.6 | Moonshot AI | 262K | Yes | chat · reasoning · code |
| Step 3.5 Flash | StepFun (阶跃星辰) | 262K | Yes | chat · reasoning · code |
| GLM-4-Air | Zhipu AI | 131K | — | chat · reasoning · code |
| GLM 5 | Zhipu AI | 203K | Yes | reasoning · coding · tool_calling |
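
The selection advice above (prefer a long context window for coding work) can be applied mechanically to the table. The following is a minimal sketch, not part of any catalog tooling: the `Model` record, the `shortlist` helper, and the 128K default threshold are all hypothetical names and values chosen for illustration, and rows whose "Open weight" column shows a dash are assumed here to be not open weight. Only a handful of rows are copied in; benchmark scores are not in the table, so they are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_k: int          # context window in thousands of tokens, as listed
    open_weight: bool       # assumption: a dash in the table means "not open"
    tags: tuple[str, ...]


# A few rows copied from the table; extend as needed.
MODELS = [
    Model("Qwen 2.5 Max", 33, False, ("chat", "reasoning", "code")),
    Model("DeepSeek V4 Pro", 1049, True, ("chat", "reasoning", "code")),
    Model("Llama 3.3 70B Instruct", 131, True, ("chat", "coding", "reasoning")),
    Model("Kimi K2.6", 262, True, ("chat", "reasoning", "code")),
    Model("GLM 5", 203, True, ("reasoning", "coding", "tool_calling")),
]

# The table spells the programming tag both ways, so accept either.
CODE_TAGS = {"code", "coding"}


def shortlist(models, min_context_k=128, open_weight_only=True):
    """Return coding-tagged models that meet the context threshold,
    largest context window first. The threshold is a hypothetical default."""
    picked = [
        m for m in models
        if m.context_k >= min_context_k
        and (m.open_weight or not open_weight_only)
        and CODE_TAGS & set(m.tags)
    ]
    return sorted(picked, key=lambda m: m.context_k, reverse=True)


if __name__ == "__main__":
    for m in shortlist(MODELS):
        print(f"{m.name}: {m.context_k}K")
```

Lowering `min_context_k` or passing `open_weight_only=False` widens the shortlist to include the shorter-context and closed-weight entries.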