Models tuned for programming: code completion in editors (Cursor, Cline), terminal agents (OpenCode, Aider), long-horizon refactors, and agentic tool use. When choosing, look for strong HumanEval or SWE-bench scores and a context window long enough to hold the files you expect to work on.

| Model | Creator | Context | Open weight | Tags |
|---|---|---|---|---|
| Qwen 2.5 Max | Alibaba Cloud | 33K | — | chat · reasoning · code |
| Qwen3 72B Instruct | Alibaba Cloud | 131K | Yes | chat · reasoning · multilingual |
| Doubao Pro 32K | ByteDance | 33K | — | chat · reasoning · code |
| DeepSeek Chat V3.1 | DeepSeek | 131K | Yes | chat · reasoning · code |
| DeepSeek R1 | DeepSeek | 66K | Yes | reasoning · math · code |
| DeepSeek V3 | DeepSeek | 66K | Yes | reasoning · code · tool_use |
| Llama 3.1 405B Instruct | Meta | 131K | Yes | chat · reasoning · code |
| Llama 3.3 70B Instruct | Meta | 131K | Yes | chat · code · reasoning |
| GLM-4-Air | Zhipu AI | 131K | — | chat · reasoning · code |
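
To narrow the table down by the criteria above, a small filter can help. The following is a minimal sketch, not part of any catalog API: the `MODELS` list and the `shortlist` helper are hypothetical names, the rows are copied by hand from the table, and a `—` in the "Open weight" column is assumed to mean "not listed as open weight". Benchmark scores are not in the table, so they are not part of the filter.

```python
# Rows copied from the table above: (model, context in thousands of tokens, open weight).
# Assumption: "—" in the "Open weight" column is treated as False (not listed as open).
MODELS = [
    ("Qwen 2.5 Max", 33, False),
    ("Qwen3 72B Instruct", 131, True),
    ("Doubao Pro 32K", 33, False),
    ("DeepSeek Chat V3.1", 131, True),
    ("DeepSeek R1", 66, True),
    ("DeepSeek V3", 66, True),
    ("Llama 3.1 405B Instruct", 131, True),
    ("Llama 3.3 70B Instruct", 131, True),
    ("GLM-4-Air", 131, False),
]


def shortlist(models, min_context_k=100, open_only=True):
    """Return names of models with at least min_context_k thousand tokens of context.

    When open_only is True, models not marked open weight are dropped.
    """
    return [
        name
        for name, context_k, is_open in models
        if context_k >= min_context_k and (is_open or not open_only)
    ]


if __name__ == "__main__":
    for name in shortlist(MODELS):
        print(name)
```

Lowering `min_context_k` or passing `open_only=False` widens the shortlist, which is useful when hosted models are acceptable for the task.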