Models that keep accuracy at 128K+ tokens, such as Kimi, GLM-Long, and Qwen2.5-1M. Useful for long-document Q&A, whole-repo code review, and multi-hour transcripts, where chunking would lose the through-line.

| Model | Creator | Context (tokens) | Open weight | Tags |
|---|---|---|---|---|
| Qwen3 72B Instruct | Alibaba | 131K | Yes | chat · reasoning · multilingual |
| DeepSeek V3 | DeepSeek AI | 66K | Yes | reasoning · coding · tool_use |
| Llama 3.3 70B Instruct | Meta | 131K | Yes | chat · coding · reasoning |
| MiniMax M2.7 | MiniMax | 197K | Yes | chat · long_context |
| Kimi K2.5 | Moonshot AI | 262K | Yes | chat · reasoning · code |
| Kimi K2.6 | Moonshot AI | 262K | Yes | chat · reasoning · code |
| GLM 5 | Zhipu AI | 203K | Yes | reasoning · coding · tool_calling |
| GLM 5.1 | Zhipu AI | 203K | Yes | chat · reasoning · long_context |
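
To decide whether a document can go to one of these models whole instead of being chunked, a rough pre-check against the context sizes in the table can help. The sketch below is illustrative only. It assumes that "K" means 1,000 tokens, that English text averages about 4 characters per token (a common rule of thumb, not a property of any particular tokenizer), and that some room should be reserved for the reply. The names `estimate_tokens`, `models_that_fit`, and `reserve_tokens` are hypothetical helpers, not part of any model's API.

```python
# Context windows copied from the table, in thousands of tokens.
# Assumption: "K" is read as 1,000 tokens.
CONTEXT_K = {
    "Qwen3 72B Instruct": 131,
    "DeepSeek V3": 66,
    "Llama 3.3 70B Instruct": 131,
    "MiniMax M2.7": 197,
    "Kimi K2.5": 262,
    "Kimi K2.6": 262,
    "GLM 5": 203,
    "GLM 5.1": 203,
}

# Assumption: rough heuristic for English prose; real tokenizers differ,
# and code or non-English text often uses more tokens per character.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def models_that_fit(text: str, reserve_tokens: int = 4_000) -> list[str]:
    """Return models whose window holds the text plus a reserve for the reply.

    Results are ordered from the smallest sufficient window to the largest.
    """
    needed = estimate_tokens(text) + reserve_tokens
    fits = [name for name, k in CONTEXT_K.items() if k * 1_000 >= needed]
    return sorted(fits, key=lambda name: CONTEXT_K[name])


if __name__ == "__main__":
    transcript = "x" * 600_000  # stand-in for a long transcript
    print(estimate_tokens(transcript))
    print(models_that_fit(transcript))
```

For a real decision, count tokens with the target model's own tokenizer; the character heuristic only tells you whether the input is clearly inside or clearly outside a window.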