Models that keep accuracy at 128K+ tokens, such as Kimi, GLM-Long, and Qwen2.5-1M. Useful for long-document Q&A, whole-repo code review, or multi-hour transcripts, where chunking would lose the through-line.
| Model | Creator | Context (tokens) | Open weights | Tags |
|---|---|---|---|---|
| Qwen3 72B Instruct | Alibaba | 131K | Yes | chat · reasoning · multilingual |
| DeepSeek V3 | DeepSeek AI | 66K | Yes | reasoning · coding · tool_use |
| Llama 3.3 70B Instruct | Meta | 131K | Yes | chat · coding · reasoning |
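A practical question when choosing a model from this list is whether a given document fits the context window at all, or whether chunking is unavoidable. A minimal sketch, assuming the common rough heuristic of about 4 characters per token for English prose (the exact ratio depends on the tokenizer, so treat it as an estimate):

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per token for English prose.

    The 4-chars/token ratio is an assumption; real tokenizers vary.
    """
    return max(1, len(text) // 4)


def fits_in_context(text: str, context_tokens: int, reserve: int = 4096) -> bool:
    """True if the document, plus a reserve for the model's reply,
    fits inside the given context window."""
    return estimate_tokens(text) + reserve <= context_tokens


# ~500K characters -> ~125K estimated tokens
doc = "word " * 100_000
print(fits_in_context(doc, 131_072))  # fits a 131K window with room to reply
print(fits_in_context(doc, 66_000))   # would need chunking at 66K
```

For anything close to the limit, count tokens with the model's actual tokenizer rather than this heuristic, since the margin for the reply is easy to underestimate.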