AgiBot · 智元机器人
Embodied robotics model: it runs on robot hardware rather than behind a token API, so no per-1M-token pricing applies. For API-priced LLMs, see the model catalog.
| Field | Value |
| --- | --- |
| Creator | AgiBot (智元机器人) |
| Architecture | Vision-Language-Latent-Action (ViLLA) |
| Embodiment | Generalist (cross-embodiment: arms, humanoid) |
| Released | 2025-03-10 |
GO-1 is AgiBot's generalist embodied foundation model, notable for introducing the ViLLA (Vision-Language-Latent-Action) architecture, an evolution of the standard Vision-Language-Action (VLA) design. It combines three components: a Vision-Language Model, a Latent Planner trained on cross-embodiment and human-operation data, and an Action Expert (MoE). Latent action tokens are quantized with a VQ-VAE. A lighter GO-1-Air variant is also published, and open weights are available on Hugging Face.
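To make the quantization step concrete, here is a minimal sketch of the nearest-codebook lookup that a VQ-VAE uses to turn a continuous latent vector into a discrete token. It is an illustration of the general technique only, not AgiBot's implementation: the codebook values, its size, the latent dimension, and the names `squared_distance`, `quantize`, `CODEBOOK`, and `latents` are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latent: Sequence[float], codebook: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return (index, code vector) of the codebook entry nearest to `latent`."""
    index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(latent, codebook[i]),
    )
    return index, list(codebook[index])


# Toy codebook: 4 entries of dimension 2. Real codebooks are learned during
# training and are far larger; these values are purely illustrative.
CODEBOOK = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical continuous latents, standing in for encoder outputs.
latents = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.7]]

# Each latent is replaced by the index of its nearest code: a discrete token.
tokens = [quantize(z, CODEBOOK)[0] for z in latents]
print(tokens)  # [1, 2, 3]
```

In a trained VQ-VAE the codebook is learned jointly with the encoder and decoder. The sketch covers only the inference-time lookup that produces discrete token indices.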
Best suited to cross-embodiment manipulation research and to teams adapting a generalist base model to their own robot arms or humanoid platforms.