Shanghai AI Lab (InternRobotics) · 上海人工智能实验室
Embodied / robotics model: it runs on robot hardware rather than behind a token API, so there is no per-1M-token pricing. For API-priced LLMs, see the model catalog.
| Field | Value |
| --- | --- |
| Creator | Shanghai AI Lab (InternRobotics) (上海人工智能实验室) |
| Architecture | Vision-Language-Action (VLA) |
| Embodiment | Manipulation |
InternVLA-M1 is a vision-language-action model from Shanghai AI Lab's InternRobotics group, the same lab behind the widely used InternVL multimodal models. It is one of several embodied checkpoints the lab publishes (alongside InternVLA-A1-3B and the InternVLA-N1 system) and extends the InternVL multimodal foundation toward downstream robot manipulation. The weights are openly available on Hugging Face.
It is suited to manipulation policy research that builds on the InternVL multimodal stack and to academic embodied-AI benchmarking.