Chinese Embodied AI & Robotics Foundation Models

Vision-Language-Action (VLA) and embodied foundation models from Chinese labs and robotics companies. These run on robots, not behind a token API, so this is a research-and-capability map rather than a pricing comparison.

Not an API catalog. Unlike the LLM API models, these are open-weight or hardware-bound research models. You download the weights and run them on robot hardware, so there is no per-token pricing. Every spec below links to its primary source.

Vision-Language-Latent-Action (ViLLA)

AgiBot GO-1
AgiBot · 智元机器人
First generalist embodied foundation model on the ViLLA architecture.

Vision-Language-Action (VLA)

Unitree UnifoLM-VLA (CC BY-NC-SA 4.0)
Unitree Robotics · 宇树科技
Open VLA from the humanoid-hardware leader, tuned for embodied control.
InternVLA-M1
Shanghai AI Lab (InternRobotics) · 上海人工智能实验室
VLA from the InternVL team's embodied research line.
Galbot GraspVLA / TrackVLA
Galbot · 银河通用
Narrow-remit VLAs (grasping, tracking, retail) behind the G1 robot.
ByteDance GR-Dexter
ByteDance Seed · 字节跳动 Seed
Hardware-model-data framework for high-DoF dexterous manipulation.
Tencent HY-Embodied
Tencent Hunyuan · 腾讯混元
Tencent Hunyuan's entry into embodied foundation models.

Diffusion Transformer

RDT-1B (MIT)
Tsinghua University (TSAIL) · 清华大学
1.2B diffusion foundation model for bimanual manipulation, MIT-licensed.