qwen3-vl by Alibaba Cloud

7 tracked versions of the qwen3-vl family. Timeline ordered newest → oldest. Flagship version marked.

Version timeline

  1. Qwen3 VL 235B A22B Instruct (Flagship, Open-weight)
    unknown
    Context: 262K

    Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language use (VQA, document parsing, chart/table...

  2. Qwen3-VL-235B-A22B Thinking
    unknown
    Context: 131K

    Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is optimized for multimodal reasoning in STEM and math....

  3. Qwen3-VL-30B-A3B-Instruct
    unknown
    Context: 131K

    Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception...

  4. Qwen3-VL-30B-A3B-Thinking
    unknown
    Context: 131K

    Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks. It excels...

  5. Qwen3-VL-32B-Instruct
    unknown
    Context: 131K

    Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...

  6. Qwen3-VL-8B-Instruct
    unknown
    Context: 131K

    Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...

  7. Qwen3-VL-8B-Thinking
    unknown
    Context: 131K

    Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences. It integrates enhanced multimodal alignment and...

About Alibaba Cloud

See every provider hosting Alibaba Cloud models at /provider/alibaba.