In the open-source large-model arena, the European standout Mistral AI has once again demonstrated its remarkable pace of iteration.

On March 16 local time, Mistral AI officially released Mistral Small4, the lab's first truly "all-round" large model: for the first time, flagship-level reasoning, multimodal understanding, and strong coding ability are combined in a single model. For developers, this means no longer having to choose among specialized vertical models; the new Small4 covers all of these use cases at once.


Mistral Small4 adopts an advanced MoE (Mixture of Experts) architecture:

  • Core parameters: 119B total parameters, with only 6B active per token, significantly improving inference efficiency while preserving quality.

  • Extended context: a 256k context window lets it handle entire technical documents or large codebases in one pass.

  • Flexible modes: Supports both a fast-response mode and a deep-reasoning mode, and the weights are open-sourced under the permissive Apache 2.0 license.
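The efficiency figures above come from the sparse nature of MoE: each token is routed to only a few experts, so only a fraction of the total parameters (here, 6B of 119B) participate in any single forward pass. Mistral has not published Small4's exact routing scheme, so the following is only an illustrative sketch of generic top-k expert routing:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route each token to its top_k experts and mix their outputs.

    x:       (seq_len, d_model) token activations
    gate_w:  (d_model, n_experts) router weights
    experts: list of callables, one small feed-forward network per expert
    """
    logits = x @ gate_w                            # (seq_len, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # top_k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        # softmax over only the selected experts' logits
        sel = logits[t, top[t]]
        w = np.exp(sel - sel.max())
        w /= w.sum()
        # each token touches just top_k experts, not all of them
        for weight, e in zip(w, top[t]):
            out[t] += weight * experts[e](x[t])
    return out

# Toy demo: 4 experts, each a random linear map (purely illustrative sizes)
rng = np.random.default_rng(0)
d, n_exp = 8, 4
experts = [(lambda W: (lambda v: v @ W))(rng.normal(size=(d, d)))
           for _ in range(n_exp)]
gate_w = rng.normal(size=(d, n_exp))
x = rng.normal(size=(5, d))
y = moe_forward(x, gate_w, experts)
print(y.shape)  # (5, 8)
```

With top_k=2 out of 4 experts, each token exercises only half the expert parameters; scale the same idea up and a 119B-parameter model can run with a 6B-parameter active footprint per token.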

In terms of performance, Mistral Small4 is a major step up from its predecessor. Official figures show that in latency-optimized mode, end-to-end completion time drops by 40%, while in throughput-optimized mode it serves three times as many requests per second as Small3. Compared against external models, its results on three core benchmarks are on par with OpenAI's GPT-OSS120B.

Deployment requirements and hardware recommendations:

To get the most out of the model, Mistral AI provides clear hardware guidance: the minimum configuration is 4× HGX H100 or 1× DGX B200, while for the best experience the company recommends 4× HGX H200 or 2× DGX B200.
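A rough back-of-the-envelope check helps interpret those numbers (my assumptions, not official figures: an HGX H100 node carries 8 GPUs with 80 GB of HBM each, and 16-bit weights cost 2 bytes per parameter, so 119B parameters occupy about 238 GB before any KV cache or runtime buffers):

```python
def min_nodes(total_params_b, bytes_per_param, gpu_mem_gb, gpus_per_node,
              overhead=1.5):
    """Estimate how many multi-GPU nodes are needed just to hold the weights.

    `overhead` multiplies the raw weight size to leave room for activations
    and runtime buffers; 1.5x is a rough assumption, not a rule.
    """
    weights_gb = total_params_b * bytes_per_param   # params in billions -> GB
    needed_gb = weights_gb * overhead
    node_mem_gb = gpu_mem_gb * gpus_per_node
    return int(-(-needed_gb // node_mem_gb))        # ceiling division

# Mistral Small4: 119B params, BF16 weights, HGX H100 node (8 x 80 GB)
print(min_nodes(119, 2, 80, 8))  # -> 1
```

By this crude estimate the weights alone fit on a single HGX H100 node, so the larger official minimum of four nodes presumably budgets for the KV cache of the 256k context window and for serving throughput, both of which this sketch deliberately ignores.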

With the release of Mistral Small4, Mistral AI