Ultra-Low Latency in Focus: Mistral Launches Two New Speech-to-Text AI Models

French AI leader Mistral AI has officially launched two new speech-to-text models, aiming to redefine industry standards for transcription speed, privacy protection, and cost-effectiveness.

The two new models, Voxtral Mini Transcribe V2 and Voxtral Realtime, both belong to the Voxtral Transcribe 2 family. They offer high transcription quality, speaker diarization (labeling who spoke when), and very low latency, and they target business scenarios such as virtual assistants, call center automation, and compliance recording.

Key Product Highlights:

Voxtral Realtime (Real-time Processing): Designed for live audio, it uses a streaming architecture whose latency can be configured down to 200 milliseconds. At a 480-millisecond delay, the reported word error rate is only 1%-2%, close to the accuracy of offline transcription. With only 4 billion parameters, the model can run on local devices such as smartphones or laptops, so audio does not have to leave the device. It is open-sourced on Hugging Face under the Apache 2.0 license, and its API is priced at $0.006 per minute.
Voxtral Mini Transcribe V2 (Batch Processing): Designed for pre-recorded files, it accepts up to 3 hours of audio in a single request and provides speaker labels and timestamps. It performs well on the FLEURS word error rate benchmark, and its API is priced at $0.003 per minute, which Mistral AI calls the most cost-effective transcription option currently on the market.
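
To make the quoted prices concrete, here is a minimal sketch that estimates API cost from audio length and counts how many batch requests a long recording would need under the 3-hour limit. The function names and the "realtime"/"batch" labels are illustrative only and are not part of any Mistral API; the figures are simply the ones quoted above, and actual billing rules (rounding, minimum charges) may differ.

```python
import math

# Per-minute API prices quoted in the article, in USD.
# The dictionary keys are illustrative labels, not official model identifiers.
PRICE_PER_MINUTE = {
    "realtime": 0.006,  # Voxtral Realtime
    "batch": 0.003,     # Voxtral Mini Transcribe V2
}

# The article states that one batch request accepts up to 3 hours of audio.
MAX_BATCH_REQUEST_MINUTES = 3 * 60


def estimate_cost(minutes, mode):
    """Return the estimated API cost in USD for `minutes` of audio."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return round(minutes * PRICE_PER_MINUTE[mode], 4)


def batch_requests_needed(minutes):
    """Return how many batch requests are needed under the 3-hour limit."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return math.ceil(minutes / MAX_BATCH_REQUEST_MINUTES)


if __name__ == "__main__":
    call_center_minutes = 1000
    print(estimate_cost(call_center_minutes, "batch"))
    print(estimate_cost(call_center_minutes, "realtime"))
    print(batch_requests_needed(600))
```

At these rates, the real-time API costs twice as much per minute as the batch API, so pre-recorded material is cheaper to send through the batch model when latency does not matter.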

Both models natively support 13 languages, including Chinese, English, French, and Japanese. Users can currently try them in Mistral AI's Audio Playground or the Le Chat assistant.

Key Points:

🚀 Outstanding Performance: The real-time model's latency can be set as low as 200 ms, while the batch model posts strong word error rate (WER) results.
🔒 Local Deployment: The lightweight 4B-parameter design can run on local devices without uploading audio to the cloud, which helps protect privacy.
💰 High Cost-Effectiveness: The batch transcription API costs as little as $0.003 per minute, a bid for a pricing advantage in the enterprise market.
🌍 Multi-language Support: Natively supports 13 major languages, covering most commercial application scenarios.
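
For readers unfamiliar with the metric, word error rate is commonly defined as the number of word-level substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The sketch below illustrates that general definition with a standard edit-distance computation. It is not Mistral's evaluation code, and real benchmarks such as FLEURS typically normalize text (casing, punctuation) before scoring, which this example skips.

```python
def word_error_rate(reference, hypothesis):
    """Return WER: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Under this definition, a WER of 1%-2% corresponds to roughly one or two word errors per hundred reference words.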
