
Mistral Launches Ultra-Fast Open Source Translation Models That Run on Your Phone

February 6, 2026


French AI startup Mistral has released Voxtral Transcribe 2, a family of speech-to-text models that can transcribe and translate across 13 languages with sub-200 millisecond latency. At just four billion parameters, the models are small enough to run on phones and laptops, marking a significant challenge to larger competitors like Google and OpenAI.

Mistral Takes on AI Giants with Lightning-Fast Translation Models

French AI startup Mistral has released Voxtral Transcribe 2, a new family of speech-to-text models that the company claims will pave the way for seamless real-time conversation between people speaking different languages.

The release includes two models: Voxtral Mini Transcribe V2 for batch processing and Voxtral Realtime for live applications. Both support transcription and translation across thirteen languages: English, Chinese, Hindi, Spanish, Arabic, French, Portuguese, Russian, German, Japanese, Korean, Italian, and Dutch.

Small Models, Big Performance

At just four billion parameters, the models are compact enough to run locally on phones and laptops, a claimed first in the speech-to-text field. Voxtral Realtime offers configurable latency down to sub-200 milliseconds, making it roughly ten times faster than Google's latest translation model, which operates at a two-second delay.

On the FLEURS multilingual speech benchmark, the models achieve approximately four percent word error rate, competitive with or superior to alternatives from OpenAI and Google. Voxtral Mini Transcribe V2 processes audio roughly three times faster than ElevenLabs' Scribe v2 at one-fifth the cost.
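For context, word error rate is the standard speech-recognition metric: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of reference words. A minimal sketch of the computation (illustrative only, not Mistral's or FLEURS's evaluation code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance divided by
    the number of words in the reference transcript."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

# One substituted word out of six reference words -> WER of ~16.7%
print(wer("the cat sat on the mat", "the cat sat on a mat"))
```

A four percent WER therefore means roughly one word in twenty-five is transcribed incorrectly.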

The Efficiency Philosophy

Pierre Stock, Mistral's Vice President of Science Operations, told WIRED that the models are laying the groundwork for fully seamless real-time speech-to-speech translation, predicting the problem would be solved in 2026. Stock also offered a pointed critique of the big-spending approach favoured by American AI labs: "Frankly, too many GPUs makes you lazy. You just blindly test a lot of things, but you don't think what's the shortest path to success."

Open Source and Privacy First

Voxtral Realtime is released under the Apache 2.0 open-weights licence, allowing developers to deploy the model freely. Because the models run on-device, private conversations stay local rather than being uploaded to cloud servers. The models support GDPR and HIPAA-compliant deployments, positioning Mistral as a European alternative to proprietary American AI systems.

Pricing starts at 0.3 cents per minute for batch transcription and 0.6 cents per minute for real-time processing.
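At those per-minute rates, costs stay low even at scale. A quick back-of-the-envelope calculator (rates from the article, currency assumed to be US dollars; the function itself is illustrative):

```python
# Published per-minute rates, in cents (assumed USD)
RATES_CENTS_PER_MIN = {
    "batch": 0.3,      # Voxtral Mini Transcribe V2
    "realtime": 0.6,   # Voxtral Realtime
}

def transcription_cost_usd(minutes: float, tier: str = "batch") -> float:
    """Estimated cost in dollars for `minutes` of audio at the given tier."""
    cents = RATES_CENTS_PER_MIN[tier] * minutes
    return round(cents / 100, 4)

# Ten hours (600 minutes) of audio:
print(transcription_cost_usd(600, "batch"))     # $1.80
print(transcription_cost_usd(600, "realtime"))  # $3.60
```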

Published February 6, 2026 at 1:52pm
