Mistral AI, a French artificial intelligence startup, officially launched its latest document recognition model, OCR 4, on the 23rd of this month, drawing wide attention in the field of optical character recognition. The compact, specialized model supports up to 170 languages across 10 language families and scores 93.07 on the OmniDocBench benchmark. Human evaluators preferred its output over that of competitors such as GPT 5.5 Pro and Gemini 3.1 Pro Preview.

Compact yet comprehensive, covering multiple downstream tasks
OCR 4 does not compete on parameter count; it is an efficient model specialized in document recognition. Beyond the recognized text, it returns bounding-box positions, region classifications, and confidence scores. These outputs can feed a range of downstream workloads, such as semantic chunking for RAG, structured base units for AI agents, and structured content for connectors.
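To illustrate how positions, region classes, and confidence scores could drive semantic chunking for RAG, here is a minimal Python sketch. The `Region` record, its field names, the label strings, and the 0-to-1 confidence scale are all hypothetical; the article does not describe OCR 4's actual response format. The sketch also assumes regions arrive in reading order.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical OCR region record; not OCR 4's real response schema."""
    text: str
    bbox: tuple[float, float, float, float]  # assumed (x0, y0, x1, y1)
    kind: str          # assumed labels such as "heading" or "paragraph"
    confidence: float  # assumed to lie between 0.0 and 1.0


def chunk_regions(regions: list[Region], min_confidence: float = 0.8) -> list[str]:
    """Group regions into chunks, starting a new chunk at each heading.

    Regions below min_confidence are skipped so that low-quality text
    does not end up in a retrieval index.
    """
    chunks: list[str] = []
    current: list[str] = []
    for region in regions:
        if region.confidence < min_confidence:
            continue  # drop text the recognizer was unsure about
        if region.kind == "heading" and current:
            chunks.append("\n".join(current))  # close the previous chunk
            current = []
        current.append(region.text)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Splitting at headings keeps each chunk aligned with one section of the document, and the confidence threshold is a tunable trade-off between coverage and noise.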
On pricing, basic OCR 4 API calls cost $4 per thousand pages, with a 50% discount for batch processing. Document AI is priced at $5 per thousand pages.
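As a rough worked example of these prices, the Python sketch below estimates the cost of a job from its page count. The function and constant names are illustrative. It assumes billing is linear in page count, and it leaves it to the caller to decide where the batch discount applies, since the article states the discount only for the basic OCR API.

```python
from decimal import Decimal

# Per-1,000-page prices in USD, as quoted in the article.
OCR_PRICE_PER_1K = Decimal("4")
DOCUMENT_AI_PRICE_PER_1K = Decimal("5")
BATCH_DISCOUNT = Decimal("0.5")  # 50% off with batch processing


def estimate_cost(pages: int, price_per_1k: Decimal, batch: bool = False) -> Decimal:
    """Estimate the USD cost of processing `pages` pages.

    Assumes cost scales linearly with page count (an assumption; the
    article does not describe rounding or minimum charges).
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    cost = price_per_1k * pages / 1000
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return cost.quantize(Decimal("0.01"))  # round to whole cents
```

Under these assumptions, 250,000 pages would cost $1,000.00 through the basic API, $500.00 with batch processing, and $1,250.00 at the document AI rate.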
Mistral AI, one of Europe's most prominent AI startups, previously gained ground in the global market with a dual strategy of open-source and closed-source models. The launch of OCR 4 extends the company's capabilities from general-purpose large language models into the vertical of intelligent document processing, putting it in direct competition with giants such as OpenAI and Google at the foundational tooling level.
