Baidu AI Team Launches PaddleOCR 3.1 with Enhanced Capabilities and MCP Support

AIbase

Published in AI News · 4 minute read · Jul 8, 2025

On July 7, the Baidu AI team announced the official release of PaddleOCR 3.1, which brings three major upgrades: multilingual recognition, complex document translation, and large model connectivity. The new version supports text recognition in 37 languages, with an average accuracy improvement of over 30%. It also introduces a document translation pipeline and an MCP server to help developers build AI applications efficiently.

To address multilingual needs in global scenarios, PaddleOCR 3.1 adds the PP-OCRv5 multilingual model, which covers 37 languages, including French, Spanish, and Russian. By drawing on the visual and text understanding capabilities of the ERNIE 4.5 multimodal large model, high-confidence text detection and data annotation can be completed automatically, which addresses the scarcity of multilingual data. Test data shows that the new model improves recognition accuracy by more than 30% in Latin and East Slavic language scenarios. Among the reported figures, the error rate for Korean recognition dropped from 8.7% to 2.1%, and the parsing speed for Russian documents with complex layouts increased twofold.
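The article does not say which metric lies behind the reported error rates. As a minimal sketch, assuming a character error rate (edit distance divided by reference length), the calculation and the relative reduction implied by the 8.7% and 2.1% figures could look like this. The function names and the sample strings are illustrative, not part of PaddleOCR.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference char
                            curr[j - 1] + 1,      # insert a hypothesis char
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the reference length (assumed metric)."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate shrank, relative to its old value."""
    return (before - after) / before


# One wrong character out of six in a made-up Russian sample.
print(f"{character_error_rate('привет', 'привел'):.3f}")  # 0.167
# Figures quoted in the article: 8.7% before, 2.1% after.
print(f"{relative_reduction(8.7, 2.1):.1%}")  # 75.9%
```

Read this way, the drop from 8.7% to 2.1% is a reduction of 6.6 percentage points, or roughly three quarters of the original errors.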


Building on the PP-StructureV3 document parsing engine and the ERNIE large model, PaddleOCR 3.1 introduces the PP-DocTranslation pipeline. It recognizes complex elements such as tables, formulas, and handwritten text in PDFs and images, converts them into Markdown, and then translates the result into other languages. For professional fields such as law and medicine, users can upload a glossary that maps source terms to their required translations, so that key vocabulary is translated precisely. In one example, a multinational pharmaceutical company using this feature improved the efficiency of translating drug instructions by 40% and achieved 99.2% consistency in professional terminology.
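The article does not describe how the glossary is applied internally. One common approach with LLM-based translation, shown here only as a sketch under that assumption, is to inject the matching glossary entries into the translation prompt and then check the output for the required terms. All names below (`relevant_terms`, `build_prompt`, `term_consistency`) and the sample glossary are hypothetical and are not the PP-DocTranslation API.

```python
def relevant_terms(chunk: str, glossary: dict[str, str]) -> dict[str, str]:
    """Keep only glossary entries whose source term occurs in the chunk."""
    lowered = chunk.lower()
    return {src: tgt for src, tgt in glossary.items() if src.lower() in lowered}


def build_prompt(chunk: str, glossary: dict[str, str], target_lang: str) -> str:
    """Compose a translation prompt that pins the applicable glossary terms."""
    terms = relevant_terms(chunk, glossary)
    lines = [
        f"Translate the following Markdown into {target_lang}.",
        "Preserve all Markdown structure (headings, tables, formulas).",
    ]
    if terms:
        lines.append("Use exactly these translations for the listed terms:")
        lines.extend(f"- {src} => {tgt}" for src, tgt in terms.items())
    lines.extend(["", chunk])
    return "\n".join(lines)


def term_consistency(source: str, translation: str,
                     glossary: dict[str, str]) -> float:
    """Share of applicable glossary terms whose required translation
    appears in the output (1.0 when no term applies)."""
    terms = relevant_terms(source, glossary)
    if not terms:
        return 1.0
    output = translation.lower()
    hits = sum(1 for tgt in terms.values() if tgt.lower() in output)
    return hits / len(terms)


# Hypothetical English-to-French glossary for a drug leaflet.
glossary = {
    "contraindications": "contre-indications",
    "adverse reactions": "effets indésirables",
    "dosage": "posologie",
}
source = "## Contraindications\nAdverse reactions are rare."
print(build_prompt(source, glossary, "French"))

# A made-up model output that misses one required term.
output = "## Contre-indications\nLes réactions indésirables sont rares."
print(term_consistency(source, output, glossary))  # 0.5
```

A consistency score like the one computed here is one plausible reading of the "consistency in professional terminology" figure; the article does not define how that figure was measured.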

To lower the barrier to AI application development, PaddleOCR 3.1 introduces an MCP (Model Context Protocol) server, which lets downstream applications integrate OCR capabilities through a standardized protocol. Developers can set up an MCP service in a few steps and access core functions such as image text recognition and document layout analysis through the local Python library, the PaddlePaddle AI Studio (Xinghe) community, or self-hosted services.
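MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with the `tools/call` method. The sketch below builds such a request and reads the text items from a result, using only the standard library. The tool name `ocr`, the argument name `input_data`, and the sample response are assumptions made for illustration; the names actually exposed by the PaddleOCR MCP server should be taken from its documentation or from a `tools/list` call.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message, ensure_ascii=False)


def extract_text(response_json: str) -> str:
    """Join the text items of a `tools/call` result; raise on errors."""
    response = json.loads(response_json)
    if "error" in response:  # protocol-level JSON-RPC error
        raise RuntimeError(response["error"].get("message", "unknown error"))
    result = response["result"]
    if result.get("isError"):  # tool ran but reported a failure
        raise RuntimeError("tool reported an error")
    return "\n".join(
        item["text"]
        for item in result.get("content", [])
        if item.get("type") == "text"
    )


# Hypothetical tool and argument names; the URL is a placeholder.
request = make_tool_call(1, "ocr", {"input_data": "https://example.com/invoice.png"})
print(request)

# Hand-written sample response in the shape MCP uses for tool results.
sample_response = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {"type": "text", "text": "Invoice No. 42"},
            {"type": "text", "text": "Total: 10.00"},
        ],
        "isError": False,
    },
})
print(extract_text(sample_response))
```

In practice an MCP client library or host application handles this framing and the transport, so application code rarely builds these messages by hand; the sketch only shows what travels over the wire.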

Open-source repository: https://github.com/PaddlePaddle/PaddleOCR

Xbox Executive's Suggestion to Use AI to Deal with Layoff Emotions Sparks Controversy

Microsoft announced that it would lay off 9,000 employees worldwide. Xbox executive Matt Turnbull suggested that laid-off employees use AI tools such as ChatGPT to cope with their emotions, and shared AI prompt templates to help with career planning. The suggestion was criticized as distasteful, with commenters arguing that AI cannot address the emotional trauma caused by layoffs. This round of layoffs affects 4% of Microsoft's employees, and the gaming department may be hit the hardest. The incident reflects a broader debate about employee mental health support and the boundaries of AI use amid the current wave of layoffs at tech companies.

Grok 4 to Be Released: Musk Confirms X Platform Live Stream on Wednesday Night

Elon Musk announced that xAI's next-generation large model, Grok 4, will be released at 8 PM Pacific Time this Wednesday (11 AM Beijing Time on Thursday), and that the launch will be live-streamed on the X platform. Musk previously said that Grok has improved significantly, and the release is expected to showcase xAI's latest progress in AI.

Google Open Sources MCP Toolbox for Databases: Unlock the Infinite Possibilities of AI and Databases with 10 Lines of Code

Google has released MCP Toolbox for Databases, an open-source tool that simplifies the integration of AI agents with SQL databases. The tool connects to a database with about 10 lines of code and handles connection pooling, authentication, and schema introspection. It is compatible with various Google Cloud databases. As an open-source project it lowers the development barrier, but it currently supports mainly databases in the Google ecosystem, so broader compatibility may be needed in the future. The tool has the potential to become a standard component of AI development.

