Tsinghua University and Others Release UltraRAG 2.1! The World's First Multimodal RAG Framework Based on MCP Architecture, Build an Intelligent Retrieval System with a YAML File

Retrieval-Augmented Generation (RAG) technology has taken a notable step forward. UltraRAG 2.1, developed jointly by the THUNLP Lab at Tsinghua University, the NEUIR Lab at Northeastern University, OpenBMB, and AI9Stars, has been officially released and is billed as the world's first open-source RAG framework based on the Model Context Protocol (MCP) architecture. This version greatly simplifies building multimodal intelligent retrieval systems: researchers can set up multi-stage reasoning, generation, and evaluation pipelines with a few lines of YAML configuration, without writing a single line of code, which significantly lowers the technical barrier.

Three Core Upgrades Define the Next Generation RAG Standard

Native Multimodal Support, Closing the Text-Image Retrieval Loop

UltraRAG 2.1 includes an integrated Retriever-Generation-Evaluation pipeline that handles not only text but also multimodal data such as images and PDFs. Its VisRAG Pipeline can directly parse local PDF documents, automatically extract text and charts, build cross-modal indexes, and support hybrid "image-to-text" and "text-to-image" retrieval. This suits high-value scenarios such as scientific paper analysis and technical manual Q&A.

Automatic Knowledge Base Construction, Deep Integration with MinerU

The framework supports smart parsing and semantic chunking of multiple formats, including Word, PDF, and Markdown, and integrates the open-source document processing tool MinerU so that an enterprise-grade private knowledge base can be built with one click. Users do not need to manually clean or annotate data; the system completes the structured processing automatically, which makes knowledge management considerably more efficient.
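To make the idea of chunking concrete, here is a minimal Python sketch that groups consecutive paragraphs into size-bounded chunks. It is a deliberately simplified stand-in: it splits on blank lines rather than on semantic boundaries, and the function name `chunk_paragraphs` and the `max_chars` parameter are hypothetical, not part of UltraRAG or MinerU.

```python
from typing import List


def chunk_paragraphs(text: str, max_chars: int = 200) -> List[str]:
    """Group consecutive paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    Illustrative only: real semantic chunking would use document structure
    or embeddings rather than blank lines.
    """
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue  # skip empty paragraphs produced by extra blank lines
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate  # still fits, or first paragraph of a chunk
        else:
            chunks.append(current)  # close the chunk and start a new one
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk would then be embedded and indexed; keeping paragraphs intact is one simple way to avoid cutting a sentence in half at a chunk boundary.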

Unified Workflow + Standardized Evaluation, Results are Explainable and Optimizable

UltraRAG 2.1 provides an end-to-end visual RAG workflow that is compatible with various retrieval engines (such as Elasticsearch and FAISS) and generation models (Llama, Qwen, Kimi, etc.). It also introduces a standardized evaluation system that quantifies result quality along dimensions such as relevance, fidelity, and fluency, so developers can identify bottlenecks directly and iterate quickly.
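As a rough illustration of what quantifying a dimension such as fidelity can mean, the Python sketch below scores how much of an answer is backed by the retrieved context. This is a crude token-overlap proxy chosen for brevity, not the metric UltraRAG uses; the function name `token_support` is hypothetical.

```python
def token_support(answer: str, context: str) -> float:
    """Return the fraction of answer tokens that also occur in the context.

    A crude fidelity proxy: 1.0 means every answer token appears in the
    retrieved context, 0.0 means none do (or the answer is empty).
    """
    answer_tokens = answer.lower().split()
    if not answer_tokens:
        return 0.0
    context_tokens = set(context.lower().split())
    supported = sum(1 for token in answer_tokens if token in context_tokens)
    return supported / len(answer_tokens)
```

Production evaluators typically rely on stronger signals, such as model-based judgments, but the principle is the same: each dimension becomes a number that can be compared across pipeline variants.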

MCP Architecture: Making RAG Truly "Composable and Scalable"

Unlike traditional RAG systems, which hard-code their processing logic, UltraRAG 2.1 is built on the Model Context Protocol (MCP) and decouples modules such as retrieval, reasoning, and generation into standardized "intelligent agents." Complex task flows can then be assembled declaratively in YAML. For example, a few lines of configuration are enough to implement a three-stage workflow: "first retrieve technical documents → then call a code generation model → finally use an evaluation module to verify the output."
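The three-stage flow described above can be sketched in Python as a list of step names resolved against a registry of independent components. Everything here is an assumption made for illustration: the step names, the `REGISTRY` dictionary, and the `run` function are hypothetical and do not reproduce UltraRAG's actual YAML schema or MCP interfaces. In an MCP-based design, each registry entry would be a tool exposed by a separate server rather than a local function.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping step names to callables.
REGISTRY: Dict[str, Callable[[dict], dict]] = {}


def step(name: str):
    """Decorator that registers a function under a step name."""
    def decorator(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        REGISTRY[name] = fn
        return fn
    return decorator


@step("retriever.search")
def search(state: dict) -> dict:
    # Toy retrieval: keep documents sharing at least one word with the query.
    query_words = set(state["query"].lower().split())
    hits = [d for d in state["corpus"] if query_words & set(d.lower().split())]
    return {**state, "docs": hits}


@step("generator.generate")
def generate(state: dict) -> dict:
    # Stand-in for a model call: concatenate the retrieved documents.
    return {**state, "answer": " ".join(state["docs"])}


@step("evaluator.check")
def check(state: dict) -> dict:
    # Toy verification: the answer must be non-empty.
    return {**state, "passed": bool(state["answer"])}


# What a parsed declarative pipeline might look like (hypothetical schema).
PIPELINE: List[str] = ["retriever.search", "generator.generate", "evaluator.check"]


def run(pipeline: List[str], state: dict) -> dict:
    """Execute each named step in order, threading the state through."""
    for name in pipeline:
        state = REGISTRY[name](state)
    return state
```

The point of the design is that reordering or swapping stages only changes the `PIPELINE` list, which is the part a configuration file would supply, while the components themselves stay untouched.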

AIbase believes that the release of UltraRAG 2.1 marks a shift in RAG technology from "tool assembly" to an "engineering paradigm." With multimodal understanding, knowledge construction, and performance evaluation unified in a lightweight, open-source, low-code framework, enterprises and researchers will be able to apply large model capabilities to real-world business scenarios more efficiently. This innovation, led by the Chinese community, is injecting new momentum into the global RAG ecosystem.
