Meta recently announced the establishment of its new Superintelligence Lab (Meta Superintelligence Labs, MSL) and published the lab's first major paper. The research speeds up large language model inference in Retrieval-Augmented Generation (RAG) tasks by more than 30 times.
The paper, titled "REFRAG: Rethinking RAG based Decoding," focuses on enabling large language models to extract key information quickly in RAG tasks, reducing computational load and response latency while maintaining accuracy. The founding of Meta Superintelligence Labs marks a deepening of the company's push into artificial intelligence, particularly in today's competitive environment, in which Zuckerberg urgently needs to accelerate Meta's AI development.

Meta Superintelligence Labs was officially established this June in Menlo Park, California, with the goal of developing superintelligence technologies. According to reports, Zuckerberg was dissatisfied with the performance of the Llama 4 models Meta released in April and even asked employees to work overtime on improvements. This prompted him to set up the new lab and recruit a large number of top researchers, including Scale AI founder Alexandr Wang.
Inside the lab, the team is divided into four groups, responsible for large language model development, fundamental AI research, product and technology implementation, and infrastructure support. The REFRAG framework is the lab's first step in optimizing large language model performance.
The core idea of REFRAG is to use a lightweight model to compress long retrieved context into compact summary representations, shrinking the input the decoder has to process. This both accelerates decoding and reduces computational load, improving the model's overall efficiency. In addition, the research team adopted a continual pre-training approach, training the model on reconstruction tasks so that it retains as much important detail as possible while compressing the information.
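The compression idea described above can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: the chunk size, embedding width, and the use of mean-pooling as a stand-in for the learned lightweight encoder are all assumptions made for clarity.

```python
import numpy as np

# Illustrative sketch of REFRAG-style context compression: each fixed-size
# chunk of retrieved context is reduced to a single embedding, so the
# decoder attends over far fewer positions.

CHUNK_SIZE = 16  # tokens per chunk -> a 16x compression ratio, the paper's headline setting
D_MODEL = 64     # embedding width (illustrative, not from the paper)

def embed_tokens(token_ids: np.ndarray) -> np.ndarray:
    """Stand-in token-embedding lookup (a fixed random table for the sketch)."""
    rng = np.random.default_rng(0)
    table = rng.standard_normal((1000, D_MODEL))
    return table[token_ids]

def compress_context(token_ids: np.ndarray) -> np.ndarray:
    """Lightweight 'encoder': each chunk of CHUNK_SIZE tokens -> one vector.
    Mean-pooling stands in here for the paper's learned compressor."""
    n_chunks = len(token_ids) // CHUNK_SIZE
    chunks = embed_tokens(token_ids[: n_chunks * CHUNK_SIZE]).reshape(
        n_chunks, CHUNK_SIZE, D_MODEL
    )
    return chunks.mean(axis=1)  # shape: (n_chunks, D_MODEL)

retrieved = np.arange(256) % 1000          # 256 tokens of retrieved context
compressed = compress_context(retrieved)   # 16 chunk embeddings

# The decoder now processes 16 positions instead of 256. Since self-attention
# cost grows quadratically with sequence length, a 16x shorter input means
# roughly 256x less attention compute over the retrieved context.
print(len(retrieved), "->", compressed.shape[0])  # prints: 256 -> 16
```

The reconstruction-based pre-training mentioned above would then train such an encoder so that the original chunk tokens can be recovered from each compressed vector, which is what lets the summaries preserve important details.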
In comprehensive testing, REFRAG performed well across a range of tasks, particularly in reducing latency and increasing throughput. Experimental results show that at a 16x compression ratio, REFRAG outperforms the previous state-of-the-art method CEPE in speed, with almost no loss in accuracy.
This achievement injects new momentum into Meta's work in artificial intelligence and demonstrates the company's forward thinking on improving the inference efficiency of large models.
Paper: https://arxiv.org/abs/2509.01092
Key Points:
🌟 Meta established a Superintelligence Lab to promote the development of AI technology.
⚡ The new paper "REFRAG" achieves a more-than-30-fold improvement in RAG inference speed while reducing computational load.
🚀 REFRAG framework enhances the efficiency and accuracy of large language models through information compression.
