Google DeepMind Launches MoR Architecture: Expected to Significantly Improve the Efficiency of Large Language Models

In the field of artificial intelligence, large language models (LLMs) have attracted widespread attention for their strong performance, but they incur significant computational and memory overhead when deployed. To address this challenge, Google DeepMind recently introduced a new architecture, Mixture-of-Recursions (MoR), which some consider a potential "killer" of traditional Transformer models.

The MoR architecture builds on recursive Transformers and aims to achieve parameter sharing and adaptive computation at the same time. By integrating dynamic token-level routing into an efficient recursive Transformer, MoR can deliver performance comparable to larger models without increasing model cost. A lightweight router assigns each token its own recursion depth, dynamically deciding how many rounds of "thinking" that token needs. Computation is thus concentrated on the tokens that need it, which improves processing efficiency.
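The routing idea can be illustrated with a toy sketch in Python. This is not the paper's implementation: the names shared_block, route_depth, and mor_forward are hypothetical, the router is assumed to be a single linear scoring layer, and attention between tokens is omitted entirely. It only shows one set of weights being reused at every recursion step, with each token receiving its own number of steps.

```python
import math
import random


def shared_block(vec, weight):
    """One recursion step. The same weight matrix is reused at every depth."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weight]


def route_depth(vec, router, max_depth):
    """Hypothetical router: one linear score per depth, pick the best one."""
    scores = [sum(r * x for r, x in zip(row, vec)) for row in router[:max_depth]]
    return 1 + max(range(len(scores)), key=scores.__getitem__)


def mor_forward(tokens, weight, router, max_depth):
    """Apply the shared block to each token as many times as the router decides."""
    outputs, depths = [], []
    for vec in tokens:
        depth = route_depth(vec, router, max_depth)
        for _ in range(depth):
            vec = shared_block(vec, weight)
        outputs.append(vec)
        depths.append(depth)
    return outputs, depths


# Illustrative sizes and random values only (assumptions, not from the paper).
random.seed(0)
dim, max_depth = 4, 3
weight = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
router = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(max_depth)]
tokens = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]

outputs, depths = mor_forward(tokens, weight, router, max_depth)
print("recursion depth per token:", depths)
print("block applications used:", sum(depths), "of", len(tokens) * max_depth)
```

The total number of block applications is the sum of the assigned depths, so tokens routed to a shallow depth leave compute available for the others.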

In terms of implementation, MoR uses a caching mechanism that selectively stores and retrieves key-value pairs according to each token's recursion depth. This reduces memory bandwidth pressure and improves inference throughput. Together, parameter sharing, computation routing, and recursion-level caching reduce the number of parameters and lower computational cost.
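A rough way to picture depth-aware caching is sketched below in Python. It assumes a simplified scheme in which a token keeps a key-value entry at a recursion level only if it is still being processed at that level. The function names and the example depths are hypothetical, and real key-value tensors are replaced by token indices.

```python
def build_recursion_cache(depths, max_depth):
    """Map each recursion level to the indices of tokens still active there.

    A token with assigned depth d is assumed to be active at levels 1..d.
    """
    return {
        level: [i for i, d in enumerate(depths) if d >= level]
        for level in range(1, max_depth + 1)
    }


def cache_entries(cache):
    """Count how many (level, token) key-value entries the cache holds."""
    return sum(len(indices) for indices in cache.values())


# Hypothetical per-token recursion depths for a five-token sequence.
depths = [1, 3, 2, 1, 3]
max_depth = 3

cache = build_recursion_cache(depths, max_depth)
selective = cache_entries(cache)          # entries kept with selective caching
full = len(depths) * max_depth            # entries if every token were cached at every level

print(cache)
print(selective, "entries instead of", full)
```

In this toy case the selective cache holds 10 entries where caching every token at every level would hold 15, which is the kind of saving the article describes as easing memory bandwidth pressure.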

In experiments, MoR outperformed both the original Transformer and the recursive Transformer under the same computational budget while using fewer parameters. Compared with baseline models, MoR also achieved higher average accuracy in few-shot learning despite having nearly 50% fewer parameters. This result is attributed to its efficient computing strategy, which allows MoR to process more training tokens.
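To see how parameter sharing alone can roughly halve a model's layer parameters, consider the small Python calculation below. The layer count, per-layer size, and recursion count are illustrative assumptions rather than the paper's configuration, and the function name is hypothetical.

```python
def unique_layer_params(total_depth, params_per_layer, recursions):
    """Unique layer parameters when a stack of layers is reused `recursions` times.

    The effective depth stays at total_depth, but only
    total_depth / recursions distinct layers have to be stored.
    """
    if total_depth % recursions != 0:
        raise ValueError("total_depth must be divisible by recursions")
    return (total_depth // recursions) * params_per_layer


# Illustrative numbers only (assumptions, not taken from the paper).
baseline = unique_layer_params(total_depth=12, params_per_layer=1_000_000, recursions=1)
shared = unique_layer_params(total_depth=12, params_per_layer=1_000_000, recursions=2)

print(baseline, shared, shared / baseline)
```

With two recursions over a shared stack, the number of unique layer parameters is half that of the unshared baseline, while the number of layer applications per token is at most the same.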

Additionally, the researchers found that MoR consistently outperformed the recursive baseline across different computational budgets. At model sizes above 360M parameters in particular, MoR not only matches the original Transformer but often surpasses it at low to medium budgets. MoR is therefore regarded as a scalable and efficient alternative suitable for large-scale pre-training and deployment.

As AI technology continues to develop, the MoR architecture offers a new approach to improving the efficiency of large language models and marks a notable step forward in AI research.

Paper link: alphaxiv.org/abs/2507.10524

Key points:
🌟 The MoR architecture improves the efficiency of large language models through dynamic allocation of computing resources and a depth-aware caching mechanism.
📉 Under the same computational budget, MoR outperforms the traditional Transformer while using fewer parameters.
🚀 MoR is considered a new breakthrough in AI research, suitable for large-scale pre-training and deployment.

Study: Global AI Chipset Market to Exceed $700B with 31.8% CAGR

According to TMR Research, the global artificial intelligence chipset market is expected to exceed $700 billion, growing at a compound annual growth rate of 31.8% from 2022 to 2031. The article covers the development trends, application areas, and key players in the AI chipset market.

IBM Research: How AI & Automation Protect Businesses from Data Breaches

IBM's report provides evidence that artificial intelligence, automation, and threat intelligence can address data breaches throughout their lifecycle and reduce costs. The research found that integrating AI and automation into security operations teams can shorten the data breach lifecycle by 33% and cut costs by 33.6%. However, only 28% of enterprises currently apply AI and automation widely, and many still rely on legacy systems that attackers can easily bypass. The article emphasizes the effectiveness of AI and automation in improving cybersecurity and calls on enterprises to adopt these technologies widely to protect their data.

Google's AGI Robot Breakthrough: 54-Member Team's 7-Month Work, High Generalization and Reasoning

The robotics research team at Google DeepMind recently released a robotics project called RT-2. The project took 7 months to develop and is trained with a large model. RT-2 has capabilities such as symbol understanding, reasoning, and human recognition, and can reason about and complete tasks based on human instructions. By combining the large model with the robot's operational capabilities, RT-2 can accomplish tasks that involve logical leaps, such as going from 'extinct animals' to 'plastic dinosaurs'. The project performed well across various sub-category tests, with performance up to three times that of the previous generation of robot models. The result demonstrates the potential of large models in robotics research and is expected to drive the future development of robots.

RWKV: Small Team Aims to Be Android of AI Era with Big Model

Meta Intelligence OS is a startup founded by Peng Bo. It has developed a series of large models based on the open-source RWKV model and aims to become the Android of the large-model era. RWKV offers strong performance and low cost in inference tasks, which has attracted customers in industries such as finance, law, and smart hardware. The company's business model is model customization based on private data and internal AI Agent development. It hopes to solve the problems of API call latency and data security by deploying large models on terminal devices. RWKV versions are currently available for Windows, Mac, and Linux computers, and Android and iOS versions are in development. Meta Intelligence OS is raising funds and collaborating with chip companies and computing power platforms to establish benchmark customers. Luo Xuan said that the decisive battlefield for large models is hardware, and that both terminal devices and the cloud require dedicated chips.

ChatGPT Adds Audio Transcription Feature! A Powerful Tool to Easily Record Meeting Highlights

OpenAI launches ChatGPT audio transcription for macOS users, supporting 120-minute recordings with auto-generated transcripts and summaries. Available only to GPT-4o subscribers, it records system audio and microphone input, deleting recordings post-transcription unless users opt in for model training. Enterprise/education users are excluded by default. Not available on Windows/Android/web yet...
