Breaking News! MiniMax M3 to Be Released Soon: Sparse Attention Architecture Breakthrough Boosts Efficiency for Million-Token Contexts

AIbase Report: Chinese AI unicorn MiniMax is about to launch its new large model, M3. Skyler Miao, the AI engineering lead at MiniMax, recently posted a teaser on social media saying "Something BIG is coming!", which has attracted widespread attention in the industry.

M3 Core Architecture Innovation: Sparse Attention Mechanism

According to the available information, M3 adopts a new sparse attention architecture that combines fast indexing in an Index Branch with precise computation in a Sparse Branch, targeting the computational bottleneck in ultra-long-context scenarios.
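MiniMax has not published M3's design, so the following is only a minimal sketch of what a generic two-stage "index, then attend sparsely" scheme could look like. The function name `sparse_attention`, the separate low-dimensional `index_queries` and `index_keys`, and the `top_k` budget are all assumptions made for illustration, not details from MiniMax.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sparse_attention(queries, keys, values, index_queries, index_keys, top_k):
    """Hypothetical two-stage attention (single head, non-causal).

    Index branch: rank all keys with cheap low-dimensional index vectors.
    Sparse branch: run exact softmax attention over the top_k ranked keys only.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for q, iq in zip(queries, index_queries):
        # Index branch: one cheap score per key.
        index_scores = [dot(iq, ik) for ik in index_keys]
        ranked = sorted(range(len(keys)), key=lambda j: index_scores[j], reverse=True)
        selected = ranked[:top_k]
        # Sparse branch: exact scaled dot-product attention on the selected keys.
        logits = [dot(q, keys[j]) * scale for j in selected]
        peak = max(logits)
        weights = [math.exp(x - peak) for x in logits]
        total = sum(weights)
        out = [0.0] * len(values[0])
        for w, j in zip(weights, selected):
            for c, val in enumerate(values[j]):
                out[c] += (w / total) * val
        outputs.append(out)
    return outputs
```

In this sketch the index branch still visits every key, but only with short index vectors; the expensive full-dimension attention touches just `top_k` keys per query. When `top_k` equals the number of keys, the result matches ordinary dense attention.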

In a traditional Transformer, the cost of attention grows quadratically with context length, which becomes prohibitive at millions of tokens. M3's sparse design is said to significantly reduce this cost while maintaining high performance, providing strong support for applications such as long text understanding, long conversations, and multi-document analysis.
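To put the quadratic growth in perspective, here is a back-of-the-envelope count of query-key pairs scored by dense attention versus a sparse scheme that keeps a fixed number of keys per query. The budget of 2,048 keys per query is a hypothetical figure chosen for illustration, not a disclosed M3 parameter, and the count ignores whatever the indexing step itself costs.

```python
def attention_pair_counts(context_len, keys_per_query):
    """Return (dense, sparse) counts of query-key pairs scored per attention layer."""
    dense = context_len * context_len
    sparse = context_len * min(keys_per_query, context_len)
    return dense, sparse

# 1M-token context with a hypothetical budget of 2,048 keys per query.
dense, sparse = attention_pair_counts(1_000_000, 2_048)
print(dense // sparse)  # prints 488
```

Under these assumptions, dense attention scores roughly 488 times as many pairs, and the gap widens linearly as the context grows.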

Test Performance Significantly Outperforms M2

Compared to its predecessor M2 (which supports a 1M-token context), M3 is reported to achieve breakthrough improvements in key metrics:

Speed in the Prefill phase increased by 9.7 times
Speed in the Decoding phase increased by 15.6 times

This means that in practical deployment, M3 could process ultra-long contexts at a far lower computational cost, significantly reducing inference costs and opening up new possibilities for more complex AI applications.
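As a rough reading of the reported figures, assuming each "times" number is the ratio of M3's speed to M2's on the same workload (the source does not state the test conditions), the time M3 needs relative to M2 is simply the reciprocal of the speedup:

```python
def time_fraction(speedup):
    """Fraction of the baseline time needed after a given speedup ratio."""
    return 1.0 / speedup

for phase, speedup in [("prefill", 9.7), ("decoding", 15.6)]:
    print(f"{phase}: {time_fraction(speedup):.1%} of M2's time")
# prints:
# prefill: 10.3% of M2's time
# decoding: 6.4% of M2's time
```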

Industry Implications: A New Benchmark for Efficiency in the Era of Long Contexts

MiniMax's teaser for M3 once again highlights the competitiveness of Chinese AI companies in architectural innovation. Breakthroughs in technologies such as sparse attention are expected to shift the focus of large models from "competition in parameter scale" to "competition in efficiency and practicality," bringing more affordable and efficient experiences to both enterprise applications and consumer use.

MiniMax has not yet announced the specific release date or full parameter scale of M3. However, based on the engineer's teaser and the performance data, the model is expected to become a strong competitor in long-context processing. AIbase will continue to monitor developments around MiniMax M3 and bring you the latest updates in a timely manner.


