Rapid Rollout! JD Cloud Launches MiniMax M3 Large Model, Achieving a Leap in Inference Efficiency

Today, the MiniMax M3 model was officially released. JD Cloud's JoyBuilder model development platform completed its integration at the same time and opened the service to users from day one.

The core of this iteration is a significant improvement in inference performance. For deployment, the platform uses its self-developed inference framework and combines several cutting-edge inference optimization techniques, including prefill-decode (PD) disaggregated deployment, KV cache reuse, and speculative sampling.

With these underlying techniques working together, the newly integrated model achieves higher inference throughput in practice, and overall response efficiency has improved significantly. This gives developers a smoother calling experience and may further accelerate the adoption of cutting-edge large models in specific business scenarios.
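Of the techniques named above, speculative sampling is the easiest to illustrate in a few lines. The sketch below is not JD Cloud's inference framework or MiniMax's implementation; it is a minimal toy version of the standard accept/reject rule, assuming a small draft model with token distribution `q_draft` and a large target model with distribution `p_target`. The function names `speculative_accept` and `sample_one_token` and the example probabilities are hypothetical and chosen only for illustration.

```python
import random


def speculative_accept(p_target, q_draft, draft_token, rng):
    """Decide the emitted token for one position (toy sketch, hypothetical names).

    p_target: token probabilities from the large target model
    q_draft:  token probabilities from the small draft model
    draft_token: index proposed by the draft model (sampled from q_draft)
    Returns (token, accepted_flag).
    """
    p = p_target[draft_token]
    q = q_draft[draft_token]
    # Accept the draft token with probability min(1, p / q).
    if rng.random() < min(1.0, p / q):
        return draft_token, True
    # On rejection, resample from the residual max(0, p - q), renormalized.
    residual = [max(0.0, pt - qd) for pt, qd in zip(p_target, q_draft)]
    total = sum(residual)
    weights = [r / total for r in residual]
    token = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return token, False


def sample_one_token(p_target, q_draft, rng):
    """Draw a proposal from the draft model, then apply the accept/reject rule."""
    draft_token = rng.choices(range(len(q_draft)), weights=q_draft, k=1)[0]
    return speculative_accept(p_target, q_draft, draft_token, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    p = [0.6, 0.3, 0.1]  # assumed target-model distribution
    q = [0.2, 0.5, 0.3]  # assumed draft-model distribution
    n = 20000
    counts = [0, 0, 0]
    for _ in range(n):
        token, _accepted = sample_one_token(p, q, rng)
        counts[token] += 1
    # Empirical frequencies should be close to the target distribution p.
    print([round(c / n, 3) for c in counts])
```

The point of the rule is that the emitted tokens follow the target model's distribution even though a cheaper draft model does the proposing, so throughput can improve without changing what the large model would have produced. In a real serving stack the draft model proposes several tokens at once and the target model verifies them in a single forward pass.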

China's Multimodal Large Model Reaches a Milestone: MiniMax M3 Is Officially Open-Sourced and Response Speed Doubles

Xiyu Technology has open-sourced MiniMax M3, its native multimodal flagship model with 428B total parameters and 23B activated parameters, described as the first of its kind in the industry. The company had previously released the weights and a paper on its sparse attention mechanism, drawing widespread attention. The model is reported to rank first in comprehensive performance among open-source models...

Coding Outperforms GPT-5.5! MiniMax Launches New M3 Large Model with Three Capabilities Unmatched Among Open-Source Models Worldwide

Xiyu Technology has released its next-generation large language model, MiniMax M3, featuring top-tier programming capabilities, a 1-million-token ultra-long context window, and native multimodal interaction. It is described as the first model in China to combine these three capabilities and the only open-source model worldwide to do so, with outstanding results on multiple authoritative benchmarks...

Exceeding GPT-5.5! Domestic AI Large Model MiniMax M3 Officially Released

MiniMax M3, a new open-source model from Xiyu Technology, features cutting-edge programming capabilities, a 1M-token ultra-long context, and native multimodal abilities (image and video input, and desktop operation), making it the first domestic model to integrate these three core features. It leads on multiple metrics on the SWE-Bench programming benchmark...

MiniMax Launches M3 Large Model: Pioneering the MSA Architecture and Supporting 1M Context, Fully Open-Source to Compete with Overseas Flagships

MiniMax (Xiyu Technology) launched its next-generation cutting-edge large model M3 on June 1, 2026. It is described as the first domestic open-source model to integrate top-tier programming, a 1M-token ultra-long context, and native multimodal capabilities, and it competes with overseas closed-source flagships. To address the context expansion bottleneck in complex agent tasks, the company independently developed a sparse attention architecture (MSA) for M3, which achieves more precise KV block partitioning and operator-level optimization. Its computation is reported to be more than four times faster than comparable open-source solutions, with a significant reduction in per-token computation at 1M context.

