Alibaba officially released the flagship reasoning model of the Qwen series - Qwen3-Max-Thinking. The model has achieved a leap in core dimensions such as complex reasoning, factual knowledge, and agent capabilities, and is claimed to have surpassed the trillion parameter mark. In multiple authoritative benchmark tests, its overall performance can now compete with globally top models such as GPT-5.2-Thinking, Claude-Opus-4.5, and Gemini3Pro.

image.png

Qwen3-Max-Thinking adopts a more large-scale reinforcement learning training and introduces two core innovative technologies: "Adaptive Tool Calling" and "Test-Time Expansion." It not only can autonomously call search engines, memory, and code interpreters while thinking, like human experts, but also significantly reduces model hallucination, making it more intelligent and smooth when handling complex tasks in the real world.

Currently, Qwen3-Max-Thinking has been officially launched on Qwen Chat for user interaction experience, and its API (model name: qwen3-max-2026-01-23) is also open to developers simultaneously.

Key Points:

  • 🚀 Performance Competes with International Top Models: In 19 authoritative tests, the performance of Qwen3-Max-Thinking is comparable to GPT-5.2 and Claude-4.5, placing it at an international leading level.

  • 🤖 Native Agent Capabilities: The model has adaptive tool calling capabilities, and can autonomously select search engines or code interpreters based on task requirements, achieving "thinking while using."

  • 🧠 Trillion-Parameter Reasoning: Through larger-scale reinforcement learning and test-time expansion (Test-Time Scaling) technology, it significantly improves performance in high-difficulty areas such as scientific knowledge and mathematical reasoning.