Tencent's self-developed large model, Hunyuan 2.0 (Tencent HY2.0), has been officially released, and DeepSeek V3.2 is being gradually integrated into Tencent's ecosystem. Both models are already live in Tencent's AI-native applications such as Yuanbao and ima, and Tencent Cloud has simultaneously opened the corresponding model APIs and platform services.

The newly released Tencent HY2.0 adopts a Mixture of Experts (MoE) architecture with a total parameter count of 406B (32B activated per token), supports a 256K ultra-long context window, and delivers reasoning capability and efficiency at the industry's leading level.
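The gap between total (406B) and activated (32B) parameters comes from sparse routing: a gating network scores all experts for each token, but only the top-k experts actually run. The following is a minimal NumPy sketch of that routing idea, not Tencent's implementation; all function and variable names here are illustrative assumptions.

```python
import numpy as np

def moe_forward(x, experts_w, gate_w, top_k=2):
    """Illustrative MoE routing sketch: the gate scores every expert per
    token, but only the top-k experts are evaluated, so activated
    parameters are a small fraction of total parameters."""
    logits = x @ gate_w                               # (tokens, num_experts) gating scores
    top = np.argsort(logits, axis=-1)[:, -top_k:]     # top-k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, top[t]]
        weights = np.exp(sel - sel.max())
        weights /= weights.sum()                      # softmax over selected experts only
        for w, e in zip(weights, top[t]):
            out[t] += w * (x[t] @ experts_w[e])       # weighted sum of expert outputs
    return out

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(4, d))                           # 4 tokens
experts = rng.normal(size=(16, d, d))                 # 16 experts; only 2 run per token
gate = rng.normal(size=(d, 16))
y = moe_forward(x, experts, gate)
print(y.shape)  # (4, 8)
```

With 16 experts and top-2 routing, each token touches only 1/8 of the expert weights, which is the same mechanism that lets a 406B-parameter model activate only 32B per forward pass.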

Tencent Hunyuan 2.0 Officially Launched: Industry-Leading Reasoning Capabilities and Efficiency

Compared to the previous version (Hunyuan-T1-20250822), HY2.0Think has made significant improvements in pre-training data and reinforcement learning strategies. In complex reasoning scenarios such as mathematics, science, code, and instruction following, its overall performance remains in the top tier domestically, and its generalization ability has also significantly improved.

In mathematical and scientific reasoning, HY2.0Think applies Large Rollout reinforcement learning on high-quality data, substantially strengthening its reasoning capabilities. It achieved outstanding results on authoritative benchmarks such as the International Mathematical Olympiad (IMO-AnswerBench) and the Harvard-MIT Mathematics Tournament (HMMT2025). With the improved pre-training data, the model has also made significant progress on knowledge-intensive tasks such as Humanity's Last Exam (HLE) and on generalization benchmarks such as ARC-AGI.


In instruction following and long-context multi-turn capabilities, HY2.0Think uses importance-sampling correction to alleviate the mismatch between the training and inference distributions, enabling efficient and stable reinforcement learning over long windows. In addition, reinforcement learning over diverse verifiable task sandboxes, with rewards based on scoring criteria, significantly improves performance on instruction-following and multi-turn benchmarks such as MultiChallenge.
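Importance-sampling correction in RL typically means reweighting samples generated by the older inference-time policy by the probability ratio of the current policy, often with clipping for stability (as in PPO). The article does not specify Tencent's exact formulation, so the following is a generic, hypothetical sketch of that standard technique:

```python
import numpy as np

def is_corrected_pg_loss(logp_new, logp_old, advantages, clip=0.2):
    """Hypothetical sketch of an importance-sampling-corrected policy
    gradient loss (PPO-style). Rollouts were produced under the old
    policy, so the ratio pi_new/pi_old reweights each sample for the
    current training policy; clipping the ratio keeps long-window RL
    training stable."""
    ratio = np.exp(logp_new - logp_old)               # per-sample importance weights
    clipped = np.clip(ratio, 1.0 - clip, 1.0 + clip)  # bound the correction
    # pessimistic (min) surrogate objective, negated to form a loss
    return -np.mean(np.minimum(ratio * advantages, clipped * advantages))

# When training and inference policies agree, the ratio is 1 and the
# loss reduces to the plain advantage-weighted objective.
adv = np.array([1.0, -1.0])
zeros = np.zeros(2)
print(is_corrected_pg_loss(zeros, zeros, adv))  # 0.0 (advantages cancel)
```

The clipping bound is what makes off-policy reuse of long rollouts safe: a sample whose policy ratio has drifted far from 1 cannot dominate the gradient.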

In code and agent capabilities, Tencent built a scalable verifiable environment and high-quality synthetic data, greatly strengthening the model's practical abilities in agentic coding and complex tool-calling scenarios. On agent benchmarks targeting real-world application scenarios, such as SWE-bench Verified and Tau2-Bench, the model's performance has taken a marked leap.