Step-Audio-R1.1 by StepFun Rises to the Top of the Global Charts

Published in Latest AI News

Time: Jan 15, 2026

Read: 4 minutes

StepFun announced that its open-source native speech reasoning model, Step-Audio-R1.1, has taken first place on a widely followed AI model leaderboard. The ranking, the Speech Reasoning leaderboard published by Artificial Analysis, evaluates speech models on audio processing and logical reasoning across several dimensions, including accuracy and response time.

Step-Audio-R1.1 scored 96.4% accuracy, surpassing leading closed-source models such as Grok, Gemini, and GPT-Realtime and setting a new record on the ranking. It also performed strongly in the combined evaluation of performance and speed, which has drawn attention across the industry.

The model combines deep speech reasoning with real-time response: it understands speech end-to-end without additional delay, a characteristic described as "thinking like a human when hearing a conversation." The latest version improves both real-time dialogue and complex speech reasoning. The complete real-time speech API is planned to launch in February next year. For now, users can try the core functions of R1.1 through the open chat mode, which supports streaming inference so that the model can think and speak at the same time.

At the launch event, the company demonstrated the model in practical scenarios, such as analyzing the sounds of a cat fight and understanding Korean song lyrics. These examples illustrate the analytical and speech comprehension abilities of Step-Audio-R1.1 in complex audio environments.

The weights of Step-Audio-R1.1 have been uploaded to Hugging Face, where developers and researchers can download and use them freely. Users can also try the model at the company's Open Platform Experience Center. For anyone interested in AI and speech models, both are ways to evaluate the model firsthand.

Hugging Face: https://huggingface.co/stepfun-ai/Step-Audio-R1.1
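
For readers who want to fetch the weights programmatically, the following is a minimal sketch rather than an official procedure. It assumes the `huggingface_hub` Python package is installed; the helper names `repo_id_from_url` and `download_weights` are hypothetical and chosen only for illustration. The repository ID is taken from the link above.

```python
from typing import Optional
from urllib.parse import urlparse

# Model page linked in the article.
MODEL_URL = "https://huggingface.co/stepfun-ai/Step-Audio-R1.1"


def repo_id_from_url(url: str) -> str:
    """Extract the 'owner/name' repository ID from a Hugging Face model URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model URL: {url}")
    return "/".join(parts[:2])


def download_weights(url: str = MODEL_URL, local_dir: Optional[str] = None) -> str:
    """Download every file in the repository and return the local folder path.

    Hypothetical helper: the dependency is imported lazily so that the URL
    parsing above still works when huggingface_hub is not installed.
    """
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return snapshot_download(repo_id=repo_id_from_url(url), local_dir=local_dir)


# Example call (starts a large download, so it is left commented out):
# path = download_weights(local_dir="./step-audio-r1.1")
```

Calling `download_weights()` without `local_dir` would store the files in the default Hugging Face cache. Running the model itself is not covered here and would need whatever inference code the model card specifies.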

Key Points:
🌟 Step-Audio-R1.1 ranks first on the Artificial Analysis Speech Reasoning ranking with 96.4% accuracy.
📈 The model has deep speech reasoning and real-time response capabilities, supporting streaming inference.
💻 Users can freely download the model from Hugging Face and try it out on the open platform.
