Stepfun has officially released its latest open-source foundation model, Step3.5Flash. Designed specifically for agent scenarios, the model combines strong reasoning with ultra-fast responses, aiming to give developers a smarter, more stable, and more cost-effective "agent brain".


As a highly targeted lightweight model, Step3.5Flash has achieved breakthroughs in multiple dimensions:

  • Ultra-fast inference: decoding speeds of up to 350 TPS (tokens per second), with code-related tasks benefiting the most.

  • Performance on par with closed-source models: in core agent scenarios and mathematical reasoning tasks, it matches mainstream closed-source large models.

  • Stability on long-chain tasks: it stays reliable on complex tasks with long logical chains and efficiently handles an ultra-long 256K-token context.

Technical Architecture: Balancing Efficiency and Depth

Step3.5Flash adopts an advanced sparse MoE (Mixture of Experts) architecture with a total parameter count of 196 billion, of which only about 11 billion parameters are activated per token. To further improve throughput, the model introduces MTP-3 (multi-token prediction), letting it predict 3 tokens at a time and doubling decoding efficiency. In addition, by combining sliding-window attention with global attention, the model can accurately capture the key points in long texts while significantly reducing computational cost.
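
The article does not disclose how Step3.5Flash actually wires the sliding window and global attention together, so the snippet below is only a minimal NumPy sketch of one common way to mix the two: each query sees a local causal window plus a handful of always-visible "global" tokens. The window size, number of global tokens, and function name are illustrative assumptions, not the model's real configuration.

```python
import numpy as np

def hybrid_attention_mask(seq_len: int, window: int, n_global: int) -> np.ndarray:
    """Causal mask combining a local sliding window with a few global tokens.
    Entry [i, j] is True if query position i may attend to key position j.
    All sizes here are illustrative, not Step3.5Flash's actual settings."""
    i = np.arange(seq_len)[:, None]   # query positions
    j = np.arange(seq_len)[None, :]   # key positions
    causal = j <= i                   # never attend to future tokens
    local = (i - j) < window          # keys inside the sliding window
    global_keys = j < n_global        # first n_global tokens visible to every query
    return causal & (local | global_keys)

if __name__ == "__main__":
    # Tiny example: 10 tokens, window of 3, 2 global tokens.
    print(hybrid_attention_mask(seq_len=10, window=3, n_global=2).astype(int))
```

The appeal of this kind of hybrid is visible even in the toy mask: most positions only attend to a fixed-size neighborhood, so cost grows roughly linearly with sequence length instead of quadratically, while the global tokens keep long-range information reachable.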

Real-world Testing Across Scenarios: From Code to Edge-Cloud Collaboration

In practical application demonstrations, Step3.5Flash has shown diverse capabilities:

  • Smart programming: from a plain text description alone, it can write a complete, high-performance visualization platform built on the WebGL 2.0 engine.

  • Complex computation: without calling external tools, it can quickly work through demanding calculations such as summing arithmetic sequences and accumulating factorials (see the reference sketch after this list).

  • Edge-cloud collaboration: acting as a "cloud-based brain", it can break a user's vague request (such as comparing prices across platforms) into concrete search and scraping sub-tasks, greatly reducing the burden on the local execution side and keeping results reliable.
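
The specific problems from the demo are not published, so the short reference computation below only illustrates the two task types named above with made-up numbers; it is the kind of check one might run to verify the model's in-context arithmetic, not a reproduction of the demo itself.

```python
import math

# Hypothetical instances of the two task types; the actual demo problems
# were not published, so these numbers are illustrative only.

# Arithmetic sequence summation: a_1 = 3, common difference d = 7, n = 1000 terms.
a1, d, n = 3, 7, 1000
seq_sum = n * (2 * a1 + (n - 1) * d) // 2       # S_n = n(2*a_1 + (n-1)*d) / 2
print("arithmetic sequence sum:", seq_sum)       # 3,499,500

# Factorial accumulation: sum of k! for k = 1..20.
fact_sum = sum(math.factorial(k) for k in range(1, 21))
print("sum of factorials 1!..20!:", fact_sum)
```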

Currently, Step3.5Flash is fully available on mainstream platforms, including GitHub, Hugging Face, and OpenRouter. To lower the barrier to local deployment, Stepfun has specifically optimized the model's performance on personal workstations (such as NVIDIA DGX and the Apple M4 Max). The company has also announced that training of the Step4 model has begun and has invited developers worldwide to help define the next generation of agent foundation models.

  • OpenRouter offers free access, so you can upgrade your agent at zero cost (a minimal API-call sketch follows this list): https://openrouter.ai/stepfun/step-3.5-flash

  • Download from GitHub for quick deployment and build your own agent: https://github.com/stepfun-ai/Step-3.5-Flash/tree/main

  • Get model weights on HuggingFace: https://huggingface.co/stepfun-ai/Step-3.5-Flash
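
OpenRouter exposes an OpenAI-compatible chat completions endpoint, so trying the model from Python can look roughly like the sketch below. The model slug is taken from the OpenRouter link above; the API key and the prompt are placeholders you would supply yourself.

```python
from openai import OpenAI  # OpenRouter speaks the OpenAI-compatible API

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",  # placeholder: use your own key
)

response = client.chat.completions.create(
    model="stepfun/step-3.5-flash",  # slug from the OpenRouter page linked above
    messages=[
        {"role": "user",
         "content": "Sum the arithmetic sequence 3, 10, 17, ... over 1000 terms."},
    ],
)
print(response.choices[0].message.content)
```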