Step 3.7 Flash Launches: The New Era of Agent Efficiency Has Truly Arrived

Step 3.7 Flash officially launched today. This open-source model targets the core pain points of the agent era: efficiency, reliability, and multimodal execution. Released with open weights under the Apache 2.0 license, it has quickly attracted industry attention.


Leading Benchmark Performance, Strong Practical Capabilities

Step 3.7 Flash posted strong results in several key evaluations:

First place in ClawEval-1.1 (67.1 points)
First place in SimpleVQA Search (79.2 points)
Second place in SWE-PRO (56.3 points)
Scored as high as 95.3 in V* Python

These results point to leading performance in complex scenarios such as agent tasks, code generation, and visual search.

Core Parameters: A Balanced Achievement of Speed, Cost, and Capability

Designed specifically for agentic, coding, search, and multimodal workflows, Step 3.7 Flash makes significant gains in speed and efficiency:

Inference Speed: Up to 400 TPS (tokens per second)
Architecture: 198B-parameter sparse MoE (mixture of experts), with about 11B active parameters
Context Length: Up to 256K tokens
Reasoning Levels: Three levels of reasoning offered

The model keeps performance high while significantly cutting real-world deployment costs, giving developers an efficient option.
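To put the headline figures in perspective, the short calculation below works out what share of the weights is active per token and how long a reply would take at the quoted peak speed. It is a back-of-the-envelope sketch, not a benchmark: the 2,000-token reply length is an assumed example, and sustained throughput in practice is likely to be lower than the "up to" figure.

```python
# Figures quoted in the article.
TOTAL_PARAMS_B = 198   # total parameters, in billions
ACTIVE_PARAMS_B = 11   # approximate active parameters per token, in billions
PEAK_TPS = 400         # quoted peak decoding speed, tokens per second


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the weights that take part in computing a single token."""
    return active_b / total_b


def generation_seconds(num_tokens: int, tps: float) -> float:
    """Time to decode num_tokens at a constant throughput (an idealization)."""
    return num_tokens / tps


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    print(f"active share per token: {share:.1%}")
    # The 2,000-token reply length is an assumed example, not a figure from the article.
    print(f"2,000-token reply at peak speed: {generation_seconds(2000, PEAK_TPS):.1f} s")
```

Roughly one weight in eighteen is used for any given token, which is where a sparse MoE design gets its per-token compute savings.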

Multimodal Understanding + Reliable Execution, Truly "See and Do"

The biggest highlight of Step 3.7 Flash is its strong perception-action loop. It can understand user interfaces, charts, documents, and images, and then autonomously write code or call tools to act on what it sees.

Its enhanced Web+ visual search function can draw on more information sources and supports in-depth follow-up queries. Tool-call reliability has also improved significantly: the model achieves a success rate of over 98% across all difficulty levels of τ²-bench, reducing common problems such as goal drift and failed tool calls.
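A perception-action loop of this kind can be pictured as a cycle in which the model receives an observation, chooses a tool, and gets the tool's result back as the next observation. The sketch below is purely illustrative and does not reflect Step 3.7 Flash's actual interface: `fake_model`, the `read_chart` and `run_code` tools, and the JSON action format are all hypothetical stand-ins.

```python
import json
from typing import Callable

# Hypothetical tool registry: names and behavior are illustrative only.
TOOLS: dict[str, Callable[..., str]] = {
    "read_chart": lambda path: f"chart at {path}: revenue rises each quarter",
    "run_code": lambda source: f"executed {len(source)} characters of code",
}


def fake_model(observation: str) -> str:
    """Stand-in for the model: maps the latest observation to a JSON action."""
    if observation.startswith("task:"):
        return json.dumps({"tool": "read_chart", "args": {"path": "q3.png"}})
    if observation.startswith("chart"):
        return json.dumps({"tool": "run_code", "args": {"source": "print('plot')"}})
    return json.dumps({"tool": "finish", "args": {}})


def run_agent(task: str, model=fake_model, max_steps: int = 5) -> list[str]:
    """Alternate between model decisions and tool executions until 'finish'."""
    observation = f"task: {task}"
    trace = []
    for _ in range(max_steps):
        action = json.loads(model(observation))
        name, args = action["tool"], action["args"]
        if name == "finish":
            break
        if name not in TOOLS:
            # An unknown tool is reported back to the model instead of crashing the loop.
            observation = f"error: unknown tool {name}"
        else:
            observation = TOOLS[name](**args)
        trace.append(observation)
    return trace


if __name__ == "__main__":
    for step in run_agent("summarize the Q3 chart"):
        print(step)
```

The `max_steps` cap and the error observation are one simple way to contain the failure modes mentioned above: a failed call is fed back to the model, and a run that drifts cannot loop forever.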

Ecosystem Compatibility and Friendly Local Deployment

The model is compatible with mainstream agent frameworks such as Claude Code, KiloCode, Hermes Agent, and OpenClaw, as well as protocols like MCP. It can also run locally on hardware such as the Mac Studio (M4 Max), DGX Spark, and AMD Ryzen AI Max+ 395, which makes it convenient for local deployment and privacy-sensitive scenarios.
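Whether a model of this size fits on a single workstation depends mostly on how its weights are quantized, because every expert of an MoE model must be resident in memory even though only about 11B parameters are active per token. The estimate below is a rough sketch under stated assumptions: the article does not say which precisions are used for local deployment, so the bit widths and the 128 GB memory budget are illustrative, and the KV cache and activations are ignored.

```python
TOTAL_PARAMS = 198e9  # total parameter count quoted in the article


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def fits(num_params: float, bits_per_weight: float, budget_gb: float) -> bool:
    """True if the weights alone fit within the given memory budget."""
    return weight_memory_gb(num_params, bits_per_weight) <= budget_gb


if __name__ == "__main__":
    BUDGET_GB = 128  # assumed memory budget, for illustration only
    for bits in (16, 8, 4):  # assumed precisions, not confirmed by the article
        size = weight_memory_gb(TOTAL_PARAMS, bits)
        verdict = "fits" if fits(TOTAL_PARAMS, bits, BUDGET_GB) else "does not fit"
        print(f"{bits:>2}-bit weights: {size:.0f} GB, {verdict} in {BUDGET_GB} GB")
```

Under these assumptions only the 4-bit variant (about 99 GB) fits, and the remaining headroom would still have to cover the context cache for long prompts.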
