Step3.7Flash officially launched today. The open-source model targets the core pain points of the Agent era: efficiency, reliability, and multimodal execution. Released with open weights under the Apache 2.0 license, it has quickly attracted industry attention.

Leading Benchmark Performance, Strong Practical Capabilities

Step3.7Flash has achieved outstanding results in multiple key evaluations:

  • First place in ClawEval-1.1 (67.1 points)
  • First place in SimpleVQA Search (79.2 points)
  • Second place in SWE-PRO (56.3 points)
  • A score of 95.3 on V* Python

These results show that it is highly competitive in complex scenarios such as Agent tasks, code generation, and visual search.

Core Parameters: A Balanced Achievement of Speed, Cost, and Capability

As a model designed specifically for agentic, coding, search, and multimodal workflows, Step3.7Flash makes significant gains in speed and efficiency:

  • Inference Speed: Up to 400 TPS (tokens per second)
  • Architecture: 198B-parameter sparse MoE, with about 11B active parameters
  • Context Length: Up to 256K tokens
  • Reasoning Level: Offers three levels of reasoning

While maintaining high performance, it significantly reduces actual deployment costs, providing developers with an efficient option.
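To put the figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes "TPS" means tokens per second and that 400 is a best-case peak; real throughput depends on hardware, batch size, and prompt length. The function names are illustrative only and not part of any Step3.7Flash tooling.

```python
# Rough arithmetic on the published specs. All inputs come from the article;
# the helper names are hypothetical and exist only for this illustration.
TOTAL_PARAMS_B = 198   # total parameters, in billions
ACTIVE_PARAMS_B = 11   # approximate active parameters, in billions
PEAK_TPS = 400         # assumed to mean peak tokens per second


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters that are active in a sparse MoE."""
    return active_b / total_b


def generation_seconds(num_tokens: int, tps: float) -> float:
    """Best-case time to generate num_tokens at a constant tps."""
    return num_tokens / tps


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active share of parameters: {share:.1%}")
    print(f"2,000-token answer at peak speed: {generation_seconds(2000, PEAK_TPS):.1f} s")
```

Under these assumptions, only about one eighteenth of the parameters are active at a time, which is the main reason a sparse MoE of this size can be cheaper to serve than a dense model with the same total parameter count.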

Multimodal Understanding + Reliable Execution, Truly "See and Do"

The biggest highlight of Step3.7Flash is its strong perception-action loop. It can understand user interfaces, charts, documents, and images, and then autonomously write code or call tools to act on what it sees.

Its enhanced Web+ visual search can draw on more information sources and supports in-depth follow-up queries. Tool-call reliability has also improved significantly: the model achieves a success rate of over 98% across all difficulty levels of τ²-bench, reducing common issues such as goal drift and tool-call failures.
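The perception-action loop described above can be pictured with a minimal, generic sketch. This is not Step3.7Flash's actual interface: `Action`, `run_agent`, `demo_policy`, and the two demo tools are hypothetical stand-ins, and the "model" here is a hard-coded function rather than a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    tool: Optional[str]  # None signals that the policy considers the task done
    argument: str = ""


def run_agent(
    policy: Callable[[str, List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    observation: str,
    max_steps: int = 5,
) -> List[str]:
    """Observe, pick a tool, act, and feed the result back as the next observation."""
    history: List[str] = []
    for _ in range(max_steps):
        action = policy(observation, history)
        if action.tool is None:
            break
        if action.tool not in tools:
            # Surface the failure to the policy instead of crashing the loop.
            observation = f"error: unknown tool {action.tool!r}"
        else:
            try:
                observation = tools[action.tool](action.argument)
            except Exception as exc:
                observation = f"error: {exc}"
        history.append(f"{action.tool}({action.argument}) -> {observation}")
    return history


# Hypothetical stand-in for the model: read a chart, then sum its values.
def demo_policy(observation: str, history: List[str]) -> Action:
    if not history:
        return Action("read_chart", "sales.png")
    if len(history) == 1:
        return Action("add_numbers", observation)
    return Action(None)


demo_tools = {
    "read_chart": lambda path: "3,4,5",  # pretend values extracted from an image
    "add_numbers": lambda csv: str(sum(int(x) for x in csv.split(","))),
}

if __name__ == "__main__":
    for step in run_agent(demo_policy, demo_tools, "start"):
        print(step)
```

In a loop like this, a failed or misdirected tool call costs a whole step and can derail the steps that follow, which is why tool-call success rates matter so much for multi-step agent tasks.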

Ecosystem Compatibility and Friendly Local Deployment

The model is compatible with mainstream agent frameworks such as Claude Code, KiloCode, Hermes Agent, and OpenClaw, as well as protocols like MCP. It can also run locally on hardware such as Mac Studio (M4 Max), DGX Spark, and AMD Ryzen AI Max+ 395, which suits local deployment and privacy-sensitive scenarios.

AIbase Commentary