The Baoling large model series under Ant Group received a major update today: Ling-2.6-flash is now officially open to developers worldwide. To suit different hardware environments and lower the deployment barrier, the model ships in multiple precision variants, including BF16, FP8, and INT4, giving developers more flexible inference options.
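The practical difference between the precision variants is weight storage. A back-of-the-envelope sketch, using the article's 104B total-parameter figure and the standard bytes-per-parameter for each format (it ignores KV cache, activations, and runtime overhead, which add to the real footprint):

```python
# Rough GPU-memory estimate for the model weights at each released
# precision. The 104B total-parameter count comes from the article;
# bytes-per-parameter values are standard for each format. This is a
# sketch, not a measured deployment footprint.

TOTAL_PARAMS = 104e9  # 104B parameters (from the article)

BYTES_PER_PARAM = {
    "BF16": 2.0,   # 16-bit brain float
    "FP8": 1.0,    # 8-bit float
    "INT4": 0.5,   # 4-bit integer, packed two per byte
}

def weight_footprint_gb(precision: str) -> float:
    """Approximate weight storage in gigabytes for a given precision."""
    return TOTAL_PARAMS * BYTES_PER_PARAM[precision] / 1e9

for p in BYTES_PER_PARAM:
    print(f"{p}: ~{weight_footprint_gb(p):.0f} GB of weights")
```

Roughly, INT4 quarters the weight memory relative to BF16, which is what makes deployment on smaller GPU configurations feasible.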

An Instruct model with 104B total parameters and 7.4B activated parameters, Ling-2.6-flash was previously tested on the OpenRouter platform under the anonymous codename "Elephant Alpha". During a two-week trial period, the development team collected a large amount of real-world feedback and made targeted optimizations, significantly improving the smoothness of natural switching between Chinese and English and strengthening compatibility with mainstream programming frameworks.


Technical Highlights: Hybrid Architecture and Extreme Efficiency

Ling-2.6-flash's core competitiveness lies in its unique architecture design and high operational efficiency:

  • Hybrid Linear Architecture: Thanks to low-level computational optimization, the model delivers excellent inference speed. On 4 H20 cards, decoding can reach up to 340 tokens/s, and Prefill (context pre-fill) throughput reaches 2.2 times that of Nemotron-3-Super, significantly reducing response latency.

  • Outstanding "Smart Efficiency Ratio": The team calibrated token efficiency in depth during training. According to the evaluation data, completing tasks at the same quality level costs Ling-2.6-flash only about 15M tokens, roughly one-tenth of comparable competitors, greatly reducing commercial costs.
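The two claims above compound: a smaller token budget and a faster decode rate both shrink wall-clock time. A quick arithmetic sketch, taking the 340 tokens/s and ~15M-token figures from the article and treating the "10x" competitor budget as illustrative rather than a measured benchmark:

```python
# Back-of-the-envelope check of the efficiency claims. The 340 tokens/s
# decode speed (4x H20) and the ~15M-token task budget come from the
# article; the 10x competitor budget reflects the "one-tenth" claim.
# Pure arithmetic, not a benchmark.

DECODE_TOKENS_PER_S = 340                   # reported decode speed
TASK_TOKENS_LING = 15_000_000               # ~15M tokens (reported)
TASK_TOKENS_OTHER = TASK_TOKENS_LING * 10   # "one-tenth of competitors"

def wall_clock_hours(tokens: int, tok_per_s: float) -> float:
    """Hours of pure decode time to emit `tokens` at `tok_per_s`."""
    return tokens / tok_per_s / 3600

ling_h = wall_clock_hours(TASK_TOKENS_LING, DECODE_TOKENS_PER_S)
other_h = wall_clock_hours(TASK_TOKENS_OTHER, DECODE_TOKENS_PER_S)
print(f"Ling-2.6-flash budget: ~{ling_h:.1f} h of decode time")
print(f"10x token budget:      ~{other_h:.1f} h at the same speed")
```

At the reported speed, the ~15M-token budget works out to roughly half a day of pure decode time, versus several days for a ten-times-larger budget.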

Scenario Deepening: Targeted Enhancement of Agent Capabilities

For agent scenarios, among the most widely used applications of large models, Ling-2.6-flash has received specialized enhancement. It performs stably across complex tool calls, multi-step logical planning, and final task execution. In several industry-standard evaluations such as BFCL-V4 and SWE-bench, it maintains comparable or even state-of-the-art (SOTA) results, even against models with larger activated parameter counts.
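The tool-calling loop referenced above can be sketched minimally: the model emits a structured call, the agent runtime dispatches it to a registered function, and the result is fed back for the next planning step. The tool names, JSON shape, and `run_tool` helper below are illustrative assumptions, not Ling-2.6-flash's actual API:

```python
# Minimal sketch of an agent tool-dispatch step. Everything here
# (registry, message shape) is a hypothetical illustration of the
# pattern, not the model's real tool-calling protocol.
import json

# Hypothetical tool registry: tool name -> callable taking an args dict.
TOOLS = {
    "add": lambda args: args["a"] + args["b"],
    "upper": lambda args: args["text"].upper(),
}

def run_tool(call_json: str) -> dict:
    """Dispatch one model-emitted tool call, given as a JSON string."""
    call = json.loads(call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        return {"error": f"unknown tool {call['name']!r}"}
    return {"result": fn(call["arguments"])}

# A call the model might emit during multi-step planning:
print(run_tool('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
```

Benchmarks like BFCL-V4 essentially score how reliably a model produces well-formed calls like the one above across many tools and multi-turn plans.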

Currently, developers can access the model's open-source resources through Hugging Face and ModelScope (the MoDa community) and explore its potential across industry applications.