On April 29, iFLYTEK officially launched its new Spark X2-Flash model and opened its API at the same time, marking a new stage of efficiency for large-model applications built on the domestic computing ecosystem.
The model adopts the now-mainstream MoE (Mixture of Experts) architecture with 30B total parameters, and its most notable feature is support for an ultra-long context of up to 256K tokens. Notably, Spark X2-Flash was trained entirely on Huawei's Ascend 910B cluster, demonstrating how domestic software and hardware can work together in deep-learning training.
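The core MoE idea referenced here can be illustrated with a minimal top-k routing sketch. This is a generic toy illustration of the technique, not iFLYTEK's implementation; every name, size, and the use of plain linear experts are invented for the example:

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=2):
    """Route a token to its top-k experts and mix their outputs.

    x: (d,) token vector; gate_w: (n_experts, d) router weights;
    expert_ws: list of (d, d) toy linear expert matrices.
    """
    logits = gate_w @ x                    # one router score per expert
    top = np.argsort(logits)[-k:]          # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only k experts run per token: this is why an MoE with a large total
    # parameter count costs far less compute per token than a dense model.
    return sum(w * (expert_ws[i] @ x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=d)
gate_w = rng.normal(size=(n_experts, d))
expert_ws = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, expert_ws)
print(y.shape)  # (8,)
```

The point of the sketch is the routing step: of the model's total parameters, only the parameters of the k selected experts are exercised for any given token.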

In terms of core performance, Spark X2-Flash shows significant gains in agent and code-generation capability. According to third-party test data, its performance on complex tasks such as producing in-depth research reports, managing and invoking Skills, and executing system-control commands already matches that of the industry's top trillion-parameter models.
On the cost question developers care most about, Spark X2-Flash also performs well: in the same workflow test, its token consumption is only one-third that of today's mainstream large models, substantially lowering the barrier to building complex agent applications. For example, when creating a complex video-generation skill, the model not only understood the requirements quickly but also explained everything from the skill's structure down to its core functions.

On the technical side, Spark X2-Flash is the first to combine DSA (Sparse Attention) and MTP (Multi-Token Prediction) on domestic chips. This innovation addresses slow long-text training on domestic computing platforms, improving training efficiency 4.5-fold over clusters of the same scale. In addition, for agent reinforcement-learning scenarios, the model has more than doubled sampling and inference efficiency through combined algorithmic and engineering optimization, easing performance bottlenecks in long-interaction scenarios.
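The general idea behind sparse attention is that each token attends to a small subset of keys rather than the full context, which is what makes very long contexts tractable. A minimal sliding-window sketch of that idea follows; this is one common sparse pattern chosen for illustration, and is not a description of DSA's actual attention pattern:

```python
import numpy as np

def sliding_window_attention(q, k, v, window=4):
    """Each query attends only to the last `window` keys instead of all n,
    reducing per-token cost from O(n) to O(window) -- the core trade-off
    that sparse attention makes to handle long contexts efficiently."""
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo = max(0, i - window + 1)                  # start of this token's window
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)   # scaled dot-product scores
        w = np.exp(scores - scores.max())
        w /= w.sum()                                 # softmax over the window only
        out[i] = w @ v[lo:i + 1]                     # weighted mix of windowed values
    return out

rng = np.random.default_rng(1)
n, d = 16, 8
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
out = sliding_window_attention(q, k, v)
print(out.shape)  # (16, 8)
```

Dense attention would compute an n-by-n score matrix; here each row computes at most `window` scores, which is why sparse variants scale to contexts in the hundreds of thousands of tokens.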
Currently, applications such as AstronClaw and Loomy are the first to complete integration. The model is also deeply compatible with mainstream international agent frameworks such as OpenClaw and Claude Code, giving developers worldwide a more cost-effective domestic computing solution.
