Recently, XiYan-SQL, a data analysis agent developed in-house by Alibaba Cloud's Feitian Lab, took first place on all open leaderboards of BIRD-CRITIC (also known as SWE-SQL), an authoritative global benchmark for SQL diagnosis. It outperformed multiple leading domestic and international teams and set a new industry record for SQL diagnosis and repair.

The BIRD-CRITIC benchmark was jointly launched by academia and Google Cloud to explore whether "large language models can solve user problems in real database applications." It turns common database errors, performance issues, and query requirements encountered in enterprises into test questions, covering mainstream database systems such as MySQL, PostgreSQL, SQL Server, and Oracle. The questions include not only simple queries but also complex insert, update, and delete operations, along with many scenarios the models have not encountered before, which makes the benchmark considerably harder than traditional "natural language to SQL" tests.
In this evaluation, XiYan-SQL ranked first on three key leaderboards: BIRD-CRITIC-1.0-Open, BIRD-CRITIC-PG, and BIRD-CRITIC-Flash. Its results were validated across multiple dimensions, including cross-dialect robustness, complex SQL handling, real-world problem repair rate, and out-of-distribution generalization.
Technically, XiYan-SQL combines methods such as schema filtering, multi-generator ensembling, candidate reorganization, and best-candidate selection, so that the SQL it generates is high in quality while also being executable and maintainable. The model can provide highly available diagnostic and repair solutions in real systems with dirty data, heterogeneous schemas, and cross-dialect differences.
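To make the pipeline described above more concrete, here is a minimal Python sketch of the general idea: filter the schema, collect candidates from several generators, then select one. It is not XiYan-SQL's actual implementation. The generator functions are hypothetical stand-ins for model calls, the keyword-based schema filter is a deliberate simplification, and the selection rule (majority vote over execution results) is an assumed technique chosen for illustration. SQLite is used only because it ships with Python.

```python
import sqlite3
from collections import Counter

def filter_schema(schema, question):
    """Keep only tables whose name or a column name appears in the question."""
    words = set(question.lower().replace("?", " ").split())
    kept = {t: cols for t, cols in schema.items()
            if t.lower() in words or any(c.lower() in words for c in cols)}
    return kept or schema  # fall back to the full schema if nothing matched

def run_candidate(conn, sql):
    """Execute a candidate; return a hashable result, or None if it fails."""
    try:
        return tuple(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None

def select_best(conn, candidates):
    """Return the first candidate whose execution result is the most common."""
    results = [(sql, run_candidate(conn, sql)) for sql in candidates]
    valid = [(sql, r) for sql, r in results if r is not None]
    if not valid:
        return None
    top_result, _ = Counter(r for _, r in valid).most_common(1)[0]
    return next(sql for sql, r in valid if r == top_result)

# Hypothetical generators: in a real system each would call a different model.
def gen_a(question, schema):
    return "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"

def gen_b(question, schema):
    return ("SELECT customer, SUM(amount) AS total FROM orders "
            "GROUP BY customer ORDER BY customer")

def gen_c(question, schema):
    # Deliberately broken (misspelled column) to show non-executable SQL being dropped.
    return "SELECT customer, SUM(amout) FROM orders GROUP BY customer"

# Toy database and schema, made up for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
CREATE TABLE logs (id INTEGER, message TEXT);
INSERT INTO orders VALUES (1, 'a', 10.0), (2, 'b', 25.0), (3, 'a', 5.0);
""")
schema = {"orders": ["id", "customer", "amount"], "logs": ["id", "message"]}
question = "What is the total amount per customer in orders?"

small_schema = filter_schema(schema, question)
candidates = [g(question, small_schema) for g in (gen_a, gen_b, gen_c)]
best = select_best(conn, candidates)
print(list(small_schema), best)
```

In this toy run the unrelated `logs` table is filtered out, the candidate with the misspelled column is discarded because it does not execute, and the two remaining candidates agree on their result, so the first of them is selected.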
The generative business intelligence (GBI) product "XiYan," built on XiYan-SQL technology, is now available on Alibaba Cloud's BaiLian platform, offering SQL generation and diagnostic services.
Key Points:
🔍 XiYan-SQL won first place in the BIRD-CRITIC evaluation, surpassing many top teams.
📊 The evaluation covers various mainstream databases, with a difficulty level higher than traditional SQL generation tests.
💻 The related technologies and models are open-sourced, so developers can try them out and contribute.
