
Aliyun XiYan-SQL Ranks First on the Global SQL Diagnosis Evaluation Leaderboards

Published in Latest AI News

Time: Dec 5, 2025

Read: 4 minutes

Recently, XiYan-SQL, the data analysis agent developed in-house by Alibaba Cloud's Feitian Lab, took first place on all open leaderboards of BIRD-CRITIC (also known as SWE-SQL), a widely recognized global benchmark for SQL diagnosis. It outperformed multiple leading teams in China and abroad and set a new record for SQL diagnosis and repair.

The BIRD-CRITIC benchmark was jointly launched by academic researchers and Google Cloud to explore whether "large language models can solve user problems in real database applications." It turns common database errors, performance issues, and query requirements encountered in enterprises into evaluation questions, and covers mainstream database systems such as MySQL, PostgreSQL, SQL Server, and Oracle. The questions include not only simple queries but also complex insert, update, and delete operations, along with many new scenarios the models have not seen before, which makes the benchmark considerably harder than traditional "natural language to SQL" tests.

In this evaluation, XiYan-SQL ranked first on three leaderboards: BIRD-CRITIC-1.0-Open, BIRD-CRITIC-PG, and BIRD-CRITIC-Flash. The results validate its performance along several dimensions, including cross-dialect robustness, complex SQL handling, repair rate on real problems, and out-of-distribution generalization.

Technically, XiYan-SQL uses methods such as schema filtering, multi-generator integration, candidate reorganization, and optimal selection, so that the SQL it generates is high in quality while also being executable and maintainable. This allows the model to provide highly available diagnosis and repair solutions in real systems with dirty data, heterogeneous schemas, and cross-dialect differences.
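To make the idea of schema filtering, multiple generators, and candidate selection more concrete, here is a minimal Python sketch. It is not XiYan-SQL's actual implementation: the function names, the keyword-overlap filter, the three stub "generators" with fixed outputs, and the rule of picking the result that most executable candidates agree on are all assumptions made for illustration. It runs against an in-memory SQLite database.

```python
import sqlite3
from collections import Counter

def filter_schema(question, schema):
    """Hypothetical filter: keep tables whose name or columns appear in the question."""
    words = set(question.lower().replace("?", " ").split())
    kept = {
        table: columns
        for table, columns in schema.items()
        if table.lower() in words or any(c.lower() in words for c in columns)
    }
    return kept or schema  # fall back to the full schema if nothing matched

# Stub generators standing in for different models or prompts (assumed, fixed outputs).
def gen_a(question, schema):
    return "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"

def gen_b(question, schema):
    return "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer ORDER BY customer"

def gen_c(question, schema):
    return "SELECT customer, SUM(amout) FROM orders GROUP BY customer"  # typo: not executable

def select_best(conn, candidates):
    """Run each candidate, drop failures, return one whose result the most candidates share."""
    results = {}
    for sql in candidates:
        try:
            results[sql] = tuple(conn.execute(sql).fetchall())
        except sqlite3.Error:
            continue  # discard candidates that do not execute
    if not results:
        return None
    winner, _ = Counter(results.values()).most_common(1)[0]
    return next(sql for sql, rows in results.items() if rows == winner)

# Toy database used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount INTEGER)")
conn.execute("CREATE TABLE logs (ts TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ann", 10), (2, "ann", 5), (3, "bob", 7)],
)

schema = {"orders": ["id", "customer", "amount"], "logs": ["ts", "message"]}
question = "total amount per customer in orders"

filtered = filter_schema(question, schema)
candidates = [gen(question, filtered) for gen in (gen_a, gen_b, gen_c)]
best = select_best(conn, candidates)
print(filtered)
print(best)
```

In this toy setup the unrelated `logs` table is filtered out, the candidate with the misspelled column is discarded because it fails to execute, and one of the two agreeing candidates is returned. A production system would presumably use learned components for each stage rather than these simple heuristics.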

Currently, "XiYan," a generative business intelligence (GBI) product built on XiYan-SQL technology, is available on Alibaba Cloud's BaiLian platform, where it offers SQL generation and diagnosis services.

Key Points:
🔍 XiYan-SQL won first place in the BIRD-CRITIC evaluation, surpassing many top teams.
📊 The evaluation covers various mainstream databases and is harder than traditional SQL generation tests.
💻 The related technologies and models are open source, so developers can try them out and contribute.
