Kunlun Wanwei releases and open-sources Skywork-SWE-32B: Leading a new trend with an open-source software engineering agent model

On June 20, Kunlun Wanwei officially released and open-sourced Skywork-SWE-32B, its self-developed base model for autonomous code agents in software engineering. The company describes it as the strongest code-repair model in the industry at the 32B parameter scale. To train it, the Kunlun Wanwei team built more than 10,000 verifiable task instances from GitHub repositories, which it describes as the largest verifiable dataset of its kind currently available, and used the dataset to systematically verify a data scaling law for large models on software engineering tasks.

Skywork-SWE-32B achieved a pass@1 accuracy of 38.0% on the SWE-bench Verified benchmark, the best result reported for models based on Qwen2.5-Coder-32B under the OpenHands agent framework. With test-time scaling, accuracy rose further to 47.0%, surpassing existing open-source models with 32B parameters or fewer and narrowing the gap with some closed-source models.
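For readers unfamiliar with the two metrics above, here is a minimal sketch of how they are commonly computed. The article does not say which form of test-time scaling the team used, so the best-of-N selection below is only one common variant, and the function names, patch labels, and verifier scores are hypothetical. The 500-task evaluation size is an assumption used for the arithmetic.

```python
from typing import Callable, Sequence


def pass_at_1(resolved: Sequence[bool]) -> float:
    """Share of tasks resolved by the single patch submitted for each task."""
    if not resolved:
        raise ValueError("need at least one task result")
    return sum(resolved) / len(resolved)


def best_of_n(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the candidate patch that the scoring function ranks highest."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


# Illustration only: 190 of 500 tasks resolved corresponds to 38.0%.
single_attempt = [True] * 190 + [False] * 310
print(f"pass@1 = {pass_at_1(single_attempt):.1%}")

# Hypothetical verifier scores for three sampled patches of one task.
toy_scores = {"patch_a": 0.2, "patch_b": 0.9, "patch_c": 0.5}
chosen = best_of_n(list(toy_scores), toy_scores.__getitem__)
print(chosen)
```

In a best-of-N setup the extra accuracy comes from spending more compute at inference time: several patches are sampled per task and only the highest-ranked one is submitted, so the reported score is still one submission per task.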

To address shortcomings in the mainstream datasets for SWE tasks, the Kunlun Wanwei team built an automated three-stage pipeline for collecting and verifying training data. In the data collection stage, the team used the GitHub API to gather information from more than 150,000 open-source repositories and, after a series of strict filtering steps, retained 23,389 task samples. In the verification stage, the team used unified command generation and Docker-based environment construction to confirm that each task sample was valid, which left 10,169 high-quality samples.
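The article does not spell out what makes a task sample valid. A criterion widely used for SWE-bench-style data is that at least one test fails before the reference patch is applied and passes afterwards. The sketch below assumes that criterion; `RunResult`, `fail_to_pass`, and `is_valid_instance` are hypothetical names, and the test outcomes are passed in as plain dictionaries rather than collected from a real Docker run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Test outcomes (test name -> passed?) before and after the reference patch."""
    before_patch: dict[str, bool]
    after_patch: dict[str, bool]


def fail_to_pass(run: RunResult) -> list[str]:
    """Tests that pass after the patch but did not pass before it."""
    return sorted(
        name
        for name, passed in run.after_patch.items()
        if passed and not run.before_patch.get(name, False)
    )


def is_valid_instance(run: RunResult) -> bool:
    """Assumed rule: keep the task only if the patch fixes at least one test."""
    return len(fail_to_pass(run)) > 0


# Toy example: the patch fixes test_parse and leaves test_io passing.
run = RunResult(
    before_patch={"test_parse": False, "test_io": True},
    after_patch={"test_parse": True, "test_io": True},
)
print(fail_to_pass(run), is_valid_instance(run))
```

Running every candidate in its own container keeps the before and after runs reproducible, which is presumably why environment construction is part of the verification stage.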

In the agent trajectory generation stage, the team ran the open-source OpenHands framework with commercial large models as the backbone, executing multiple rounds of interaction for each task and recording the agent's full problem-solving process. This produced 8,209 high-quality validated trajectories, which served as the training data for Skywork-SWE-32B.
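The counts reported for the pipeline can be put side by side with a few lines of arithmetic. This is only a sketch using the figures quoted in the article; the stage labels are paraphrased, and the last ratio compares trajectories to task samples, which need not map one to one.

```python
def stage_ratios(stages: list[tuple[str, int]]) -> list[tuple[str, int, float]]:
    """For each stage after the first, return its count and its ratio to the previous stage."""
    rows = []
    for (_, prev_count), (name, count) in zip(stages, stages[1:]):
        rows.append((name, count, count / prev_count))
    return rows


# Figures as quoted in the article.
pipeline = [
    ("task samples after filtering", 23_389),
    ("samples passing verification", 10_169),
    ("validated agent trajectories", 8_209),
]

for name, count, ratio in stage_ratios(pipeline):
    print(f"{name}: {count} ({ratio:.1%} of previous stage)")
```

By these figures, a little under half of the filtered samples survive verification, and the number of validated trajectories is roughly four fifths of the number of verified samples.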

The release of Skywork-SWE-32B adds momentum to the development of software engineering agents and demonstrates the model's potential in complex development scenarios.

Blog address 🔗

https://quixotic-sting-239.notion.site/eb17f379610040ceb54da5d5d24065bd

HuggingFace address 🔗

https://huggingface.co/Skywork/Skywork-SWE-32B

Key points:

🌟 Skywork-SWE-32B achieved a pass@1 accuracy of 38.0% on the SWE-bench Verified benchmark, a new best among 32B open-source models.

📈 With test-time scaling, the model's accuracy rose to 47.0%, significantly narrowing the gap with closed-source models.

🔍 Kunlun Wanwei built an automated pipeline that produced a dataset of more than 10,000 high-quality, verifiable SWE task instances, laying the groundwork for model training.

Study: Global AI Chipset Market to Exceed $700B with 31.8% CAGR

According to TMR Research, the global artificial intelligence chipset market is expected to exceed $700 billion, with a compound annual growth rate of 31.8% from 2022 to 2031. The article covers development trends, application areas, and key players in the AI chipset market.

IBM Research: How AI & Automation Protect Businesses from Data Breaches

IBM's report presents evidence that artificial intelligence, automation, and threat intelligence can help address data breaches throughout their lifecycle and reduce costs. The research found that integrating AI and automation into security operations can shorten the data breach lifecycle by 33% and cut costs by 33.6%. However, only 28% of enterprises currently apply AI and automation extensively, and many still rely on legacy systems that attackers can easily bypass. The article stresses the effectiveness of AI and automation in improving cybersecurity and calls on enterprises to adopt these technologies widely to protect their data.

Google's AGI Robot Breakthrough: 54-Member Team's 7-Month Work, High Generalization and Reasoning

The robotics research team at Google DeepMind recently released a robotics project called RT-2. The project took 7 months to develop and is trained with a large model. RT-2 has capabilities such as symbol understanding, reasoning, and human recognition, and it can reason about and complete tasks based on human instructions. By combining the large model with the robot's ability to act, RT-2 can handle tasks that require a logical leap, such as going from 'extinct animals' to 'plastic dinosaurs'. The project performed well across various subcategory tests, with performance up to three times that of the previous generation of robot models. The result demonstrates the potential of large models in robotics research and is expected to drive future robot development.

RWKV: Small Team Aims to Be Android of AI Era with Big Model

Meta Intelligence OS is a startup founded by Peng Bo. It has developed a series of large models based on the open-source RWKV model and aims to become the Android of the large-model era. RWKV offers strong performance at low cost in inference tasks, which has attracted customers in finance, legal services, and smart hardware. The company's business model is model customization on private data and internal AI agent development. It hopes to solve the problems of API call latency and data security by deploying large models on terminal devices. RWKV versions are currently available for Windows, Mac, and Linux computers, and Android and iOS versions are in development. Meta Intelligence OS is raising funds and working with chip companies and computing power platforms to establish benchmark customers. Luo Xuan said that the decisive battlefield for large models is hardware, and that both terminal devices and the cloud require dedicated chips.

Huawei Unveils New Harmony Agents, Over 50 Applications to be Launched Soon

With Harmony agents, the way consumers interact with HarmonyOS and its applications will change fundamentally. Harmony agents offer system-level security, trustworthiness, and personalized control. They support efficient collaboration among multiple agents and natural handoff across multiple devices, shifting interaction from the traditional instruction-centered model to an intention-centered one.
