AI2 Releases Dolma, an Open-Source Dataset of 3 Trillion Tokens for Large Language Models


A team from Carnegie Mellon University has developed a real-time error correction system for 3D printing based on a large language model. The system is modeled on an orchestra, with a 'conductor' agent coordinating four specialized agents to automatically detect and correct errors caused by small parameter fluctuations during printing, addressing the tendency of traditional open-loop systems to fail.
Neusoft Group has entered into a strategic partnership with Cerence AI to jointly develop a next-generation intelligent cockpit platform focused on intelligent voice and large language model technologies, providing efficient, pre-integrated intelligent interaction solutions for automotive manufacturers worldwide. Neusoft will combine its NAGIC intelligent cockpit platform with Cerence AI's technology to meet the growing market demand for intelligent cockpits.
BrowserUse unveils BU-30B-A3B-Preview, a 30B-parameter MoE model for web automation that balances high performance with lightweight operation to reduce AI browser costs.
Japanese data scientist Takahito Honda has launched Sui, an open-source programming language aimed at solving the accuracy issues of code generated by large language models; he claims it can achieve 100% accuracy. Its design is inspired by the Japanese aesthetic of "Wabi-Sabi", emphasizing refinement and the elimination of redundancy, with core principles that include ensuring a zero syntax error rate and using numbers as variables.
Ant Technology Research Institute has launched the LLaDA2.0 series, which includes 16B and 100B versions; the 100B version is the industry's first 100-billion-parameter discrete diffusion large language model. The model breaks through the scalability bottleneck of diffusion models, significantly improves generation quality and inference speed, and offers a new direction for the field.