Recently, Mingshi Intelligence, in collaboration with Tsinghua University and the OpenBMB open-source community, released and open-sourced BitCPM-CANN, which it describes as China's first ternary (1.58-bit) large model trained on Huawei's Ascend platform. The release is presented as a significant advance in low-bit large model training and another milestone for China's artificial intelligence technology.
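The "1.58-bit" label follows from the ternary representation itself: a weight limited to three values carries log2(3), or about 1.585, bits of information. The sketch below illustrates the idea under stated assumptions. The announcement does not describe BitCPM-CANN's actual quantization procedure, so the absmean rounding rule (borrowed from published ternary schemes such as BitNet b1.58) and the function name `ternarize` are illustrative only.

```python
import math

def ternarize(weights, eps=1e-8):
    """Map float weights to {-1, 0, 1} plus one shared scale.

    Assumption: absmean scaling followed by round-and-clip. This is an
    illustrative rule, not the confirmed BitCPM-CANN method.
    """
    # Shared scale: mean absolute value of the weights
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round each scaled weight to the nearest integer, then clip to [-1, 1]
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale

# Three possible values per weight -> log2(3) bits of information
bits_per_weight = math.log2(3)

weights = [0.9, -0.05, -1.3, 0.4, 0.02, -0.6]
ternary, scale = ternarize(weights)
print(ternary)                    # [1, 0, -1, 1, 0, -1]
print(round(bits_per_weight, 3))  # 1.585
```

Multiplying each ternary value by `scale` gives the approximate weight used at inference time, so only the small integer codes and one scale per group need to be stored.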

The release of BitCPM-CANN demonstrates the capabilities of domestic computing platforms and delivers a fully native development chain, from quantization operators to training algorithms. The model comes in four sizes: 0.5B, 1B, 3B, and 8B. In comparative evaluations against the full-precision MiniCPM4 models of the same sizes, the results are described as encouraging. At inference time, BitCPM-CANN delivers a memory saving of roughly six times, which means an 8B-parameter model can run smoothly on current mainstream flagship smartphones.
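To put the roughly sixfold saving in perspective, here is a back-of-the-envelope calculation for the 8B model. It is a sketch under assumptions: the full-precision baseline is taken to be 16 bits per weight, only weights are counted (no activations or KV cache), and the helper name `weight_memory_gb` is hypothetical. Only the 8B size and the factor of about six come from the announcement.

```python
import math

def weight_memory_gb(num_params, bits_per_weight):
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

params_8b = 8e9

# Assumed baseline: 16-bit weights
baseline_gb = weight_memory_gb(params_8b, 16)

# Footprint implied by the reported factor of about six
reported_gb = baseline_gb / 6

# Theoretical floor if every weight took exactly log2(3) bits
ideal_gb = weight_memory_gb(params_8b, math.log2(3))

print(f"16-bit baseline : {baseline_gb:.2f} GB")
print(f"reported (~6x)  : {reported_gb:.2f} GB")
print(f"ternary floor   : {ideal_gb:.2f} GB")
```

Under these assumptions the weights shrink from 16 GB to under 3 GB, which is within the memory of current flagship phones. The reported factor sits below the theoretical ceiling of about ten times; the announcement does not explain the gap.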

According to the official introduction, Mingshi Intelligence built a complete low-bit training foundation on top of MindSpeed and Megatron-LM, covering engineering work such as environment adaptation, support for 32K long sequences, parallel strategies, and integrated operators. Future low-bit training work targeting Ascend can build on this public infrastructure, which lowers the development barrier and speeds up iteration.

To encourage adoption of the technology, all BitCPM-CANN model weights have been open-sourced and are available on the HuggingFace and ModelScope platforms, giving developers a starting point for new AI applications.

In summary, the release of BitCPM-CANN marks a solid step forward for China in the field of AI large model training, paving the way for future intelligent applications.