At the 28th China Beijing International Science and Technology Industry Expo, a large model that can "understand" Tibetan and compose content by reasoning natively in the language drew wide attention. Named "DeepZang," the model was developed by a research team from Tibet University. It demonstrates a breakthrough in multilingual AI processing and signals that minority languages are joining the global digital wave at an accelerating pace.
Mainstream general-purpose large models have long been trained with Chinese and English as their core languages, so their output in minority languages such as Tibetan often reads like stiff, unnatural translation. To break this deadlock, the development team collected nearly 70 million parallel Tibetan-Chinese sentences and gathered over 30,500 hours of voice data, fully covering the three major Tibetan dialect regions: U-Tsang, Kham, and Amdo.
What sets this large model apart is its "original language thinking" capability. At the expo, it proved highly practical: whether drafting a yak transaction contract, composing a poem praising parents, or providing professional nutritional advice, the AI gave responses that were both accurate and rich in the distinctive cultural flavor of the Tibetan language. More notably, by combining voiceprint recognition with dialect classification technology, it effectively addresses the communication difficulties caused by the significant differences among spoken Tibetan dialects. Even users with low literacy levels can interact with it easily by voice.
Technological advancement translates directly into improved productivity. Lobsang Dunyu, a translator working in Shannan, Tibet, said that with AI-assisted translation, a document that previously took three people working together 40 minutes to complete can now be finished by one person in just over 20 minutes. The user base of "DeepZang" has now exceeded 300,000, with more than 70% of users aged between 18 and 40, and reaches remote areas in Tibet, Qinghai, Sichuan, and Gansu.
Despite its impressive performance, the commercialization of Tibetan AI still faces challenges such as high computing power costs and financial pressure. In response, officials said that the team was participating in the expo for the first time in order to find like-minded partners to jointly overcome challenges related to computing power and business cycles. As 5G networks and power infrastructure improve throughout Tibet, Tibetan AI is expected to become a solid bridge connecting Tibetan speakers with the modern digital world.
