G42, an artificial intelligence group based in Abu Dhabi, officially launched NANDA 87B on Tuesday, an open-source Hindi-English large language model with 87 billion parameters and an upgrade from its earlier model, NANDA. NANDA 87B is now available with open weights on the Hugging Face page of Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), so developers, creators, and businesses can freely use and build on it.

The model was developed by MBZUAI in collaboration with Inception, a G42 company, and chip manufacturer Cerebras. NANDA 87B is built on the Llama-3.1 70B model and was trained on more than 65 billion Hindi tokens, using a Hindi-specific tokenizer to improve training and inference efficiency.

Manu Jain, CEO of G42 India, said: "India should have world-class technology that can speak its language. NANDA 87B is an important step toward this goal." He added that the model is designed to support innovation across the Indian AI ecosystem, including education, entertainment, and enterprise.

G42 stated that NANDA 87B is designed to handle formal Hindi, everyday spoken Hindi, and code-mixed Hindi-English (Hinglish), and can perform tasks such as translation, summarization, instruction following, and transliteration. The company also said that safety and cultural alignment were considered during the model's design to support responsible output.

Richard Morton, Executive Director of the Foundation Models Institute at MBZUAI, said the release significantly expands access to advanced language technologies. "NANDA represents an important milestone in providing high-quality, open-access language technology for one of the largest language communities in the world," he said.

NANDA 87B was trained on the Condor Galaxy supercomputing system, jointly developed by G42 and Cerebras.

Key Points:

🌟 NANDA 87B is an open-source Hindi-English language model with 87 billion parameters, launched by G42 to support technological development in India.

💻 The model supports formal Hindi, everyday spoken Hindi, and Hinglish, and can perform translation, summarization, instruction following, and transliteration.

🔍 The release expands access to advanced language technologies for one of the world's largest language communities.