SALMONN Framework: Expanding General Auditory Capabilities of Large Language Models


Andrej Karpathy used AI to automatically score 930 Hacker News discussions from 2015, demonstrating that AI can analyze historical public discourse and prompting reflection on the future quality of online discussion.
Starcloud trained the nanoGPT model and ran inference on the Gemma model using satellites equipped with NVIDIA H100 GPUs in space, an important step in the development of space-based data centers.
On December 6 and 7, the 10th Advanced Forum on Language Services was held at Guangzhou University. During the event, the Cantonese Corpus Construction and Large Model Evaluation Lab launched the AI-DimSum Multimodal Cantonese Corpus Platform, which aims to overcome the digitization challenges Cantonese faces as a low-resource language. The platform is built around the needs of the Digital China initiative and the digitalization of Greater Bay Area culture, and provides a multimodal corpus to promote the protection and development of Cantonese in the era of artificial intelligence.
AWS unveiled four self-developed Nova 2 AI models at re:Invent 2025, covering text, image, video, and speech, with built-in web search and code execution, and claimed leading price-performance. Nova 2 Lite offers cost-effective inference and is said to outperform Claude Haiku 4.5 and GPT-5 mini at about half the cost, while Nova 2 Pro targets complex agent tasks.
vLLM-Omni is a multimodal inference framework that supports text, image, audio, and video inputs and outputs. It is designed to streamline multimodal inference and to serve next-generation omni-modal models.