Welcome to the "AI Daily" column, your daily guide to the world of artificial intelligence. Each day we round up the latest developments in AI, with a focus on developers, to help you track technology trends and innovative AI product applications.

Fresh AI products, click to learn more: https://app.aibase.com/zh

1. AisTech launches the world's first general real-time world model PixVerse R1, with up to 1080p quality

AisTech has launched PixVerse R1, the world's first general real-time world model. Built on three core technologies, it delivers real-time interactive experiences in virtual worlds and opens new possibilities for "everyone can create" in fields such as games, film, and live streaming.


AiBase Highlights:

🧠 An omni-native multimodal model serves as the "computational foundation" of the real world, unifying multimodal content into a continuous token stream to generate a digital world with consistent physical logic.

🔄 An autoregressive streaming generation mechanism solves long-horizon sequence consistency, enabling "streaming interaction" in storytelling (a minimal sketch follows these highlights).

⚡ The instant response engine (IRE) improves compute efficiency, underpinning the core "instant response" experience.
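AisTech has not published PixVerse R1's interfaces, so the following is only a minimal, self-contained sketch of the general idea behind autoregressive streaming generation: user actions and generated frames share one rolling token stream, each new frame is predicted from that history, and every frame is streamed as soon as it is decoded. All names are hypothetical and the model is a toy stand-in.

```python
# Toy sketch of autoregressive streaming world generation.
# Every class and method here is hypothetical; PixVerse R1's real interfaces are not public.
from collections import deque

class ToyWorldModel:
    """Stand-in for an autoregressive multimodal backbone."""

    def encode_action(self, action: str) -> list[int]:
        # Real model: tokenize the user's interaction into the shared stream.
        return [hash(action) % 1000]

    def generate(self, context: list[int], max_new_tokens: int) -> list[int]:
        # Real model: next-token prediction over a unified video+action token stream.
        return [(t + 1) % 1000 for t in context[-max_new_tokens:]]

    def decode_frame(self, frame_tokens: list[int]) -> str:
        # Real model: decode tokens back into pixels.
        return f"<frame rendered from {len(frame_tokens)} tokens>"

def stream_world(model: ToyWorldModel, actions: list[str], context_window: int = 4096):
    tokens: deque[int] = deque(maxlen=context_window)     # rolling token history
    for action in actions:
        tokens.extend(model.encode_action(action))        # 1. fold user input into the stream
        frame_tokens = model.generate(list(tokens), max_new_tokens=256)
        tokens.extend(frame_tokens)                       # 2. history keeps long-term consistency
        yield model.decode_frame(frame_tokens)            # 3. stream the frame immediately

for frame in stream_world(ToyWorldModel(), ["walk forward", "turn left"]):
    print(frame)
```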

2. Vidu launches a one-click AI MV generation feature, a "virtual production house" that delivers in minutes

Vidu launches an AI one-click MV generation feature, marking the entry of video creation into a fully automated end-to-end generation era. Users only need to provide background music, reference images, and text instructions to output high-quality MVs within minutes. This feature achieves full-process automation through a multi-agent system, significantly lowering the barrier for professional video creation and providing creators with an integrated virtual production house experience.


AiBase Highlights:

🎬 Fully automated multi-agent collaboration: four agents (director, storyboard, visual generation, and editing) automate the full process from music analysis to final output, as sketched after these highlights.

🖼️ Industrial-grade style consistency: supports up to seven reference images as anchors, so character and scene styles do not drift across a five-minute video.

🎵 Accurate audio-visual synchronization: AI can automatically identify background music rhythm and complete transitions while generating frame-by-frame synchronized dynamic subtitles, delivering a finished product within minutes.
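Vidu has not disclosed how its agents are actually wired together; the sketch below only illustrates the four-stage hand-off described in the highlights (director, storyboard, visual generation, editing), with every function name and data shape invented for illustration.

```python
# Toy sketch of a four-agent MV pipeline; all names and data shapes are hypothetical.
from dataclasses import dataclass

@dataclass
class Shot:
    start: float   # seconds into the track
    end: float
    prompt: str    # what the visual-generation agent should render

def director_agent(beats: list[float], brief: str) -> list[tuple[float, float]]:
    # Cut the song into segments at detected beat boundaries.
    return list(zip(beats[:-1], beats[1:]))

def storyboard_agent(segments: list[tuple[float, float]], brief: str) -> list[Shot]:
    return [Shot(s, e, f"{brief}, scene {i}") for i, (s, e) in enumerate(segments)]

def visual_agent(shots: list[Shot], reference_images: list[str]) -> list[str]:
    # Real system: a video model conditioned on reference images for style consistency.
    return [f"clip('{shot.prompt}', refs={len(reference_images)})" for shot in shots]

def editing_agent(clips: list[str], music: str) -> str:
    # Assemble clips on the music timeline (transitions, subtitles, export).
    return f"MV[{music}]: " + " | ".join(clips)

beats = [0.0, 2.5, 5.0, 7.5]                    # pretend beat-detection output
segments = director_agent(beats, brief="neon city at night")
shots = storyboard_agent(segments, brief="neon city at night")
clips = visual_agent(shots, reference_images=["ref1.png", "ref2.png"])
print(editing_agent(clips, music="track.mp3"))
```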

3. A new yardstick for coding agents! MiniMax releases the OctoCodingBench benchmark

MiniMax has open-sourced OctoCodingBench, a benchmark designed to evaluate how well coding agents follow instructions in code-repository environments. It provides a multidimensional evaluation framework by testing agents against seven different instruction sources and uses a binary checklist scoring mechanism to make results more precise (a minimal illustration follows the highlights below). OctoCodingBench also supports multiple scaffolds used in real production environments, such as Claude Code, Kilo, and Droid.


AiBase Highlights:

🧠 Evaluate the ability of programming agents to follow instructions

📊 Provide a multidimensional evaluation framework

🔧 Support multiple scaffold environments

More details: https://huggingface.co/datasets/MiniMaxAI/OctoCodingBench
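The dataset card above is the authoritative source for OctoCodingBench's schema; as a rough illustration of binary checklist scoring, the sketch below judges each requirement pass/fail and reports the fraction passed. The field names and the fraction-based aggregation are assumptions, not the benchmark's actual format.

```python
# Sketch of binary checklist scoring: each instruction-following requirement is
# checked pass/fail, and the task score is the fraction of checks passed.
# Field names and aggregation are assumptions, not OctoCodingBench's schema.
from dataclasses import dataclass

@dataclass
class CheckItem:
    description: str   # e.g. "agent ran the repo's test suite before committing"
    passed: bool       # binary judgment from a judge model or rule

def score_task(checklist: list[CheckItem]) -> float:
    if not checklist:
        return 0.0
    return sum(item.passed for item in checklist) / len(checklist)

checklist = [
    CheckItem("followed the repository's contribution conventions", True),
    CheckItem("obeyed the user's explicit file-scope restriction", False),
    CheckItem("kept the requested commit-message format", True),
]
print(f"task score: {score_task(checklist):.2f}")   # -> 0.67
```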

4. Kuaishou announces Kling AI's ARR has reached $240 million, with monthly revenue exceeding $20 million in December

Kuaishou Technology announced that Kling AI's monthly revenue exceeded $20 million in December 2025, for an annualized recurring revenue (ARR) of $240 million, underscoring its strong growth momentum in the generative AI market.


AiBase Highlights:

🚀 Kling AI's monthly revenue exceeds $20 million, for an annualized recurring revenue (ARR) of $240 million.

🛠️ Rapid, intensive technical iteration, with multiple model launches that improve professional creative efficiency.

🌍 Serves over 60 million users globally, with commercialization already landing across multiple fields.
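For readers unfamiliar with the metric, the two figures are consistent because ARR here is simply the latest monthly recurring revenue scaled to a full year:

```python
# ARR (annualized recurring revenue) as reported: December's monthly revenue times 12.
monthly_revenue_usd = 20_000_000
arr_usd = monthly_revenue_usd * 12
print(f"${arr_usd:,}")   # $240,000,000
```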

5. Domestic computing power + independent architecture! Zhipu collaborates with Huawei to open-source GLM-Image, the first multi-modal SOTA model to run fully on Ascend chips

Zhipu AI and Huawei have jointly open-sourced GLM-Image, which achieves internationally leading performance and marks a global first as a full-pipeline multimodal large model built on domestic AI chips. It uses a hybrid autoregressive + diffusion-decoder architecture to achieve deep alignment and joint reasoning over text and image semantics, pushing AIGC from "pixel stacking" toward being "semantic-driven" (a toy two-stage sketch follows the repository link below).


AiBase Highlights:

🧠 Hybrid architecture of autoregressive + diffusion decoder, achieving deep alignment and joint reasoning of text and image semantics.

🚀 The full pipeline runs on domestic AI chips, breaking dependence on foreign GPUs.

🌐 Driving AIGC from "pixel stacking" to "semantic-driven."

More details: https://github.com/zai-org/GLM-Image
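The repository above contains the real implementation; the toy sketch below only illustrates the two-stage idea from the highlights, in which an autoregressive model first plans semantic tokens and a diffusion decoder then renders pixels conditioned on them. All class names are invented.

```python
# Toy sketch of an autoregressive + diffusion-decoder image pipeline.
# Stage 1: an autoregressive model turns the prompt into semantic image tokens.
# Stage 2: a diffusion decoder "denoises" pixels conditioned on those tokens.
# These classes are stand-ins; see the GLM-Image repository for the real code.
import random

class ToyAutoregressivePlanner:
    """Stage 1 stand-in: maps the prompt to a sequence of semantic image tokens."""

    def plan(self, prompt: str, n_tokens: int = 64) -> list[int]:
        rng = random.Random(prompt)            # deterministic toy "planning"
        return [rng.randrange(8192) for _ in range(n_tokens)]

class ToyDiffusionDecoder:
    """Stage 2 stand-in: iterative denoising conditioned on the semantic tokens."""

    def decode(self, semantic_tokens: list[int], steps: int = 20) -> str:
        for _ in range(steps):                 # real decoder: one denoising step per loop
            pass
        return f"<image rendered from {len(semantic_tokens)} semantic tokens>"

planner, decoder = ToyAutoregressivePlanner(), ToyDiffusionDecoder()
tokens = planner.plan("a red bicycle leaning against a brick wall")
print(decoder.decode(tokens))
```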

6. The world's top medical large model Baichuan-M3 makes its debut: surpasses GPT-5.2, with impressive capabilities!

The domestically developed medical large model Baichuan-M3 has been officially released and is positioned as the strongest medical AI system in the world. Built by Baichuan Intelligence and focused on medical scenarios, the 235-billion-parameter model integrates a vast body of medical literature, clinical guidelines, real case records, and pharmaceutical knowledge bases. Its core advantage is an extremely low hallucination rate: when conducting medical consultations and giving medication advice, it is not only highly accurate but also effectively avoids spreading incorrect information. According to the published evaluation results, the model outperforms OpenAI's GPT-5.2 in consultation ability and medical accuracy and beats human doctors across all assessments.

Wang Xiaochuan, founder of Baichuan Intelligence, said the release will promote the joint building of a medical AI ecosystem: the open-source strategy is intended to draw more developers into medical AI innovation, with applications targeted at grassroots medicine, assisted diagnosis, and health management. Baichuan-M3 is already available on the Bai Xiao Ying platform, where users can obtain medication guidance and other medical assistance. This gives patients a more convenient consultation channel and offers strong support for doctors; as medical AI technology develops, models like Baichuan-M3 are expected to further improve the quality and efficiency of medical services and benefit more people.


AiBase Highlights:

🧠 Baichuan-M3 medical large model has 235 billion parameters, featuring an ultra-low hallucination rate to ensure the accuracy of medical consultations and medication advice.

🏥 Baichuan-M3 outperforms GPT-5.2 in consultation ability and medical accuracy, surpassing human doctors in all evaluations.

🌐 Baichuan Intelligence's open-source strategy encourages developers to participate in medical AI innovation, promoting the construction of the medical AI ecosystem.

7. Google redefines the future of e-commerce: launches an agentic AI shopping system in which Gemini CX plus the UCP protocol realizes "search and buy"

Google has launched an agentic AI shopping system that combines Gemini CX with the UCP protocol to deliver a seamless experience from search to purchase, redefining the future of e-commerce.


AiBase Highlights:

✅ Launches Agentic e-commerce solution, including UCP protocol and Gemini CX system, achieving a complete shopping cycle.

💡 Users can complete shopping tasks directly through Google search without needing to switch pages.

🌐 The UCP protocol establishes a standardized communication bridge between AI agents, merchants, and e-commerce platforms, and is compatible with existing industry standards (an illustrative, non-normative message sketch follows).
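Google's own specification is the only authoritative source for UCP; purely as an illustration of what a standardized agent-to-merchant purchase message could carry, here is a hypothetical payload in which every field name is invented.

```python
# Hypothetical illustration only: fields an agent-to-merchant purchase request
# might carry. None of these names come from UCP; consult Google's specification
# for the real wire format.
import json

purchase_request = {
    "protocol": "example-commerce/0.1",      # invented version string
    "agent": {"id": "shopping-agent-123", "on_behalf_of": "user-456"},
    "merchant": {"id": "merchant-789"},
    "items": [
        {"sku": "SKU-001", "quantity": 1,
         "max_unit_price": {"value": 59.99, "currency": "USD"}},
    ],
    "fulfillment": {"type": "shipping", "address_token": "opaque-address-ref"},
    "payment": {"method_token": "opaque-payment-ref", "requires_user_confirmation": True},
}

print(json.dumps(purchase_request, indent=2))
```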

8. Google strengthens its open-source medical AI ecosystem: MedGemma 1.5 enhances medical imaging capabilities, alongside the new speech-to-text model MedASR

Google has released MedGemma 1.5, a new open-source medical large model, together with the speech-to-text model MedASR, further strengthening its technology stack in the medical vertical. MedGemma 1.5 improves the understanding and analysis of medical images, evolving from a pure text Q&A tool into a multimodal clinical decision-support system, while MedASR targets medical voice scenarios to speed up electronic medical record entry. Both models are trained on de-identified clinical data and are released as open source for global researchers and developers (a usage sketch follows the highlights).


AiBase Highlights:

🧠 MedGemma 1.5 enhances medical imaging understanding and analysis, supporting multimodal clinical decision support systems.

🗣️ MedASR optimizes medical speech recognition, improving the efficiency of electronic medical record entry.

🔒 Google's open-source models follow privacy protection regulations, promoting the application of AI in grassroots medicine and research.
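Model identifiers and exact usage should be taken from Google's release pages; assuming the models ship as open weights on Hugging Face like earlier MedGemma releases, loading them would follow the standard transformers pipeline pattern. The model ids below are placeholders, not confirmed release names.

```python
# Sketch of loading open medical models with Hugging Face transformers.
# The model ids are placeholders; check Google's Hugging Face organization
# for the actual identifiers, licenses, and usage terms.
from transformers import pipeline

# Multimodal Q&A over a medical image (MedGemma-style usage).
vqa = pipeline("image-text-to-text", model="google/medgemma-placeholder")
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/chest_xray.png"},
        {"type": "text", "text": "Describe any abnormal findings."},
    ],
}]
answer = vqa(text=messages, max_new_tokens=128)

# Dictation to text for clinical notes (MedASR-style usage).
asr = pipeline("automatic-speech-recognition", model="google/medasr-placeholder")
transcript = asr("dictation.wav")

print(answer, transcript)
```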