Welcome to the 【AI Daily】 column! This is your daily guide to the world of artificial intelligence. Each day we bring you the latest highlights in AI, with a focus on developers, to help you track technology trends and learn about innovative AI product applications.

Discover new AI products: https://top.aibase.com/

1. Tencent Open-Sources Hunyuan 3D 2.1 Large Model

Hunyuan 3D 2.1, the first fully open-source industrial-grade 3D generation large model, significantly improves geometry generation quality and PBR material generation, lowers the barrier to development, and is applicable across multiple industries.

【AiBase Summary:】

✨ The first fully open-source industrial-grade 3D generation large model with significant improvements in geometric generation and PBR material generation effects.

🎮 Supports high-quality generation of 3D characters, props, and product models for gaming, film, e-commerce, and other areas, eliminating the traditional 'plastic feel'.

💻 Fully open-source and deployment-friendly, compatible with consumer-level GPUs, suitable for quick onboarding by individuals and teams.

Details link: https://3d-models.hunyuan.tencent.com/

2. OpenAI Codex Upgraded: Easier Access to Ideal Code for Programmers

OpenAI Codex has received a major update. It can now generate multiple versions of code for a task, refines details such as viewing loading progress and cancelling operations, and supports complex task handling, helping developers work more efficiently and focus on innovation.

【AiBase Summary:】

✨ Codex can now generate multiple code versions for the same request, meeting different needs and improving development efficiency.

🔧 Refines details such as viewing loading progress and cancelling operations, and fixes installation issues, making the tool more flexible to use.

🌟 Built on the optimized codex-1 model, it improves code generation accuracy and supports extracting code from GitHub repositories.

3. Li Hang, Head of ByteDance's AI Lab, Steps Down, Marking an Adjustment Period for Seed Team

Li Hang, head of ByteDance's AI Lab, has stepped down and moved into a consultant role, signaling a major personnel adjustment within ByteDance's core AI team. With the arrival of Wu Yonghui and Zhu Wenjia and the restructuring of the team, ByteDance's strategic direction in AI is gradually becoming clear.

【AiBase Summary:】

Li Hang's move from head of AI Lab to a consultant role marks a significant personnel adjustment at ByteDance's AI Lab.

Since 2020, AI Lab has gradually been transformed into a technical hub, and between 2023 and 2024 part of its large model team was integrated into the Seed team.

ByteDance's AI Lab, established in 2016, has undergone leadership changes and has gradually become a technical core supporting multiple businesses within ByteDance.

4. Microsoft Releases 700 Real-World AI Cases, Exploring New Intelligent Work Models

Microsoft showcased 700 AI application cases covering multiple industries, using AI technology to improve enterprise efficiency, optimize work experiences, and enhance customer satisfaction.

【AiBase Summary:】

🌍 Worldwide, Microsoft showcased 700 AI application cases covering finance, healthcare, education, and other sectors.

🤖 AI agents automate tasks, significantly reducing working time and boosting enterprise efficiency.

💼 Many enterprises leverage AI to enhance customer experience, driving business growth and operational optimization.

5. Microsoft AI Unveils Code Researcher: 58% Crash Resolution Rate Stuns the Industry!

Code Researcher significantly improves the efficiency and accuracy of system-level software maintenance through semantic analysis and multi-step reasoning. For developers, it promises simpler workflows and less time spent on manual debugging.

【AiBase Summary:】

🔍 Code Researcher is based on large language models (LLMs) and can deeply analyze code repositories and commit histories, trace crash root causes, and generate repair patches.

📈 In Linux kernel crash repair tests, Code Researcher's crash resolution rate reached 58%, far surpassing SWE-agent's 37.5%.

🌐 It applies to various large code repositories, providing efficient solutions for enterprise-level software maintenance and advancing the automation process of system-level software development.

Details link: https://www.microsoft.com/en-us/research/publication/code-researcher-deep-research-agent-for-large-systems-code-and-commit-history/

6. AI Supervisor on Duty! Observer AI Makes Screen Automation More Efficient, Freeing Your Hands

Observer AI is an AI framework designed for screen automation. It monitors screen content in real time and analyzes it intelligently, significantly improving operational efficiency and addressing the efficiency bottlenecks of traditional tools.

【AiBase Summary:】

Screen Real-Time Recording: Observer AI captures interface changes with high precision, ensuring no data is missed.

AI Intelligent Analysis: Built-in advanced algorithms quickly parse screen content, identifying task completion or potential issues.

Automation Response: Supports invoking MCP or custom schemes to automatically execute the next operation, achieving closed-loop automation.

Details link: https://github.com/Roy3838/Observer

7. Genspark AI Launches Revolutionary AI Browser, Opening the Era of Smart Web Browsing

Genspark AI Browser is a new type of browser integrating advanced AI technologies, enhancing user productivity through automation and intelligence. It features an embedded AI agent, offering ad-free and ultra-fast browsing experiences, and supports modular extensions. This browser shows great potential in academic research, business decision-making, and content creation.

【AiBase Summary:】

🌟 Genspark AI Browser embeds an AI agent, offering intelligent navigation and content analysis, such as automatically searching for the lowest price online.

💻 Supports MCP Store modular extensions, allowing users to meet diverse needs with customized AI tools.

🚀 Applicable in various scenarios including academic research, business decision-making, and content creation, enhancing information processing and task automation efficiency.

8. MIT Uses AI Technology to Quickly Repair a 15th-Century Masterpiece in Just Three and a Half Hours

MIT has developed an innovative restoration technique based on artificial intelligence, significantly shortening the restoration time of artworks through detachable masks and digital maps, thereby improving restoration efficiency.

【AiBase Summary:】

🎨 MIT develops new technology to restore masterpieces using AI, completing the process in just three and a half hours.

⏳ This technology shortens the restoration time from months to hours, greatly improving efficiency.

🖼️ Utilizing detachable masks and digital maps, the restoration process is safe and reversible, protecting the original artwork.

9. Ant Group and Inclusion AI Jointly Release Ming-Omni: The First Open-Source Multimodal Model to Rival GPT-4o

Ming-Omni is a multimodal model jointly launched by Ant Group and Inclusion AI, capable of processing images, text, audio, and video. It supports speech and image generation as well as fused processing of multimodal inputs, and it is open source to promote research and development.

【AiBase Summary:】

🌟 Supports multimodal input fusion processing without additional models or specific task fine-tuning, efficiently completing diverse tasks.

🗣️ Provides voice and image generation functions, supports dialect understanding, voice cloning, and context-aware dialogue, enhancing human-machine interaction experience.

🌐 The first open-source multimodal model rivaling GPT-4o, inspiring community research and development, and promoting technological progress.

Details link: https://lucaria-academy.github.io/Ming-Omni/

10. MagicTryOn: A Video Virtual Try-On Framework Based on the Wan2.1 Video Model

MagicTryOn is a virtual try-on framework based on a large video diffusion transformer, excelling in dynamic scenes with innovative model design and clothing retention strategies, significantly improving the spatiotemporal consistency of video virtual try-ons.

【AiBase Summary:】

🌟 MagicTryOn uses diffusion transformers, significantly improving the spatiotemporal consistency of video virtual try-ons.

👗 Introduces a coarse-to-fine clothing retention strategy, enhancing clothing detail representation.

🎥 Performs excellently in dynamic motion scenes, showcasing natural interactions between clothing and body movements.

Details link: https://vivocameraresearch.github.io/magictryon/

11. ByteDance's Seaweed APT2 Makes a Stunning Debut! Real-Time Interactive AI Video Generation, Unlocking a New Era of 3D Virtual Worlds

ByteDance's newly released Seaweed APT2 is an efficient AI video generation model with real-time video stream generation, interactive camera control, and virtual human generation capabilities, seen as a significant step toward a virtual holodeck.

【AiBase Summary:】

✨ Seaweed APT2 adopts autoregressive adversarial post-training, significantly reducing computational complexity and enabling efficient real-time video generation.

🎥 Supports real-time 3D world exploration and interactive virtual human generation, applicable in scenarios such as virtual anchors and game characters.

🌟 Compared to traditional models, Seaweed APT2 shows significant improvements in action coherence and scene diversity, opening a new chapter in AI video generation.

12. OpenAI Upgrades ChatGPT Search Functionality, Providing More Precise and Intelligent Responses

The ChatGPT Search upgrade improves both search quality and user experience. The newly added image search and the upgraded project management function in particular make ChatGPT more powerful and practical.

【AiBase Summary:】

🔍 Adds image search functionality, supporting diverse interaction methods.

📚 Upgrades the project management function, helping users manage conversations and documents efficiently.

🌐 Challenges Google's dominant position, providing a more efficient and user-friendly search experience.

13. ByteDance's Volcano Engine Clarifies Rumors About Collaborating with Lao Feng Xiang on AI Smart Glasses

This article examines rumors that ByteDance's Volcano Engine is collaborating with the Chinese jewelry brand Lao Feng Xiang to develop AI smart glasses, analyzing both parties' statements and the functionality actually demonstrated.

【AiBase Summary:】

Volcano Engine denies collaborating with Lao Feng Xiang to develop AI smart glasses, although the glasses shown by Lao Feng Xiang do use the Doubao large model.

The Lao Feng Xiang AI glasses are designed for elderly users and offer practical functions such as voice navigation and real-time translation.

The Doubao large model is a publicly available product, and any compliant customer can purchase it and use it in their own devices.