AI Daily: ByteDance Open-Sources Unified Multimodal Large Model Lance 3B; Zhipu Launches GLM-5.1 High-Speed Version; CapCut Collaborates with Gemini for Deep Integration

Welcome to "AI Daily," your daily guide to the world of artificial intelligence. Each day we round up the latest news in the AI field with a focus on developers, helping you keep up with technical trends and innovative AI product applications.

Hot AI products, click to learn more: https://app.aibase.com/zh

1. ByteDance Open-Sources Lance 3B: A Single "Brain" That Handles Both Visual and Text Understanding and Generation

ByteDance has open-sourced Lance, a natively unified multimodal large model that delivers its full feature set with 3B parameters and breaks down the technical barrier between understanding models and generation models. Through a shared context and a parallel design that decouples capabilities, Lance unifies image and video understanding, generation, and cross-modal editing.

AiBase Highlights:
✨ Lance uses a shared context and a capability-decoupled parallel design to unify multimodal tasks.
🚀 Delivers its full feature set with 3B parameters, breaking through the traditional barrier between understanding and generation models.
🔧 Open-sourced under the Apache 2.0 license and able to run on low-cost compute, which lowers deployment costs.

2. Zhipu Releases GLM-5.1 High-Speed Version: 400 Tokens/s Sets a New Global API Speed Record

Zhipu released a high-speed version of the GLM-5.1 API that outputs 400 tokens/s, setting a new global speed record for large model APIs while combining full flagship-level capabilities with ultra-low latency. The speedup comes from system-level engineering optimization, and it is intended to make AI applications more efficient to build and run.

AiBase Highlights:
🧠 Zhipu's GLM-5.1 high-speed API reaches an output speed of 400 tokens/s, setting a new global speed record for large model APIs.
🚀 Combines full flagship-level capabilities with ultra-low latency, breaking with industry convention.
🔧 Improves performance through system-level engineering optimization, with the inference engine, scheduling system, and infrastructure layer optimized together.
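
To put the 400 tokens/s figure in perspective, the short Python sketch below converts an output rate into streaming time for a response of a given length. It assumes a constant output rate and ignores time-to-first-token and network overhead; the function name and the response sizes are hypothetical and are not part of any Zhipu API.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float = 400.0) -> float:
    """Seconds needed to stream num_tokens at a steady output rate.

    Assumption: the rate is constant, and time-to-first-token and
    network overhead are ignored.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical response sizes, chosen only for illustration.
    for n in (200, 2_000, 8_000):
        print(f"{n} tokens -> {generation_seconds(n):.1f} s")
```

Under these assumptions, a 2,000-token answer streams in about 5 seconds at 400 tokens/s.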

3. CapCut and Gemini Launch Deep Integration: AI Creation Tools Become Interconnected

CapCut has partnered with the Google Gemini app so that users can access CapCut's advanced creative and editing features directly inside Gemini, further promoting the adoption of AI tools in content creation.

AiBase Highlights:
🚀 CapCut has partnered with the Google Gemini app, letting users access CapCut's advanced creative and editing features directly inside Gemini.
💡 The collaboration aims to make AI-assisted creation smoother and more efficient by reducing the cost of switching between apps.
🌟 CapCut says future creative workflows will be more conversational, more intuitive, and more tightly integrated.

4. OpenAI Launches ChatGPT for PowerPoint: Generate a Deck from One Sentence, with Proactive Issue Spotting

OpenAI launched the ChatGPT for PowerPoint plugin, which lets users generate and refine slide content from simple instructions and can also analyze and revise existing decks, greatly improving office productivity.

AiBase Highlights:
✨ The ChatGPT for PowerPoint plugin is free for users worldwide, with no barrier to entry.
💡 Supports creating a deck from scratch, one-click revision and polishing of slides, and even "reviewing" the plan.
🔒 Adds a confirmation step for key operations so that every modification stays under the user's control.

5. WordPress 7.0 Released: Native AI Integration Opens a New Era of Intelligent Website Building

WordPress 7.0 has been officially released with native AI integration, marking a move toward intelligent website building. The new version brings comprehensive upgrades to content creation, the admin interface, and the mobile experience, giving users a smoother and more efficient way to build and edit sites.

AiBase Highlights:
🧠 Integrates AI natively to make content creation more efficient.
🎨 Modernizes the admin interface to improve the user experience.
📱 Enhances mobile customization features, improving responsive editing.

6. Spotify Partners with Universal Music to Launch AI Covers and Remixes: The "Downsizing" of Copyrighted Content Is Here

Spotify has partnered with Universal Music to launch AI cover and remix features, marking a major shift in music copyright. The features are built on licensed content, giving users a new way to create while protecting artists' interests through a revenue-sharing mechanism. The move strengthens Spotify's competitive position and poses a serious challenge to other AI music platforms.

AiBase Highlights:
🎧 Spotify and Universal Music have reached an agreement on AI covers and remixes, giving fans legal creation tools.
⚖️ Emphasizes three "golden" principles (informed consent, attribution, and fair compensation), setting it apart from AI platforms whose models rely on infringement.
📈 Spotify's stock price surged by 13% on the back of its AI strategy, demonstrating its strong influence in the music copyright field.

7. 400 Tokens/s Breaks Global Records! Zhipu Teams Up with TileRT to Launch GLM-5.1 High-Speed API

Zhipu launched the GLM-5.1 high-speed API, breaking the global record with an output speed of 400 tokens/s. It combines flagship-level model capabilities with ultra-low latency, a significant improvement for scenarios such as AI programming and real-time interaction.

AiBase Highlights:
🚀 The GLM-5.1 high-speed API outputs 400 tokens/s, breaking the global speed record for large model APIs.
🧠 The first domestic (Chinese) large model to combine flagship-level capabilities with ultra-low latency.
🔧 The TileRT high-performance inference engine uses system-level optimization to significantly reduce tail latency in high-concurrency scenarios.

8. No Rehearsal, Go Live for Real! Meituan LongCat-Video-Avatar 1.5 Open Sourced: Fully Outperforms Mainstream Closed-Source Models

The Meituan LongCat large model team has officially open-sourced LongCat-Video-Avatar 1.5, a commercial-grade digital human video generation model. This version significantly improves lip synchronization, physical plausibility, and long-video stability, and several technical upgrades make the model more practical for commercial use.

AiBase Highlights:
🧠 Upgrades the audio feature extraction encoder from Wav2Vec2 to Whisper-large, improving the model's ability to capture phonetic changes and pronunciation rhythm.
🔄 Introduces GRPO to align hand rendering and motion continuity, addressing issues such as hand distortion and discontinuous movement.
🚀 Adopts DMD, making inference 15 times more efficient and generating a 10-second video in about 1 minute.
More details: https://github.com/meituan-longcat/LongCat-Video
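
The speed figures above can be turned into a rough cost estimate with a few lines of Python. This is a back-of-the-envelope sketch only: the helper names are hypothetical, and the baseline estimate assumes the reported 15x gain applies to the same 10-second clip on the same hardware, which the announcement does not state.

```python
def compute_seconds_per_video_second(video_seconds: float, wall_clock_seconds: float) -> float:
    """Wall-clock seconds of inference spent per second of generated video."""
    if video_seconds <= 0:
        raise ValueError("video_seconds must be positive")
    return wall_clock_seconds / video_seconds


def implied_baseline_seconds(accelerated_seconds: float, speedup: float) -> float:
    """Estimated time before acceleration.

    Assumption: the speedup factor applies uniformly to the same workload.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return accelerated_seconds * speedup


if __name__ == "__main__":
    # Figures from the item above: a 10-second video in about 1 minute, 15x faster.
    per_second = compute_seconds_per_video_second(video_seconds=10, wall_clock_seconds=60)
    baseline = implied_baseline_seconds(accelerated_seconds=60, speedup=15)
    print(f"{per_second:.0f} s of compute per second of video")
    print(f"implied pre-DMD time: about {baseline / 60:.0f} minutes")
```

Under these assumptions, the model spends roughly 6 seconds of compute per second of video, and the same clip would have taken about 15 minutes before the speedup.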
