AI Daily: Alibaba Qwen APP Enters Beta; Veo 3.1 Adds Multi-Image References; Super Xiaoai Launches AI Photo Editing

Welcome to the "AI Daily" column, your daily guide to the world of artificial intelligence. Each day we round up the latest developments in the AI field with a focus on developers, helping you follow technology trends and discover innovative AI products.

Click to explore fresh AI products: https://app.aibase.com/zh

1. Alibaba Qwen APP Enters Beta Testing to Compete with ChatGPT

Alibaba has introduced the Qwen APP, built on the Qwen3 model and positioned as a comprehensive competitor to ChatGPT. The beta version is now available on major app stores, and Alibaba plans to launch an international version. In addition, Qwen3-Max is reported to outperform international models such as GPT-5, ranking among the top three globally.

【AiBase Summary:】
🧠 Alibaba launched the Qwen APP, built on the Qwen3 model, to compete with ChatGPT.
🚀 The beta version of the Qwen APP is now available, and an international version is planned to target overseas markets.
📈 Qwen3-Max is reported to outperform international models such as GPT-5, ranking among the top three globally.

2. Gemini Veo 3.1 Launches Multi-Image Reference, Synthesizing Three Elements into a Video

Google has rolled out the Veo 3.1 video model to Gemini Pro and Ultra subscribers, adding an "Ingredients to Video" mode. Users can upload three reference images at once; the model extracts character, scene, and style features from them separately and fuses them into an 8-second 1080p video. The feature improves the diversity and quality of generated video while maintaining character consistency and lighting continuity.

【AiBase Summary:】
🌟 New "Ingredients to Video" mode supports generating videos from three reference images.
🎨 Video content comes with a built-in SynthID invisible watermark for enhanced copyright protection.
🔊 Synchronous output of native ambient sound enhances video immersion.

3. Xiaomi's Super Xiaoai AI Large Model "Free Editing" Launches: One Sentence Produces Great Photos

Xiaomi released version v7.8.50 of Super Xiaoai, adding the 'Free Editing' feature. Users can give natural language instructions that invoke the photo album's AI model to edit photos automatically. The feature supports system-wide multimodal interaction: it can recognize on-screen content and camera images and carry out complex chains of operations.

【AiBase Summary:】
📱 Super Xiaoai added the 'Free Editing' feature, allowing users to edit photos automatically with natural language instructions.
📷 Supports system-wide multimodal interaction, recognizing on-screen content and camera images and carrying out complex chains of operations.
🖼️ The feature is based on a 7B multimodal large model that can run inference locally; output is watermarked by default, and a backup of the original image is retained.

4. Xiaomi Open-Sources 7B Multimodal Model MiMo-VL, Launches AI Assistant Miloco

Xiaomi released the 7B-parameter multimodal model 'Xiaomi-MiMo-VL-Miloco-7B-GGUF' on Hugging Face and GitHub and, at the same time, launched the intelligent assistant 'Xiaomi Miloco' built on this model. Miloco can recognize user activities and gestures through Mi Home cameras and automatically trigger linked smart home devices, and it is compatible with the Home Assistant protocol. The model is released under a non-commercial open-source license, and users can deploy it in one click on Windows or Linux hosts that have an NVIDIA GPU and Docker.

【AiBase Summary:】
🚀 Xiaomi released the 7B-parameter multimodal model 'Xiaomi-MiMo-VL-Miloco-7B-GGUF'.
💡 The intelligent assistant 'Xiaomi Miloco' can recognize user activities and gestures and trigger linked smart home devices.
🔒 The model uses a non-commercial open-source license and can be deployed on hosts with an NVIDIA GPU and Docker.

5. Google Flow Integrates Nano Banana Model, One-Click Image Cutout for Direct Video Material Output

Google added an image editing module to its AI filmmaking tool Flow, deeply integrating the Gemini 2.5 Flash image model (code name Nano Banana). The module supports one-click background removal, subject separation, and scene replacement through natural language, and edited images can be dragged directly into the timeline to generate 8-second dynamic shots. The feature is available to users on the Gemini free tier and above, is priced at $0.039 per image, and is also live on enterprise-grade Vertex AI.

【AiBase Summary:】
🔥 Integrated Gemini 2.5 Flash image model for natural language-controlled image editing.
💡 Supports one-click background removal, subject separation, and scene replacement, improving video production efficiency.
🌐 Provides API batch interfaces, targeting high-output scenarios such as short videos and e-commerce posters.
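
For readers sizing up the batch scenarios mentioned above, the per-image price can be turned into a rough budget estimate. The following is a minimal sketch, assuming the quoted $0.039 applies as a flat rate to every image with no free quota or volume discount (the item does not say either way). The function name estimate_batch_cost is hypothetical and is not part of any Google API.

```python
from decimal import Decimal

# Per-image price quoted in the item above (USD).
# Assumption: a flat rate with no free quota or volume discount.
PRICE_PER_IMAGE_USD = Decimal("0.039")


def estimate_batch_cost(image_count: int) -> Decimal:
    """Return the estimated USD cost of editing `image_count` images.

    Hypothetical helper for illustration only.
    """
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return PRICE_PER_IMAGE_USD * image_count


if __name__ == "__main__":
    # Print rough budgets for a few batch sizes.
    for count in (100, 1_000, 10_000):
        print(f"{count:>6} images: ${estimate_batch_cost(count):.2f}")
```

Decimal is used instead of float so that the currency arithmetic stays exact at large batch sizes.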

6. Next-Generation Multimodal AI DeepEyesV2: Smart Tools Help Surpass Larger Models

DeepEyesV2 is a multimodal AI model from a research team that can analyze images, execute code, and perform web searches. By using external tools intelligently, it performs strongly across multiple tasks and in some cases even surpasses larger models.

【AiBase Summary:】
🌟 DeepEyesV2 uses tools intelligently to improve performance on multimodal tasks, in some cases surpassing larger models.
🔧 Uses a two-stage training process combining image understanding and tool usage.
📈 Performs well on multiple benchmarks, demonstrating the potential of smaller models.
Details Link: https://arxiv.org/abs/2511.05271

7. NotebookLM Adds Image Import, Turning Blackboard Notes into a Searchable Knowledge Base

Google introduced a new feature for NotebookLM that lets users upload blackboard notes, textbook scans, or photos of tables taken on the street, and retrieve their content in natural language through OCR and semantic parsing. The feature is free on all platforms, and a local processing option will be added in the future to protect sensitive data.

【AiBase Summary:】
📷 Supports image data sources, improving note management efficiency.
🧠 Multimodal model recognizes handwritten and printed content, extracts table structures.
🔍 Enables natural language retrieval of image content, enhancing information access capabilities.

8. JetBrains Launches AI Coding Agent Benchmark Testing Platform DPAI Arena

JetBrains launched DPAI Arena, described as the first open, multilingual, multi-framework, and multi-workflow benchmarking platform for AI coding agents. The platform aims to evaluate how efficiently AI tools perform in software development and supports multiple programming languages and workflows, enabling fair and repeatable comparison of AI tool performance.

【AiBase Summary:】
🌟 DPAI Arena is the industry's first open AI coding agent benchmark testing platform, aimed at evaluating the efficiency of AI tools in software development.
🛠️ The platform supports multiple programming languages and workflows, enabling fair and repeatable comparison of AI tool performance.
🤝 JetBrains plans to hand over the project to the Linux Foundation to promote broader technical guidance and future development.
Details Link: https://dpaia.dev/
