Google DeepMind has released a major update to Veo 3.1, its flagship AI video generation model, centered on a thorough overhaul of the "Ingredients to Video" feature (generating video from multiple reference images). The upgrade markedly improves the consistency of characters, objects, textures, and backgrounds, and adds native vertical output along with professional-grade 4K upscaling, moving AI video from a "toy-level demo" toward a practical production tool. The AIbase editorial team has compiled this report from the official blog and the latest updates, with commentary for creators and developers.

 Comprehensive Evolution of Ingredients to Video: Consistency and Expressiveness Both Exceed Expectations

The "Ingredients to Video" feature in Veo 3.1 lets users upload up to three reference images (character, background, texture/object) and combine them with a brief prompt to generate video. The latest update significantly strengthens visual consistency: a character's identity stays stable across different scenes, and objects, backgrounds, and materials can be reused seamlessly, avoiding common failures such as distorted faces, objects that change between shots, and drifting scenes. Even with minimal prompts, the model produces more expressive actions, more natural dialogue, and smoother storytelling, which greatly reduces the number of iterations needed.

This makes AI video better suited to narrative short films: users can keep a consistent main character across multiple scenes, which gives them more creative freedom and a more professional result.

 Native Vertical Output: Perfectly Adapts to the Short Video Era

Aimed at the mobile-first content ecosystem, Veo 3.1 now supports native 9:16 vertical (portrait) generation within "Ingredients to Video" for the first time. Output no longer needs to be cropped or stretched in post-production and can go directly to platforms such as YouTube Shorts, TikTok, and Instagram Reels. Because the video is generated in portrait orientation and fills the frame, it avoids the quality loss caused by horizontal-to-vertical conversion. The feature is rolling out gradually in the Gemini app, YouTube Shorts, and the YouTube Create app, which should be a considerable benefit to short-video creators.
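To make the cost of horizontal-to-vertical conversion concrete, here is a rough arithmetic sketch. It assumes a 1280x720 landscape source frame and a simple center crop; neither detail comes from the official announcement, and `center_crop_size` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction


def center_crop_size(width, height, target_ratio):
    """Return (w, h) of the largest crop with w/h == target_ratio that fits the frame."""
    if Fraction(width, height) > target_ratio:
        # Source is wider than the target: keep full height, trim the width.
        crop_h = height
        crop_w = int(height * target_ratio)
    else:
        # Source is as narrow as or narrower than the target: keep full width.
        crop_w = width
        crop_h = int(width / target_ratio)
    return crop_w, crop_h


# Assumed source: a 1280x720 landscape frame cropped to 9:16 portrait.
src_w, src_h = 1280, 720
w, h = center_crop_size(src_w, src_h, Fraction(9, 16))
retained = (w * h) / (src_w * src_h)
print(w, h, round(retained, 3))  # prints: 405 720 0.316
```

Under these assumptions, cropping keeps only about a third of the original pixels, which is the loss that generating directly in 9:16 avoids.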

 1080p + 4K Upscaling: Professional-Level Image Quality at a Glance

Resolution has also improved. The model generates at a base of 720p, and Google's upscaling technology can then output sharper, cleaner 1080p (suitable for editing and post-production) as well as a new 4K option (suitable for large-screen playback and high-fidelity production). 4K upscaling is currently available in Flow, the Gemini API, and Vertex AI, where it targets professional and enterprise users and supports a "high-fidelity production workflow."
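As a rough illustration of how much work the upscaler has to do, the sketch below compares the 720p base with the two output tiers. The frame sizes are the conventional 16:9 dimensions for each label and are an assumption here, since the announcement does not give exact pixel dimensions; `upscale_factors` is a hypothetical helper.

```python
# Assumed landscape 16:9 frame sizes (width, height) for each label.
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def upscale_factors(base, target):
    """Return (per-side scale factor, pixel-count ratio) from base to target."""
    bw, bh = RESOLUTIONS[base]
    tw, th = RESOLUTIONS[target]
    linear = th / bh
    pixels = (tw * th) / (bw * bh)
    return linear, pixels


for target in ("1080p", "4K"):
    linear, pixels = upscale_factors("720p", target)
    print(f"720p -> {target}: {linear:g}x per side, {pixels:g}x pixels")
# prints:
# 720p -> 1080p: 1.5x per side, 2.25x pixels
# 720p -> 4K: 3x per side, 9x pixels
```

With these assumed sizes, a 4K frame contains nine times as many pixels as the 720p base, so most of the final detail comes from the upscaling step rather than from the initial generation.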

AIbase Comment: Veo 3.1's latest update targets two core pain points of AI video: consistency and adaptability. Stable characters and objects, a native vertical format, and 4K output move the tool from the experimental stage toward commercial viability. Currently, Ingredients to Video is supported in Flow with the Veo 3.1-Fast model for quick generation, and ordinary users can try it right away through the Gemini app (Plus/Pro/Ultra subscription).