On December 16, Alibaba announced the launch of its new-generation Wanxiang 2.6 series of models, which has been comprehensively upgraded for professional film production and image creation and is billed as **"the most feature-rich video generation model in the world"**. **Wanxiang 2.6 has launched simultaneously on Alibaba Cloud BaiLian and the official Wanxiang website.**

The biggest highlight of the Wanxiang 2.6 series is that it is the first video model in China to support a "role-playing" function. It also supports audio-visual synchronization, multi-shot generation, and voice-driven generation.

Core Upgrades and Technological Breakthroughs

This upgrade further improves image quality, sound effects, and instruction following, and extends the maximum length of a single video to 15 seconds, the longest in China. The model family now supports more than 10 visual creation capabilities, including text-to-image, image editing, text-to-video, image-to-video, voice-to-video, action generation, role-playing, and general video editing.

1. Role-Playing Function (First in China):

Wanxiang 2.6 can reference the appearance and voice of characters in an input video and, following the prompt, generate videos featuring a single character, multiple characters, or characters interacting with objects. In terms of model structure, Tongyi Wanxiang integrates multiple innovative technologies to perform multimodal joint modeling and learning on the reference video. It extracts the subject's emotional state, posture, and visual features, along with acoustic features such as voice and speaking speed, so that consistency is maintained and transferred across all sensory dimensions.

2. Professional-Level Shot Control:

The model now includes a shot control function, which can convert simple user prompts into multi-shot scripts, generating coherent narrative videos with multiple shots. Through high-level semantic understanding, Wanxiang 2.6 can build professional-level multi-shot segments with complete storylines and narrative tension, while maintaining high consistency of the core subject, scene layout, and environmental atmosphere during smooth transitions between shots.

Empowering Film-Level Creation Scenarios

The role-playing and shot control functions of Wanxiang 2.6 go a long way toward meeting the needs of professional, film-level production.

For example, an ordinary user can upload a personal video and enter a prompt in a science-fiction mystery style. Within minutes, Wanxiang 2.6 completes the shot design, character performance, and voice dubbing, generating a short film with complete narrative shots and cinematic camera movements and helping users realize their dream of being a movie star.

For professional scenarios such as advertising design and short drama production, users can enter a series of prompts and have the model generate a complete narrative short film, allowing everyone to become a director.

Maintaining Domestic Leadership

Alibaba released Wanxiang 2.5, an audio-visual synchronized video generation model, in September this year. On LMArena, an authoritative large-model evaluation benchmark, Wanxiang's image-to-video capability ranked first in China. The release of version 2.6 further solidifies its leading position in the domestic video generation field.

Starting today, anyone can try Wanxiang 2.6 directly on the official Wanxiang website, and enterprise users can call the API through Alibaba Cloud BaiLian. It is reported that Qianwen APP