Google's flagship video generation model, Veo3, is now available to developers through the Gemini API, offering text-to-video generation with synchronized audio. The move marks a new stage in AI video production, but it comes at a high price.

Veo3 is Google's first model that can generate high-resolution video from a single text prompt together with synchronized dialogue, music, and sound effects. For now, the Gemini API supports only text-to-video, but Google has said that image-to-video, which is already available in the Gemini app, will follow soon.

For developers who want to integrate advanced video generation into their applications or build production-ready prototypes, the API gives direct access to the model. Google AI Studio offers SDK templates and starter apps to help developers get started quickly. Using the API requires an active Google Cloud project with billing enabled. According to Google, Veo3 has already been used millions of times across the Gemini app, Flow, and Vertex AI.

However, Veo3 is one of the more expensive options in AI video generation. Access through the Gemini API is available only on paid Google Cloud plans. A 720p, 24fps video in 16:9 format with audio is priced at $0.75 per second, which is 25 cents per second more than Veo2, which has no audio. At that rate, an eight-second video costs $6 and a five-minute video costs $225. Because several attempts are usually needed to get the desired result, the actual cost can rise quickly. For instance, if ten times as much footage has to be generated to end up with five usable minutes, the total would reach $2,250. Nevertheless, Google may believe that in certain use cases this is still more cost-effective than traditional video production. Google has also announced a faster, cheaper "Veo3Fast" mode, but it is not yet available in the API.
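
The pricing arithmetic above can be reproduced with a short Python sketch. The only input taken from the article is the $0.75-per-second rate; the function names and the `attempts_ratio` parameter are hypothetical helpers introduced for illustration, and the model assumes that every generated second is billed at the same flat rate.

```python
# Rough cost estimate for generated video, assuming a flat per-second price.
# The rate comes from the article; all names below are illustrative only.
PRICE_PER_SECOND_USD = 0.75  # 720p, 24fps, 16:9, with audio


def clip_cost(seconds: float, price_per_second: float = PRICE_PER_SECOND_USD) -> float:
    """Cost in USD of generating `seconds` of video once."""
    return seconds * price_per_second


def usable_footage_cost(
    usable_seconds: float,
    attempts_ratio: float = 1.0,
    price_per_second: float = PRICE_PER_SECOND_USD,
) -> float:
    """Cost in USD when `attempts_ratio` times the usable length must be generated."""
    return clip_cost(usable_seconds * attempts_ratio, price_per_second)


if __name__ == "__main__":
    print(clip_cost(8))                      # 6.0    (eight-second clip)
    print(clip_cost(5 * 60))                 # 225.0  (five minutes, one pass)
    print(usable_footage_cost(5 * 60, 10))   # 2250.0 (ten times the material)
```

Changing `attempts_ratio` shows how strongly the total depends on the number of retries: at a ratio of 3, the same five usable minutes would come to $675.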

So far, Veo3 is used mainly in professional settings. For example, Cartwheel uses Veo3 to convert 2D videos into realistic 3D character animations and map the generated motion onto the rigged models in customer projects. The game studio Volley