Google Veo 3 Introduces Native Audio for Video

S.Dyema Zandria@The Tech Basic

Google is pushing the boundaries of AI video generation with Veo 3, a video model that features native audio generation. Unveiled at Google I/O 2025, Veo 3 is billed as the first model of its kind to produce fully synchronized audio directly within the video output. That audio includes realistic dialogue, environmental background noise, and music, which makes the generated videos more immersive. Google has also launched Flow, an AI filmmaking interface.

In testing, Veo 3 produced videos of realistic people with sound and music. The model generates eight-second clips at 720p resolution with matching sound effects and spoken words. To create a video, users provide a text description or a still image, which Veo 3 turns into moving pictures. The model uses a diffusion method, drawing on what it learned from a vast dataset of real videos during training, to generate scenes. A language model helps ensure that the video reflects the provided prompt, while an audio model adds sound effects and dialogue.

Google is making Veo 3 available to its AI Ultra subscribers through the Gemini app and the Flow platform. Enterprise users can also access Veo 3 on Vertex AI. Veo 3 initially launched for US users of AI Ultra, at 12,500 credits per month for $250, and Google quickly expanded availability to 71 more countries outside the EU. The expansion underscores Google's commitment to pushing the limits of AI-generated content.


References:

pub.towardsai.net: TAI #154: Gemini Deep Think, Veo 3’s Audio Breakthrough, & Claude 4’s Blackmail Drama
Ars OpenForum: Google's Veo 3 delivers AI videos of realistic people with sound and music. We put it to the test.
hothardware.com: Google I/O was about a week ago, and if you haven't heard, one of Google's biggest announcements was the company's Veo 3 generative AI model for video. Gone are the days of creepy, low-quality clips that vaguely look like Will Smith eating spaghetti and don't traverse the uncanny valley very well. Veo 3 is more than capable of generating that
The Tech Basic: Google Veo 3 is a new tool that makes eight-second video clips at 720p resolution with matching sound effects and spoken words. It takes a text description or a still image and turns it into moving pictures. It uses a method called diffusion to learn from real videos that it saw during training.

Classification:

HashTags: #Veo3 #AIVideoGeneration #GoogleAI
Company: Google
Target: Content creators
Product: Veo 3
Feature: Audio Generation
Type: AI
Severity: Informative

News from the AI & ML world

DeeperML
