References: Google DeepMind Blog, PCMag Middle East AI
//
Google has integrated its Veo 2 video-generating AI model into Gemini Advanced, offering subscribers the ability to create short video clips directly from text prompts. This new feature allows users to generate eight-second, 720p videos in a 16:9 aspect ratio, suitable for sharing on platforms like TikTok and YouTube. These videos can be downloaded as MP4 files and include Google's SynthID watermark, helping to identify them as AI-generated content. The availability is currently limited to Google One AI Premium subscribers and does not extend to Google Workspace business or educational plans.
This enhancement positions Google to compete with other AI video generation platforms, such as OpenAI's Sora. The company says Veo 2 delivers "detailed videos with cinematic realism" thanks to a better model of real-world physics and human motion. Users can input a wide range of prompts, from realistic nature scenes to stylized or surreal sequences, and more detailed prompts generally yield better results. Although clips are currently free to produce for Advanced subscribers, Google has indicated that a monthly video generation limit will apply.
Beyond Gemini Advanced, Veo 2 is also being added to Whisk, an experimental tool in Google Labs, as a feature called "Whisk Animate," which transforms uploaded images into animated video clips using the same model. As with Gemini, the output is capped at eight seconds and is accessible only to Premium subscribers in the US. Together, these integrations reflect Google's continued investment in generative AI tools for its users.
References: Google DeepMind Blog
//
Google is expanding its AI video generation capabilities by integrating Veo 2, its most advanced generative video model, into the Gemini app and the experimental Whisk platform. This new functionality allows users to create short, high-resolution videos directly from text prompts, opening up new avenues for creative expression and content creation. Veo 2 is designed to produce realistic motion, natural physics, and visually rich scenes, making it a powerful tool for generating cinematic-quality content.
Currently, access to Veo 2 is primarily available to Google One AI Premium subscribers, who can generate eight-second, 720p videos in MP4 format within Gemini Advanced. The Whisk platform also incorporates Veo 2 through its "Whisk Animate" feature, enabling users to transform uploaded images into animated video clips. Google emphasizes that more detailed and descriptive text prompts generally yield better results, allowing users to fine-tune their creations and explore a wide range of styles, from realistic nature scenes to stylized and surreal sequences.
To support responsible AI development, Google is implementing several safeguards. All AI-generated videos created with Veo 2 carry an invisible watermark embedded using SynthID technology, helping to identify them as AI-generated, and Google is employing red-teaming and review processes to prevent the creation of content that violates its policies. These video generation features are rolling out globally and support all languages currently available in Gemini, although standard Gemini users do not have access at this time.
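The Gemini app flow described above is point-and-click, but Veo 2 can also be driven programmatically. Below is a minimal sketch of what a prompt-to-video call might look like with Google's google-genai Python SDK; the model identifier, config fields, and API-key auth are assumptions based on the SDK's documented long-running-operation pattern, not the consumer Gemini feature itself.

```python
# Hypothetical sketch: generating a Veo 2 clip via Google's google-genai
# Python SDK. Model name, config fields, and auth are assumptions.
import time

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # assumed auth method

operation = client.models.generate_videos(
    model="veo-2.0-generate-001",  # assumed Veo 2 model identifier
    prompt=(
        "A red fox trotting through fresh snow at golden hour, "
        "low camera angle, shallow depth of field, cinematic lighting"
    ),
    config=types.GenerateVideosConfig(
        aspect_ratio="16:9",   # matches the 16:9 output described above
        number_of_videos=1,
    ),
)

# Video generation runs as a long-running operation: poll until done.
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

# Save the finished clip as an MP4, as in the Gemini integration.
video = operation.response.generated_videos[0]
client.files.download(file=video.video)
video.video.save("fox.mp4")
```

As in the consumer product, more descriptive prompts (subject, motion, camera, lighting) tend to produce better clips than terse ones.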
References: the-decoder.com
//
Nvidia is making significant advances in artificial intelligence, with new results in both video generation and large language models. A new method developed by Nvidia in collaboration with Stanford University, UCSD, UC Berkeley, and UT Austin allows for the creation of AI-generated videos up to one minute long. This addresses a key limitation of earlier systems: OpenAI's Sora, Meta's MovieGen, and Google's Veo 2 are capped at 20, 16, and 8 seconds respectively.
The key innovation is the introduction of Test-Time Training (TTT) layers, which are integrated into a pre-trained Transformer architecture. These layers replace the simple hidden states of conventional recurrent neural networks (RNNs) with small neural networks that continue to learn during the video generation process itself (a minimal sketch of the idea follows this entry). This lets the system maintain consistency across longer sequences, keeping elements like characters and environments stable throughout the video. The method has been demonstrated with an AI-generated "Tom and Jerry" cartoon.
Separately, Nvidia has unveiled its Llama-3.1 Nemotron Ultra large language model (LLM), which outperforms DeepSeek R1 despite having less than half as many parameters. Llama-3.1-Nemotron-Ultra-253B is a 253-billion-parameter model designed for advanced reasoning, instruction following, and AI assistant workflows; its architecture includes innovations such as skipped attention layers, fused feedforward networks, and variable FFN compression ratios. The model is publicly available on Hugging Face, reflecting Nvidia's commitment to open model releases.
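To make the TTT idea above concrete, here is a minimal sketch of a "TTT-Linear"-style layer: the recurrent hidden state is itself a tiny linear model W that takes one gradient step on a self-supervised reconstruction loss as each token is processed. The projection names, dimensions, inner learning rate, and zero initialization are illustrative assumptions; the published layers add details (learnable initialization, mini-batch inner updates, gating) omitted here.

```python
# Minimal sketch of a Test-Time Training (TTT) layer in PyTorch.
# The hidden state W is a small linear model updated by gradient
# descent while the sequence is being generated.
import torch


class TTTLinear(torch.nn.Module):
    def __init__(self, dim: int, lr: float = 0.1):
        super().__init__()
        self.theta_k = torch.nn.Linear(dim, dim, bias=False)  # input view
        self.theta_v = torch.nn.Linear(dim, dim, bias=False)  # target view
        self.theta_q = torch.nn.Linear(dim, dim, bias=False)  # readout view
        self.lr = lr
        self.dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (seq_len, dim). W is the fast, per-sequence hidden state.
        W = torch.zeros(self.dim, self.dim, device=x.device)
        outputs = []
        for x_t in x:
            k, v, q = self.theta_k(x_t), self.theta_v(x_t), self.theta_q(x_t)
            # Inner self-supervised loss: reconstruct v from k through W.
            err = k @ W - v
            # One gradient step per token -- the "training" in TTT:
            # grad of 0.5 * ||k @ W - v||^2 w.r.t. W is outer(k, err).
            W = W - self.lr * torch.outer(k, err)
            outputs.append(q @ W)  # read out through the updated state
        return torch.stack(outputs)


layer = TTTLinear(dim=64)
tokens = torch.randn(128, 64)  # stand-in for video token embeddings
out = layer(tokens)            # (128, 64)
```

Because the state is a model that keeps fitting the sequence rather than a fixed-size vector, it can carry information across far more tokens, which is what enables the minute-long clips described above.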
References: Kara Sherrer@eWEEK
//
Runway AI Inc. has launched Gen-4, its latest AI video generation model, addressing the significant challenge of maintaining consistent characters and objects across different scenes. This new model represents a considerable advancement in AI video technology and improves the realism and usability of AI-generated videos. Gen-4 allows users to upload a reference image of an object to be included in a video, along with design instructions, and ensures that the object maintains a consistent look throughout the entire clip.
The Gen-4 model lets users place any object or subject in different locations while maintaining consistency, and even allows modifications such as changes to camera angle or lighting conditions. The model combines visual references with text instructions to preserve styles throughout videos. Gen-4 is currently available to paying subscribers and Enterprise customers, with additional features planned for future updates.
References: Kara Sherrer@eWEEK
//
Runway AI Inc. has launched Gen-4, a new AI model for video generation designed to address a significant limitation in AI video creation: character consistency across scenes. The New York-based startup, backed by investments from tech giants such as Nvidia and Google, aims to transform film production with this new system, which introduces character and scene consistency across multiple shots. This capability has been elusive for most AI video generators until now, potentially opening new avenues for Hollywood and other creative industries.
Gen-4 allows users to upload a reference image of an object or character and then generate videos where that element retains a consistent look throughout the entire clip. The model combines visual references with text instructions to preserve styles throughout videos, even as details like camera angle or lighting change. Initially, users can generate five- and ten-second clips, but Runway's demo videos hint at future updates that could allow for longer, more complex content. The technology could also function as an image editing tool, letting users combine illustrations and generate multiple variations to streamline the revision process.
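For developers, the reference-image workflow is also exposed through Runway's API. The sketch below assumes the runwayml Python SDK, a "gen4_turbo" model id, and these parameter names and status values; the product UI requires no code, and the exact details may differ.

```python
# Hypothetical sketch: Gen-4 image-to-video via Runway's API.
# Model id, parameters, and polling details are assumptions.
import time

from runwayml import RunwayML

client = RunwayML(api_key="YOUR_RUNWAY_KEY")  # assumed auth

task = client.image_to_video.create(
    model="gen4_turbo",  # assumed Gen-4 model id
    prompt_image="https://example.com/hero-character.png",  # reference image
    prompt_text=(
        "The character walks through a rain-soaked neon alley, "
        "handheld camera, consistent face and outfit"
    ),
    ratio="1280:720",  # assumed supported output ratio
    duration=5,        # five- or ten-second clips, per the launch
)

# Generation is asynchronous: poll the task until it finishes.
while True:
    result = client.tasks.retrieve(task.id)
    if result.status in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(5)

if result.status == "SUCCEEDED":
    print(result.output[0])  # URL of the generated MP4
```

The reference image plus text instructions is what carries the consistency: the same image can be resubmitted with different prompts to place the character in new scenes while keeping its look stable.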