News from the AI & ML world

DeeperML - #aivideo

Michael Nuñez@AI News | VentureBeat //
Lightricks has unveiled LTXV-13B, an AI video model that dramatically accelerates video generation on consumer-grade hardware. The model uses a "multiscale rendering" technique to achieve speeds up to 30 times faster than competing models, without the need for expensive, high-end GPUs. LTXV-13B represents a significant step toward democratizing AI video creation, making it accessible to a wider range of users, who can now produce professional-quality AI videos on standard desktop computers and high-end laptops.

The key to LTXV-13B's efficiency lies in its multiscale rendering approach, which generates video in progressive layers of detail. This technique addresses the high computational requirements that have historically confined AI video generation to cloud-based systems running on enterprise-grade GPUs with substantial video memory. According to Zeev Farbman, co-founder and CEO of Lightricks, the new model operates effectively within the memory constraints of consumer GPUs like the Nvidia RTX 3090, 4090, and 5090, including their laptop versions.
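The coarse-to-fine idea can be sketched in a few lines. The following is a toy illustration of progressive-detail generation, not Lightricks' actual pipeline; the `refine` function stands in for a real denoising network, and most of the compute stays at low resolution:

```python
import numpy as np

def refine(frame: np.ndarray) -> np.ndarray:
    """Stand-in for a detail-adding pass (hypothetical; a real model
    would run a denoising network here)."""
    return frame + 0.1 * np.random.randn(*frame.shape)

def upsample(frame: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x spatial upsampling."""
    return frame.repeat(2, axis=0).repeat(2, axis=1)

def multiscale_generate(base: int = 8, levels: int = 3) -> np.ndarray:
    """Coarse-to-fine generation: start with a tiny frame, then
    alternately upsample and refine. The expensive full-resolution
    passes only happen at the final level."""
    frame = np.random.randn(base, base)
    for _ in range(levels):
        frame = refine(upsample(frame))
    return frame

print(multiscale_generate().shape)  # (64, 64)
```

Because each level roughly quadruples the pixel count, generating most of the structure at the coarsest scales is what keeps the approach within consumer-GPU memory budgets.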

Alongside this development, scientists from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and Adobe Research have introduced CausVid, a generative AI tool capable of creating smooth, high-resolution videos rapidly. CausVid employs a hybrid approach, using a diffusion model to train an autoregressive system to swiftly predict the next frame while maintaining high quality and consistency. This allows users to generate clips from simple text prompts, turn photos into moving scenes, extend existing videos, and even alter creations mid-generation, opening up new possibilities for fast, interactive content creation.
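The hybrid scheme described above, where a slow many-step diffusion model teaches a fast one-step frame predictor, can be sketched with scalar stand-ins. Every function here is hypothetical and illustrative, not the actual CausVid models:

```python
# Toy distillation sketch: a slow "diffusion" teacher produces target
# next-frames, and a one-step autoregressive student learns to match it.
import random

random.seed(0)

def diffusion_teacher(prev: float, steps: int = 50) -> float:
    """Slow teacher: start from noise and take many small denoising
    steps toward the next frame (here, simply prev + 1.0)."""
    target = prev + 1.0
    x = random.gauss(0, 1)
    for _ in range(steps):
        x += 0.2 * (target - x)   # one small denoising step
    return x

def distill_student(iterations: int = 1000, lr: float = 0.1) -> float:
    """Fit a one-step student f(prev) = prev + b to the teacher's
    outputs via stochastic gradient descent on squared error."""
    b = 0.0
    for _ in range(iterations):
        prev = random.uniform(0.0, 10.0)
        target = diffusion_teacher(prev)
        b -= lr * ((prev + b) - target)
    return b

b = distill_student()
print(round(b, 2))  # the student recovers the teacher's +1.0 offset in one step
```

The payoff mirrors the article's claim: after training, the student predicts the next frame in a single call instead of fifty, which is what makes interactive, mid-generation edits feasible.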

Recommended read:
References :
  • news.mit.edu: The CausVid generative AI tool uses a diffusion model to teach an autoregressive (frame-by-frame) system to rapidly produce stable, high-resolution videos.
  • AI News | VentureBeat: Lightricks unveils groundbreaking LTXV-13B AI video model that runs 30X faster than competitors on consumer hardware through innovative "multiscale rendering" technology.

@Google DeepMind Blog //
Google is expanding its AI video generation capabilities by integrating Veo 2, its most advanced generative video model, into the Gemini app and the experimental Whisk platform. This new functionality allows users to create short, high-resolution videos directly from text prompts, opening up new avenues for creative expression and content creation. Veo 2 is designed to produce realistic motion, natural physics, and visually rich scenes, making it a powerful tool for generating cinematic-quality content.

Currently, access to Veo 2 is primarily available to Google One AI Premium subscribers, who can generate eight-second, 720p videos in MP4 format within Gemini Advanced. The Whisk platform also incorporates Veo 2 through its "Whisk Animate" feature, enabling users to transform uploaded images into animated video clips. Google emphasizes that more detailed and descriptive text prompts generally yield better results, allowing users to fine-tune their creations and explore a wide range of styles, from realistic nature scenes to stylized and surreal sequences.

To ensure responsible AI development, Google is implementing several safeguards. All AI-generated videos created with Veo 2 will feature an invisible watermark embedded using SynthID technology, helping to identify them as AI-generated. Additionally, Google is employing red-teaming and review processes to prevent the creation of content that violates its policies. These new video generation features are being rolled out globally and support all languages currently available in Gemini, although standard Gemini users do not have access at this time.

Recommended read:
References :
  • The Official Google Blog: Video showcasing how you can generate videos in Gemini
  • chromeunboxed.com: Google has announced a significant upgrade to its AI video generation capabilities, integrating the powerful Veo 2 model into both Gemini Advanced and Whisk.
  • Google DeepMind Blog: Transform text-based prompts into high-resolution eight-second videos in Gemini Advanced and use Whisk Animate to turn images into eight-second animated clips.
  • www.tomsguide.com: I just tried Google's newest AI video generation features — and I'm blown away
  • PCMag Middle East ai: Google's Gemini Advanced now offers free 8-second video clip generation with Veo 2, and image-to-video animation with Whisk Animate.
  • www.analyticsvidhya.com: Google's new Veo 2 model lets you create cinematic-quality videos from detailed text prompts.
  • THE DECODER: Google adds AI video generation to Gemini app and Whisk experiment
  • LearnAI: Try generating video in Gemini, powered by Veo 2
  • TestingCatalog: Gemini Advanced subscribers can now generate videos with Veo 2
  • Analytics Vidhya: Designed to turn detailed text prompts into cinematic-quality videos, Google Veo 2 creates lifelike motion, natural physics, and visually rich scenes across a range of styles. Currently, Google Veo 2 is available only to users in the United States, aged 18 and […]
  • Analytics India Magazine: Google Rolls Out Video AI Model for Gemini Users, Developers
  • shellypalmer.com: Google’s Veo is Almost Here
  • eWEEK: Google’s AI Video Generator Veo 2 Delivers Cinematic Results
  • the-decoder.com: Google is rolling out new AI-powered video generation features in its Gemini app and the experimental tool Whisk.
  • Glenn Gabe: Smart move by Google. They are offering Google One AI Premium for free to college students through the spring of 2026. It gives you access to 2 TB of storage and AI models like Gemini 2.5 Pro and Veo 2, via these products: Gemini Advanced (including Deep Research, Gemini Live, Canvas, and video generation with Veo 2); NotebookLM Plus (including five times more Audio Overviews, notebooks, and more); and Gemini in Google Docs, Sheets, and Slides.
  • aigptjournal.com: Google Veo 2: The Future of Effortless AI Video Creation for Everyone
  • Data Phoenix: Google introduces Veo 2 for video generation in Gemini and Whisk

Kara Sherrer@eWEEK //
Runway AI Inc. has launched Gen-4, its latest AI video generation model, addressing the significant challenge of maintaining consistent characters and objects across different scenes. This new model represents a considerable advancement in AI video technology and improves the realism and usability of AI-generated videos. Gen-4 allows users to upload a reference image of an object to be included in a video, along with design instructions, and ensures that the object maintains a consistent look throughout the entire clip.

The Gen-4 model empowers users to place any object or subject in different locations while maintaining consistency, and even allows for modifications such as changing camera angles or lighting conditions. The model combines visual references with text instructions to preserve styles throughout videos. Gen-4 is currently available to paying subscribers and Enterprise customers, with additional features planned for future updates.

Recommended read:
References :
  • Analytics India Magazine: Runway introduces its Next-Gen Image-to-Video Generation AI Model
  • SiliconANGLE: Runway launches new Gen-4 AI video generator
  • THE DECODER: Runway releases Gen-4 video model with focus on consistency
  • venturebeat.com: Runway Gen-4 solves AI video’s biggest problem: character consistency across scenes
  • www.producthunt.com: Product Hunt page for Runway Gen-4.
  • eWEEK: The Gen-4 model aims to solve several problems with AI video generation including inconsistent characters and objects.
  • iThinkDifferent: Runway has released Gen-4, its latest AI model for video generation. The company says the system addresses one of the biggest challenges in AI video generation: maintaining consistent characters and objects throughout scenes.
  • Charlie Fink: Runway’s Gen-4 release overshadows OpenAI’s image upgrade as Higgsfield, Udio, Prodia, and Pika debut powerful new AI tools for video, music, and image generation.

Kara Sherrer@eWEEK //
Runway AI Inc. has launched Gen-4, a new AI model for video generation designed to address a significant limitation in AI video creation: character consistency across scenes. The New York-based startup, backed by investments from tech giants such as Nvidia and Google, aims to transform film production with this new system, which introduces character and scene consistency across multiple shots. This capability has been elusive for most AI video generators until now, potentially opening new avenues for Hollywood and other creative industries.

Gen-4 allows users to upload a reference image of an object or character and then generate videos where that element retains a consistent look throughout the entire clip. The model combines visual references with text instructions to preserve styles throughout videos, even as details like camera angle or lighting conditions change. Initially, users can generate five- and ten-second clips, but Runway's demo videos hint at future updates that could allow for more complex, longer-form content creation. This technology could also function as an image editing tool, allowing users to combine illustrations and generate multiple variations to streamline the revision process.

Recommended read:
References :
  • Analytics India Magazine: Runway Introduces its Next-Gen Image-to-Video Generation AI Model
  • SiliconANGLE: Runway launches new Gen-4 AI video generator
  • THE DECODER: Runway releases Gen-4 video model with focus on consistency
  • venturebeat.com: Runway's new Gen-4 AI creates consistent characters across entire videos from a single reference image, challenging OpenAI's viral Ghibli trend and potentially transforming how Hollywood makes films.
  • eWEEK: AI Gets Cinematic: Runway’s Gen-4 Brings Film-Quality Consistency to Video Generation
  • Charlie Fink: Runway’s Gen-4 release overshadows OpenAI’s image upgrade as Higgsfield, Udio, Prodia, and Pika debut powerful new AI tools for video, music, and image generation.

Ashutosh Singh@The Tech Portal //
Elon Musk's xAI has acquired Hotshot, a startup specializing in AI-powered video generation. Hotshot, founded by Aakash Sastry and John Mullan, has developed three video foundation models: Hotshot-XL, Hotshot Act One, and Hotshot. The move signals xAI's intention to enter the AI video generation market, potentially competing with OpenAI's Sora and Google's Veo 2.

The acquisition will see Hotshot's models scaled on xAI's supercomputer, Colossus, which utilizes a vast number of Nvidia chips. Hotshot trained its models on 600 million video clips, employing techniques like neural networks for automatic captioning and the bfloat16 data format to accelerate AI training. The company discontinued new video creation on March 14, 2025, and allowed existing users to download their content until March 30.

Recommended read:
References :
  • SiliconANGLE: XAI acquires AI video generation startup Hotshot
  • THE DECODER: Elon Musk's AI company xAI buys AI video generation startup Hotshot
  • The Tech Portal: Musk’s xAI acquires gen-AI video startup ‘Hotshot’ to compete with OpenAI’s Sora and Google’s Veo 2
  • Maginative: xAI Buys Hotshot, a Startup Working on AI-Generated Video

Emily Forlini@PCMag Middle East ai //
Google DeepMind has announced the pricing for its Veo 2 AI video generation model, making it available through its cloud API platform. The cost is set at $0.50 per second, which translates to $30 per minute or $1,800 per hour. While this may seem expensive, Google DeepMind researcher Jon Barron compared it to the cost of traditional filmmaking, noting that the blockbuster "Avengers: Endgame" cost around $32,000 per second to produce.

Veo 2 aims to create videos with realistic motion and high-quality output, up to 4K resolution, from simple text prompts. It is not the cheapest option (OpenAI's Sora, by comparison, is bundled into a $200-per-month subscription), but Google is targeting filmmakers and studios, who typically have bigger budgets than film hobbyists and would run Veo through Vertex AI, Google's platform for training and deploying advanced AI models. "Veo 2 understands the unique language of cinematography: ask it for a genre, specify a lens, suggest cinematic effects and Veo 2 will deliver," Google says.
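As a quick sanity check on the quoted figures, assuming the flat $0.50-per-second rate applies linearly, the conversions work out as follows:

```python
# Veo 2 API pricing arithmetic, assuming a flat $0.50 per second of
# generated video (the rate reported above).
PRICE_PER_SECOND = 0.50  # USD

def veo2_cost(duration_seconds: float) -> float:
    """Estimated API cost in USD for a clip of the given length."""
    return PRICE_PER_SECOND * duration_seconds

print(veo2_cost(60))    # one minute        -> 30.0
print(veo2_cost(3600))  # one hour          -> 1800.0
print(veo2_cost(8))     # eight-second clip -> 4.0
```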

Recommended read:
References :
  • Shelly Palmer: Shelly Palmer discusses Google’s Veo 2, an AI video generator priced at 50 cents a second.
  • PCMag Middle East ai: Google's Veo 2 Costs $1,800 Per Hour for AI-Generated Videos
  • THE DECODER: Google Deepmind sets pricing for Veo 2 AI video generation
  • Dataconomy: Google Veo 2 pricing: 50 cents per second of AI-generated video
  • TechCrunch: Reports Google’s new AI video model Veo 2 will cost 50 cents per second.