News from the AI & ML world

DeeperML - #dolphingemma

@Google DeepMind Blog //
Google is integrating its Veo 2 video-generating AI model into Gemini Advanced, allowing subscribers to create short, cinematic videos from text prompts. The new feature, launched on April 15, 2025, enables Gemini Advanced users to generate 8-second, 720p videos in a 16:9 aspect ratio, suitable for sharing on platforms like TikTok and YouTube. These videos can be downloaded as MP4 files and include Google's SynthID watermark, ensuring transparency regarding AI-generated content. Currently, this offering is exclusively for Google One AI Premium subscribers and does not extend to Google Workspace business and educational plans.

Veo 2 is also being integrated into Whisk, an experimental tool within Google Labs. This integration includes a new feature called "Whisk Animate" that transforms uploaded images into animated video clips, also utilizing the Veo 2 model. Similar to Gemini, the video output in Whisk is limited to eight seconds and is accessible only to Premium subscribers. The integration of Veo 2 into Gemini Advanced and Whisk represents Google's efforts to compete with other AI video generation platforms.

Google's Veo 2 is designed to turn detailed text prompts into cinematic-quality videos with lifelike motion, natural physics, and visually rich scenes. To ensure responsible use and transparency, Google employs its proprietary SynthID technology, which embeds an invisible watermark into each video frame. The company also applies red-teaming and additional review processes to prevent the creation of content that violates its content policies. The new video generation features are rolling out globally and support all languages currently available in Gemini.
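SynthID's actual embedding technique is a proprietary, learned method that Google has not published; purely as a hypothetical illustration of the general idea of an invisible per-frame watermark, the toy sketch below hides a bit string in the least significant bits of pixel values, changing each marked pixel by at most 1:

```python
# Toy invisible watermark (NOT SynthID's real method, which is a
# proprietary learned technique): hide a bit string in the least
# significant bits (LSBs) of a frame's pixel values.

def embed_watermark(frame, bits):
    """Return a copy of `frame` (a flat list of 0-255 pixel values)
    with `bits` written into the LSBs of the first len(bits) pixels."""
    out = list(frame)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | b  # clear the LSB, then set it to b
    return out

def extract_watermark(frame, n_bits):
    """Read back the first n_bits LSBs."""
    return [p & 1 for p in frame[:n_bits]]

frame = [200, 17, 64, 128, 255, 3, 90, 41]
mark = [1, 0, 1, 1]
stamped = embed_watermark(frame, mark)
assert extract_watermark(stamped, 4) == mark
# Each pixel changes by at most 1, so the mark is imperceptible:
assert all(abs(a - b) <= 1 for a, b in zip(frame, stamped))
```

A production watermark like SynthID must additionally survive compression, cropping, and re-encoding, which simple LSB schemes do not; this sketch only shows why an embedded mark can be machine-readable yet invisible to viewers.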



References:
  • Google DeepMind Blog: Generate videos in Gemini and Whisk with Veo 2
  • PCMag Middle East ai: Veo 2 videos are now free to produce for those on Advanced plans; the Whisk Animate tool also turns images into 8-second videos using the same technology
  • TestingCatalog: Gemini Advanced subscribers can now generate videos with Veo 2
  • THE DECODER: Google adds AI video generation to Gemini app and Whisk experiment
  • Analytics Vidhya: 3 Ways to Access Google Veo 2
  • www.tomsguide.com: I just tried Google's newest AI video generation features — and I'm blown away
  • LearnAI: Starting today, Gemini Advanced users can generate and share videos using our state-of-the-art video model, Veo 2. In Gemini, you can now translate text-based prompts into dynamic videos. Google Labs is also making Veo 2 available through Whisk, a generative AI experiment that allows you to create new images using both text and image prompts,...
  • www.tomsguide.com: Google rolls out Google Photos extension for Gemini — here’s what it can do
  • eWEEK: Gemini Advanced users can now create and share high-resolution videos with the newly released Veo 2
  • Data Phoenix: Google introduces Veo 2 for video generation in Gemini and Whisk
Dr. Thad @The Official Google Blog //
Google has introduced DolphinGemma, a new AI model designed to decipher dolphin communication. Developed in collaboration with the Wild Dolphin Project (WDP) and researchers at Georgia Tech, DolphinGemma aims to analyze and generate dolphin vocalizations, potentially paving the way for interspecies communication. For decades, scientists have attempted to understand the complex whistles and clicks dolphins use. With DolphinGemma, researchers hope to decode these sounds and gain insights into the structure and patterns of dolphin communication. The ultimate goal is to determine whether dolphins possess language and, eventually, to communicate with them.

The foundation for DolphinGemma's development is the WDP's extensive collection of dolphin recordings. The WDP has studied a specific community of Atlantic spotted dolphins since 1985 using a non-invasive approach, building up decades of video and audio recordings correlated with notes on the animals' behavior. DolphinGemma uses Google's SoundStream tokenizer to represent dolphin sounds as sequences of discrete tokens. By analyzing this massive dataset, the model can identify patterns and sequences and uncover potential meanings within the dolphins' natural communication, work that previously required immense human effort.
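SoundStream is a learned neural audio codec, so its actual tokens come from a trained quantizer. As a conceptual stand-in only, the toy tokenizer below shows the basic idea of turning a continuous waveform into a sequence of discrete tokens, here via simple uniform amplitude quantization:

```python
# Toy audio tokenizer (a conceptual stand-in for SoundStream, which is
# a learned neural codec): map waveform samples in [-1.0, 1.0] to
# integer tokens in [0, n_levels - 1] by uniform quantization.

def tokenize(waveform, n_levels=8):
    tokens = []
    for s in waveform:
        s = max(-1.0, min(1.0, s))  # clamp to the valid amplitude range
        # Shift [-1, 1] to [0, 2], scale to [0, n_levels], cap at max token.
        tokens.append(min(int((s + 1.0) / 2.0 * n_levels), n_levels - 1))
    return tokens

wave = [0.0, 0.5, -0.5, 1.0, -1.0]
print(tokenize(wave))  # → [4, 6, 2, 7, 0]
```

Once audio is discrete tokens, the same sequence-modeling machinery used for text (pattern mining, next-token prediction) can be applied to vocalizations.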

Field testing of DolphinGemma is scheduled to begin this summer. During field research, sounds can be recorded on Pixel phones and analyzed with DolphinGemma. The model can also predict the subsequent sounds a dolphin may make, much like how large language models for human language predict the next word or token in a sentence. While understanding dolphin communication is the initial focus, the long-term vision includes establishing a shared vocabulary for interactive communication, utilizing synthetic sounds that dolphins could learn, akin to teaching them a new language. The WDP is also working with the Georgia Institute of Technology to teach dolphins a simple, shared vocabulary, using an underwater computer system called CHAT.
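The "predict the next sound" idea parallels next-token prediction in language models. DolphinGemma itself is a learned model, but the mechanism can be sketched with a minimal bigram counter that records which token tends to follow which, then predicts the most frequent successor (the vocalization token names here are invented for illustration):

```python
# Minimal next-token prediction over a token sequence: count bigram
# frequencies, then predict the most common successor of a given token.
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each token, how often each other token follows it."""
    follows = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        follows[cur][nxt] += 1
    return follows

def predict_next(follows, token):
    """Return the most frequent successor of `token`, or None if unseen."""
    if token not in follows:
        return None
    return follows[token].most_common(1)[0][0]

# A made-up sequence of vocalization tokens:
seq = ["whistle", "click", "click", "whistle", "click", "buzz"]
model = train_bigram(seq)
print(predict_next(model, "whistle"))  # → "click" (follows "whistle" twice)
```

A large language model replaces these raw counts with a neural network conditioned on long contexts, but the prediction objective is the same.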


