Aminu Abdullahi@eWEEK
//
At Google I/O 2025, Google unveiled significant advances in its AI-driven media generation capabilities, showcasing updates to Veo, Imagen, and Flow. The updates underscore Google's push to extend what AI can do in video and image creation and give creators new, more powerful tools. A key highlight is Veo 3, which Google positions as the first video generation model with integrated audio, addressing a long-standing gap in AI-generated media by producing synchronized audio alongside video.
Veo 3 lets users generate high-quality visuals with synchronized audio, including ambient sound, dialogue, and environmental noise. According to Google, the model excels at understanding complex prompts, bringing short stories to life in video form with realistic physics and accurate lip-syncing. Veo 3 is currently available to Ultra subscribers in the US through the Gemini app and the Flow platform, as well as to enterprise users via Vertex AI, reflecting Google's intent to bring AI-driven content creation to a range of user segments.

Alongside Veo 3, Google launched Imagen 4 and Flow, an AI filmmaking tool, and shipped major updates to Veo 2. Veo 2 gains filmmaker-focused features: images as references for character and scene consistency, precise camera controls, outpainting, and object manipulation tools. Flow integrates the Veo, Imagen, and Gemini models into a single platform where creators can manage story elements and build content from natural-language narratives, making it easier than ever to bring creative visions to life.
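For developers, Veo is surfaced through the Gemini API's long-running video generation operations. The sketch below uses the google-genai Python SDK to show the general shape of such a call; the model ID and prompt are assumptions for illustration, and access to Veo 3 through this interface depends on subscription tier and region.

```python
import time
from google import genai

client = genai.Client()  # expects an API key in the environment

# Start an asynchronous video generation job.
# NOTE: the model ID is a placeholder; check the current Gemini/Vertex AI
# model list for the Veo identifier your account can access.
operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",
    prompt="A tram crossing a rain-soaked street at dusk, with ambient city noise",
)

# Video generation is a long-running operation; poll until it finishes.
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

# Download and save the first generated clip.
video = operation.response.generated_videos[0]
client.files.download(file=video.video)
video.video.save("veo_output.mp4")
```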
@www.microsoft.com
//
Microsoft is pushing forward on multiple fronts to enhance its AI offerings, particularly within the Copilot ecosystem. Recent updates include the testing of two new voices, "Birch" and "Rain," alongside a sneak peek at a fourth avatar, "Ellie." These incremental voice and avatar additions are part of a broader strategy to give Copilot a clearer identity across Windows, web, and mobile without fundamentally altering its core language model with each update. Ellie is still under development: at present only the avatar's background loads and the animated figure itself is absent, which suggests the release window remains undefined.
Microsoft's Semantic Telemetry Project is yielding insights into how users engage with AI. The data shows a strong correlation between the complexity and professional nature of tasks undertaken with AI and the likelihood of continued and increased usage: people who apply AI to more technical, complex, and professional work are more inclined to keep using the tool and to interact with it more frequently. Novice users tend to start with simpler tasks, and the complexity of their engagement grows over time. Expert users, however, are satisfied with AI responses only when the AI's expertise on a topic is on par with their own, while novice users reported low satisfaction regardless of the AI's expertise.

Microsoft is also tackling model efficiency with BitNet b1.58 2B4T, a 1-bit-class large language model (LLM) with two billion parameters, designed to run efficiently on CPUs, including Apple's M2 chip. BitNet achieves this efficiency through 1.58-bit weights that take only three possible values (-1, 0, and +1); since three states carry log2(3) ≈ 1.58 bits of information per weight, this sharply reduces memory requirements and computational cost compared to full-precision models. BitNet's simplicity makes it less accurate than larger models, which it partly compensates for with a massive training dataset. The model is available on Hugging Face for anyone to experiment with.
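The core mechanism is ternary weight quantization. The NumPy sketch below illustrates the absmean scheme described for BitNet b1.58; the function name and the reconstruction step here are illustrative, not Microsoft's code. Each weight is divided by the tensor's mean absolute value, rounded, and clipped to {-1, 0, +1}, so a matrix multiply reduces to additions, subtractions, and skips, with a single floating-point scale applied at the end.

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize a weight tensor to {-1, 0, +1} using absmean scaling.

    Returns the ternary weights (int8) and the per-tensor scale
    needed to approximately reconstruct the original values.
    """
    scale = np.abs(w).mean() + eps              # gamma: mean absolute weight
    w_q = np.clip(np.round(w / scale), -1, 1)   # round, then clip to ternary
    return w_q.astype(np.int8), float(scale)

# Toy usage: a ternary matmul needs no weight multiplications -- each weight
# either adds, subtracts, or skips an activation, with one scale at the end.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
x = rng.standard_normal(4).astype(np.float32)

w_q, scale = absmean_ternary_quantize(w)
y_approx = scale * (w_q.astype(np.float32) @ x)  # dequantized matrix-vector product
print(w_q)        # entries in {-1, 0, 1}
print(y_approx)
```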