News from the AI & ML world

DeeperML - #aiupdates

Aminu Abdullahi@eWEEK //
Google has unveiled significant advancements in its AI-driven media generation capabilities at Google I/O 2025, showcasing updates to Veo, Imagen, and Flow. The updates highlight Google's commitment to pushing the boundaries of AI in video and image creation, providing creators with new and powerful tools. A key highlight is the introduction of Veo 3, the first video generation model with integrated audio capabilities, addressing a significant challenge in AI-generated media by enabling synchronized audio creation for videos.

Veo 3 allows users to generate high-quality visuals with synchronized audio, including ambient sounds, dialogue, and environmental noise. According to Google, the model excels at understanding complex prompts, bringing short stories to life in video format with realistic physics and accurate lip-syncing. Veo 3 is currently available to Ultra subscribers in the US through the Gemini app and Flow platform, as well as to enterprise users via Vertex AI, demonstrating Google’s intent to democratize AI-driven content creation across different user segments.

In addition to Veo 3, Google has launched Imagen 4 and Flow, an AI filmmaking tool, alongside major updates to Veo 2. Veo 2 gains filmmaker-focused features, including reference images for character and scene consistency, precise camera controls, outpainting, and object manipulation tools. Flow integrates the Veo, Imagen, and Gemini models into a single platform where creators can manage story elements and produce content from natural-language descriptions.

References:
  • Data Phoenix: Google updated its model lineup and introduced a 'Deep Think' reasoning mode for Gemini 2.5 Pro
  • Maginative: Google’s revamped Canvas, powered by the Gemini 2.5 Pro model, lets you turn ideas into apps, quizzes, podcasts, and visuals in seconds—no code required.
  • Replicate's blog: Generate incredible images with Google's Imagen-4
  • AI News | VentureBeat: At Google I/O, Sergey Brin makes surprise appearance — and declares Google will build the first AGI
  • www.tomsguide.com: I just tried Google’s smart glasses built on Android XR — and Gemini is the killer feature
  • Data Phoenix: Google has launched major Gemini updates, including free visual assistance via Gemini Live, new subscription tiers starting at $19.99/month, advanced creative tools like Veo 3 for video generation with native audio, and an upcoming autonomous Agent Mode for complex task management.
  • sites.libsyn.com: Google's VEO 3 Is Next Gen AI Video, Gemini Crushes at Google I/O & OpenAI's Big Bet on Jony Ive
  • eWEEK: Google’s Co-Founder in Office ‘Pretty Much Every Day’ to Work on AI
  • learn.aisingapore.org: Advancing Gemini’s security safeguards – Google DeepMind
  • Google DeepMind Blog: Gemini 2.5: Our most intelligent models are getting even better
  • TestingCatalog: Opus 4 outperforms GPT-4.1 and Gemini 2.5 Pro in coding benchmarks
  • LearnAI: Updates to Gemini 2.5 from Google DeepMind
  • pub.towardsai.net: This week, Google’s flagship I/O 2025 conference and Anthropic’s Claude 4 release delivered further advancements in AI reasoning, multimodal and coding capabilities, and somewhat alarming safety testing results.
  • learn.aisingapore.org: Updates to Gemini 2.5 from Google DeepMind
  • Data Phoenix: Google announced several updates across its media generation models
  • thezvi.wordpress.com: Fun With Veo 3 and Media Generation
  • Maginative: Google Gemini Can Now Watch Your Videos on Google Drive
  • www.marktechpost.com: A Coding Guide for Building a Self-Improving AI Agent Using Google’s Gemini API with Intelligent Adaptation Features

@www.microsoft.com //
Microsoft is pushing forward on multiple fronts to enhance its AI offerings, particularly within the Copilot ecosystem. Recent updates include the testing of two new voices, "Birch" and "Rain," alongside a sneak peek at a fourth avatar, "Ellie," for the assistant. These incremental additions are part of a broader strategy to give Copilot a clearer personality across Windows, web, and mobile platforms without altering its underlying language model with each update. Ellie is still under development: only the avatar's background currently loads and the animated figure itself is absent, so a release window remains undefined.

Microsoft's Semantic Telemetry Project is revealing insights into user engagement with AI. The data shows a strong correlation between the complexity and professional nature of tasks undertaken with AI and the likelihood of continued and increased usage. Individuals employing AI for more technical, complex, and professional tasks are more inclined to continue using the tool and to interact with it more frequently. Novice AI users tend to start with simpler tasks, but the complexity of their engagement increases over time. Expert users, however, are satisfied with AI responses only when the AI's expertise is on par with their own on the topic, while novice users report low satisfaction regardless of the AI's expertise.

Furthermore, Microsoft is tackling AI model efficiency with the development of BitNet b1.58 2B4T, a so-called 1-bit large language model (LLM) with two billion parameters. The model is designed to run efficiently on CPUs, including Apple's M2 chip. Each weight takes one of only three possible values (-1, 0, and +1), which corresponds to about 1.58 bits of information per weight (log2 3 ≈ 1.58) and significantly reduces memory and compute requirements compared with traditional models. While this simplicity makes BitNet less accurate than larger AI models, it compensates with a massive training dataset. The model is available on Hugging Face for anyone who wants to experiment with it.
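
As a rough illustration of the ternary-weight idea described above, the following Python sketch maps floating-point weights to -1, 0, or +1 and estimates the memory needed to store two billion weights. It is a minimal sketch under stated assumptions, not Microsoft's implementation: the mean-absolute-value scaling rule, the function names `ternary_quantize` and `weight_memory_gb`, and the sample weights are all chosen for illustration, and the memory figure counts idealized weight storage only.

```python
import math

# Information content of one ternary weight: log2(3), about 1.58 bits.
BITS_PER_TERNARY_WEIGHT = math.log2(3)


def ternary_quantize(weights, eps=1e-8):
    """Map each float weight to -1, 0, or +1.

    Assumption for this sketch: weights are scaled by their mean
    absolute value, rounded, then clipped to the range [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def weight_memory_gb(num_params, bits_per_weight):
    """Idealized storage for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical sample weights, used only to show the mapping.
weights = [0.9, -0.05, -1.4, 0.3, 0.02, -0.7]
quantized, scale = ternary_quantize(weights)
print(quantized)  # [1, 0, -1, 1, 0, -1]

# Two billion parameters: 16-bit floats versus ternary weights.
print(weight_memory_gb(2e9, 16))                                  # 4.0
print(round(weight_memory_gb(2e9, BITS_PER_TERNARY_WEIGHT), 2))   # 0.4
```

The comparison ignores activations, embeddings, and packing overhead, so real-world memory use will differ; it is meant only to show why restricting weights to three values shrinks the model by roughly an order of magnitude relative to 16-bit storage.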

References:
  • Microsoft Copilot Blog: Release Notes: April 16, 2025
  • THE DECODER: BitNet: Microsoft shows how to put AI models on a diet
  • www.microsoft.com: Semantic Telemetry Project data show that people who use AI for more professional and complex tasks are more likely to keep using the tool and to use it more often. Novice AI users engage in simpler tasks, but their usage is becoming more complex.
  • www.artificiallawyer.com: Judicial office holders in the UK are being encouraged to make use of Microsoft’s ‘Copilot Chat’ genAI capability via their inhouse eJudiciary platform.

Matthias Bastian@THE DECODER //
Google has announced significant upgrades to its Gemini app, focusing on enhanced functionality, personalization, and accessibility. A key update is the rollout of the upgraded 2.0 Flash Thinking Experimental model, which now supports file uploads and offers a 1 million token context window for processing large-scale information. This model aims to improve reasoning and response efficiency by breaking down prompts into actionable steps. The Deep Research feature, powered by Flash Thinking, lets users create detailed multi-page reports with real-time insight into its reasoning process. It is now available globally in over 45 languages, free to try, with expanded access for Gemini Advanced users.

Another major addition is the experimental "Personalization" feature, which integrates Gemini with Google apps like Search to deliver tailored responses based on user activity. Gemini is also strengthening its integration with Google apps such as Calendar, Notes, Tasks, and Photos, enabling users to handle complex multi-app requests in a single prompt. Google is also bringing Gemini 2.0 to robots: its DeepMind team has developed two new Gemini models designed specifically for robotics. The first, Gemini Robotics, is an advanced vision-language-action (VLA) model that responds to prompts with physical motion. The second, Gemini Robotics-ER, is a vision-language model (VLM) with advanced spatial understanding that enables robots to navigate changing environments. Google is partnering with robotics companies to further develop humanoid robots.

Google will replace its long-standing Google Assistant with Gemini on mobile devices later this year. The classic Google Assistant will no longer be accessible on most mobile devices, marking the end of an era. The shift reflects Google's pivot toward generative AI and its belief that Gemini's advanced capabilities will deliver a more powerful and versatile experience. Gemini will also come to tablets, cars, and connected devices like headphones and watches. The company also introduced Gemini Embedding, a novel embedding model initialized from the Gemini large language model, which aims to improve embedding quality across diverse tasks.

References:
  • The Official Google Blog: Over the coming months, we’ll be upgrading users on mobile devices from Google Assistant to Gemini.
  • Android Faithful: Google's AI tool Gemini gets a boost by working with deeper insight about you through personalization and app connections.
  • Search Engine Journal: Google Gemini's integration of Search history blurs the line between traditional Search and AI assistants
  • Maginative: Google will replace its long-standing Google Assistant with Gemini on mobile devices later this year, marking the end of an era for the company's original voice assistant.
  • MarkTechPost: Google AI Introduces Gemini Embedding: A Novel Embedding Model Initialized from the Powerful Gemini Large Language Model
  • www.tomsguide.com: Google is taking Gemini to the next level and giving users more with major upgrades aimed to make Gemini even more personal, plus many of the upgrades are free.
  • PCMag Middle East ai: RIP Google Assistant? Gemini AI Poised to Replace It This Year
  • The Tech Basic: Android’s New AI Era: Gemini Replaces Google Assistant This Year
  • Search Engine Land: Google to replace Google Assistant with Gemini
  • www.tomsguide.com: Google Assistant is losing features to make way for Gemini — here's what's just been axed
  • The Official Google Blog: Gemini gets personal, with tailored help from your Google apps
  • Analytics Vidhya: Google's Gemini models are undergoing significant updates, now featuring faster models, longer context lengths, and integrated AI agents.
  • Google DeepMind Blog: Gemini breaks new ground: a faster model, longer context and AI agents

Matthias Bastian@THE DECODER //
Google is enhancing its Gemini AI assistant with the ability to access users' Google Search history to deliver more personalized and relevant responses. This opt-in feature allows Gemini to analyze a user's search patterns and incorporate that information into its responses. The update is powered by the experimental Gemini 2.0 Flash Thinking model, which the company launched in late 2024.

This new capability, known as personalization, requires explicit user permission. Google is emphasizing transparency by allowing users to turn the feature on or off at any time, and Gemini will clearly indicate which data sources inform its personalized answers. To test the new feature, Google suggests users ask about vacation spots, YouTube content ideas, or potential new hobbies. The system then draws on each user's search history to make tailored suggestions.

References:
  • Android Faithful: Google's AI tool Gemini gets a boost by working with deeper insight about you through personalization and app connections.
  • Google DeepMind Blog: Experiment with Gemini 2.0 Flash native image generation
  • THE DECODER: Google adds native image generation to Gemini language models
  • THE DECODER: Google's Gemini AI assistant can now tap into users' search histories to provide more personalized responses, marking a significant expansion of the chatbot's capabilities.
  • TestingCatalog: Discover the latest updates to Google's Gemini app, featuring the new 2.0 Flash Thinking model, enhanced personalization, and deeper integration with Google apps.
  • The Official Google Blog: Gemini gets personal, with tailored help from your Google apps
  • Search Engine Journal: Google Search History Can Now Power Gemini AI Answers
  • www.zdnet.com: Gemini might soon have access to your Google Search history - if you let it
  • The Official Google Blog: The Assistant experience on mobile is upgrading to Gemini
  • www.zdnet.com: Google launches Gemini with Personalization, beating Apple to personal AI
  • Maginative: Google to Replace Google Assistant with Gemini on Android Phones
  • www.tomsguide.com: Google is giving away Gemini's best paid features for free — here's the tools you can try now
  • MacSparky: This article reports on Google's integration of Gemini AI into its search engine and discusses the implications for users and creators.
  • Search Engine Land: This change will roll out to most devices except Android 9 or earlier (and some other devices).
  • www.zdnet.com: Gemini's new features are now available for free, extending beyond its previous paid subscriber model.
  • www.techradar.com: Discusses how Google is giving Gemini a superpower by allowing it to access your Search history, raising excitement and concerns.
  • PCMag Middle East ai: This article discusses Google's plan to replace Google Assistant with Gemini AI, highlighting the timeline for the transition and requirements for the devices.
  • The Tech Basic: This article announces Google’s plan to replace Google Assistant with Gemini, focusing on the company’s focus on advancing AI and integrating Gemini into its mobile product ecosystem.
  • Verdaily: Google Announces New Update for its AI Wizard, Gemini: Improves User Experience
  • Windows Copilot News: Google is prepping Gemini to take action inside of apps
  • www.techradar.com: Worried about DeepSeek? Well, Google Gemini collects even more of your personal data
  • Maginative: Gemini App Gets a Major Upgrade: Canvas Mode, Audio Overviews, and More
  • TestingCatalog: Google launches Canvas and Audio Overview for all Gemini users
  • Android Faithful: Google Gemini Gets A Powerful Collaborative Upgrade: Canvas and Audio Overviews Now Available