News from the AI & ML world

DeeperML - #efficiency

@www.searchenginejournal.com //
Google is aggressively expanding its artificial intelligence capabilities across its platforms, integrating the Gemini AI model into Search and Android XR smart glasses. The tech giant unveiled the rollout of "AI Mode" in Search to all U.S. users after initial testing in Labs. The move signifies a major shift in how people interact with the search engine, offering a conversational experience akin to consulting an expert.

Google is feeding its latest AI model, Gemini 2.5, into its search algorithms, enhancing features like "AI Overviews," which are now available in over 200 countries and 40 languages and reach 1.5 billion monthly users. In addition, Gemini 2.5 Pro introduces enhanced reasoning through Deep Think, delivering deeper, more thorough responses in AI Mode's Deep Search. Google is also testing new AI-powered features, including the ability to conduct searches through live video feeds with Search Live.

Google is also re-entering the smart glasses market with Android XR-powered spectacles featuring a hands-free camera and a voice-powered AI assistant. The underlying project, named Astra, allows users to talk back and forth with Search about what they see in real time through their cameras. These advancements aim to create more personalized and efficient user experiences, marking a new phase in the AI platform shift and solidifying AI's position in search.

Recommended read:
References:
  • Search Engine Journal: Google Expands AI Features in Search: What You Need to Know
  • WhatIs: Google expands Gemini model, Search as AI rivals encroach
  • www.theguardian.com: Google unveils ‘AI Mode’ in the next phase of its journey to change search

@Salesforce //
Agentic AI is rapidly transforming various sectors, from government operations to small businesses. Salesforce executives highlight the potential of agentic AI to assist overstretched government workers by automating routine tasks and improving efficiency. The focus is shifting from automating basic tasks to creating intelligent systems that adapt and learn, providing personalized and efficient support. This evolution promises to reshape how work is done, streamlining processes and enhancing productivity.

Companies are quickly adopting AI agents to enhance customer support and streamline operations, leading to a new competitive landscape. Microsoft has launched powerful AI agents designed to transform the workday and challenge Google’s workplace dominance. These agents, such as the 'Researcher' and 'Analyst' agents, are powered by OpenAI’s deep reasoning models and can handle complex tasks, such as research and data analysis, that previously required specialized human expertise. This increased productivity across sectors signifies a major shift in how businesses operate.

Dynamics 365 Customer Service now offers three AI service agents in public preview: Case Management, Customer Intent, and Customer Knowledge Management agents. These agents learn to address emerging issues, uncover new knowledge, and automate manual processes to boost business efficiency and reduce costs. The Case Management Agent automates key tasks throughout the lifecycle of a case, while the Customer Intent Agent uses generative AI to analyze past interactions and provide tailored solutions. This represents a significant step towards autonomous systems that improve customer experiences and reduce the burden on human agents.

Recommended read:
References:
  • Salesforce: Salesforce execs on agentic AI for government workers.
  • Salesforce: AI Agents: A New Competitive Edge for SMBs

@www.microsoft.com //
Microsoft is at the forefront of a workplace revolution, driven by the rapid advancement of AI. According to their 2025 Work Trend Index, AI agents are transforming how businesses operate, particularly in customer service and security operations. These agents, powered by sophisticated AI, are designed to augment human capabilities, enabling companies to scale rapidly, operate with agility, and generate value faster. The report highlights the emergence of "Frontier Firms," organizations built around on-demand AI and human-agent teams, where employees act as "agent bosses."

Microsoft envisions a future where every employee will have an AI assistant, and AI agents will join teams as "digital colleagues," taking on specific tasks. Eventually, humans will set directions for these agents, who will then execute business processes and workflows independently, with their human supervisors checking in as needed. This shift represents a move from simple coding assistance to AI agents capable of handling complex tasks, such as end-to-end logistics in a supply chain, while humans guide the system and manage relationships with suppliers. This transformation is expected to impact various knowledge work professions, including scientists, academics, and lawyers.

The company also introduced AI service agents for Dynamics 365 Customer Service and Contact Center. These agents are available in public preview and include Case Management, Customer Intent, and Customer Knowledge Management agents. These AI agents learn to address emerging issues, uncover new knowledge, and automate manual processes to boost business efficiency and reduce costs. The Case Management Agent simplifies case management, reduces handling time, and improves customer satisfaction, while the Customer Intent Agent uses generative AI to analyze past interactions and provide tailored solutions. Microsoft is also emphasizing the importance of securing, managing, and measuring agent workstreams with the Copilot Control System, ensuring that businesses can effectively mitigate risks and track the ROI of their AI agent deployments.


@arstechnica.com //
Microsoft researchers have achieved a significant breakthrough in AI efficiency with the development of a 1-bit large language model (LLM) called BitNet b1.58 2B4T. This model, boasting two billion parameters and trained on four trillion tokens, stands out due to its remarkably low memory footprint and energy consumption. Unlike traditional AI models that rely on 16- or 32-bit floating-point formats for storing numerical weights, BitNet utilizes only three distinct weight values: -1, 0, and +1. This "ternary" architecture dramatically reduces complexity, enabling the AI to run efficiently on a standard CPU, even an Apple M2 chip, according to TechCrunch.

The development of BitNet b1.58 2B4T represents a significant advancement in the field of AI, potentially paving the way for more accessible and sustainable AI applications. The model, available on Hugging Face, represents each weight with one of just three values, roughly 1.58 bits of information, hence the name. While this simplification can lead to a slight reduction in accuracy compared to larger, full-precision models, BitNet b1.58 2B4T compensates through its massive training dataset, equivalent to over 33 million books. The reduction in memory usage is substantial: the model requires only 400MB of non-embedded memory, significantly less than comparable models.
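Those figures hold up to back-of-the-envelope arithmetic: a ternary weight carries log2(3) ≈ 1.58 bits of information, so two billion weights need roughly 400MB, versus about 4GB at 16-bit precision. A quick sketch (approximate figures only, ignoring embeddings and packing overhead):

```python
import math

# Approximate weight storage for a 2B-parameter model at two precisions.
# A ternary weight {-1, 0, +1} carries log2(3) ~= 1.58 bits of information.
params = 2_000_000_000
fp16_mb = params * 16 / 8 / 1e6                # 16-bit floats: 2 bytes/weight
ternary_mb = params * math.log2(3) / 8 / 1e6   # ideal ternary packing

print(round(fp16_mb), round(ternary_mb))  # 4000 396
```

The ~396MB ideal figure lines up closely with the reported 400MB footprint.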

Comparisons against leading mainstream models like Meta's LLaMa 3.2 1B, Google's Gemma 3 1B, and Alibaba's Qwen 2.5 1.5B have shown that BitNet b1.58 2B4T performs competitively across various benchmarks, in some instances even outperforming them. However, the model must be run through Microsoft's custom bitnet.cpp inference framework to realize its efficiency gains, and it does not currently run on GPUs, which remains a notable limitation. Despite this, the creation of such a lightweight and efficient LLM marks a crucial step toward future AI that may not require supercomputers.
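For intuition, the BitNet papers describe an "absmean" quantization scheme: weights are scaled by their mean absolute value, rounded, and clipped into {-1, 0, +1}. The sketch below illustrates that idea in plain NumPy; it is a simplified illustration, not Microsoft's implementation:

```python
import numpy as np

def ternary_quantize(w: np.ndarray):
    """Quantize a weight tensor to {-1, 0, +1} via absmean scaling:
    divide by the mean absolute value, round, and clip."""
    scale = np.mean(np.abs(w)) + 1e-8  # per-tensor scale factor
    w_q = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return w_q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
w_q, scale = ternary_quantize(w)

# Every stored weight is now -1, 0, or +1.
print(sorted(np.unique(w_q).tolist()))
```

Because every quantized weight is -1, 0, or +1, a matrix multiply against the quantized weights reduces to additions, subtractions, and one final rescale, which is where the CPU-friendly efficiency comes from.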

Recommended read:
References:
  • arstechnica.com: Microsoft Researchers Create Super‑Efficient AI That Uses Up to 96% Less Energy
  • www.techrepublic.com: Microsoft Releases Largest 1-Bit LLM, Letting Powerful AI Run on Some Older Hardware
  • www.tomshardware.com: Microsoft researchers build 1-bit AI LLM with 2B parameters — model small enough to run on some CPUs

@www.analyticsvidhya.com //
OpenAI recently unveiled its groundbreaking o3 and o4-mini AI models, representing a significant leap in visual problem-solving and tool-using artificial intelligence. These models can manipulate and reason with images, integrating them directly into their problem-solving process. This unlocks a new class of problem-solving that blends visual and textual reasoning, allowing the AI to not just see an image, but to "think with it." The models can also autonomously utilize various tools within ChatGPT, such as web search, code execution, file analysis, and image generation, all within a single task flow.

OpenAI also introduced the GPT-4.1 series, aimed at improving coding capabilities, comprising GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. GPT-4.1 delivers stronger performance at lower prices, scoring 54.6% on SWE-bench Verified, a 21.4-percentage-point improvement over GPT-4o and a substantial gain in practical software engineering capability. Most notably, GPT-4.1 accepts up to one million tokens of input context, compared with GPT-4o's 128k tokens, making it suitable for processing large codebases and extensive documentation. GPT-4.1 mini and nano also offer performance boosts at reduced latency and cost.
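To put those context windows in perspective, a common rule of thumb is about four characters per token for English text and code (a rough approximation that varies by tokenizer and language):

```python
# Rough rule of thumb: ~4 characters per token (varies by tokenizer).
CHARS_PER_TOKEN = 4

def approx_chars(tokens: int) -> int:
    """Approximate text size, in characters, that fits in a context window."""
    return tokens * CHARS_PER_TOKEN

# GPT-4o's 128k-token window vs GPT-4.1's 1M-token window.
print(approx_chars(128_000), approx_chars(1_000_000))  # 512000 4000000
```

Under that assumption, a 1M-token window holds roughly 4 MB of raw source text, which is why whole-codebase prompts become feasible.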

The new models are available to ChatGPT Plus, Pro, and Team users, with Enterprise and education users gaining access soon. While reasoning alone isn't a silver bullet, it reliably improves model accuracy and problem-solving capabilities on challenging tasks. With Deep Research products and o3/o4-mini, AI-assisted search-based research is now effective.

Recommended read:
References:
  • bdtechtalks.com: What to know about o3 and o4-mini, OpenAI’s new reasoning models
  • TestingCatalog: OpenAI’s o3 and o4‑mini bring smarter tools and faster reasoning to ChatGPT
  • thezvi.wordpress.com: OpenAI has finally introduced us to the full o3 along with o4-mini. These models feel incredibly smart.
  • venturebeat.com: OpenAI launches groundbreaking o3 and o4-mini AI models that can manipulate and reason with images, representing a major advance in visual problem-solving and tool-using artificial intelligence.
  • www.techrepublic.com: OpenAI’s o3 and o4-mini models are available now to ChatGPT Plus, Pro, and Team users. Enterprise and education users will get access next week.
  • the-decoder.com: OpenAI's o3 achieves near-perfect performance on long context benchmark
  • the-decoder.com: Safety assessments show that OpenAI's o3 is probably the company's riskiest AI model to date
  • www.unite.ai: Inside OpenAI’s o3 and o4‑mini: Unlocking New Possibilities Through Multimodal Reasoning and Integrated Toolsets
  • thezvi.wordpress.com: Discusses the release of OpenAI's o3 and o4-mini reasoning models and their enhanced capabilities.
  • Simon Willison's Weblog: OpenAI o3 and o4-mini System Card
  • Interconnects: OpenAI's o3: Over-optimization is back and weirder than ever. Tools, true rewards, and a new direction for language models.
  • techstrong.ai: Nobody’s Perfect: OpenAI o3, o4 Reasoning Models Have Some Kinks
  • bsky.app: It's been a couple of years since GPT-4 powered Bing, but with the various Deep Research products and now o3/o4-mini I'm ready to say that AI assisted search-based research actually works now
  • www.analyticsvidhya.com: o3 vs o4-mini vs Gemini 2.5 pro: The Ultimate Reasoning Battle
  • pub.towardsai.net: TAI#149: OpenAI’s Agentic o3; New Open Weights Inference Optimized Models (DeepMind Gemma, Nvidia Nemotron-H) Also, Grok-3 Mini Shakes Up Cost Efficiency, Codex, Cohere Embed 4, PerceptionLM & more.
  • Last Week in AI: Last Week in AI #307 - GPT 4.1, o3, o4-mini, Gemini 2.5 Flash, Veo 2
  • composio.dev: OpenAI o3 vs. Gemini 2.5 Pro vs. o4-mini
  • Towards AI: Details about OpenAI's agentic o3 models

Eira May@Stack Overflow Blog //
AI agents are rapidly transforming business operations across various sectors, promising to automate tasks, enhance efficiency, and streamline workflows. Companies are integrating these intelligent systems to modernize customer experiences and unlock enterprise value. To fully leverage the potential of AI agents, businesses need to ensure they have real-time and seamless connections to company databases, internal communication tools, and documents. This integration is crucial for the agents to provide contextually aware and valuable assistance.

Saltbox Mgmt, a Salesforce consulting company, has successfully implemented Agentforce to modernize the buying experience, resulting in improved efficiency and enhanced personalization. Moreover, the integration of AI in real estate technology presents opportunities for strategic transformation, boosting efficiency, value, and decision-making capabilities. However, AI assistants are only as effective as the knowledge base they are connected to, highlighting the importance of comprehensive and up-to-date internal data.

Recommended read:
References:
  • bdtechtalks.com: Why corporate real estate should adopt AI to unlock enterprise value
  • Stack Overflow Blog: Are AI agents ready for the enterprise?
  • AI Accelerator Institute: AI assistants: Only as smart as your knowledge base
  • Salesforce: Saltbox Mgmt Modernizes the Buying Experience with Agentforce, Boosting Efficiency and Personalization
  • Bernard Marr: AI Agents Are Coming For Your Job Tasks—Here's How To Stay Ahead

Koray Kavukcuoglu@The Official Google Blog //
Google has unveiled Gemini 2.5 Pro, touted as its "most intelligent model to date," enhancing AI reasoning and workflow capabilities. The multimodal model, available to Gemini Advanced users and experimentally in Google's AI Studio, outperforms rival models from OpenAI, Anthropic, and DeepSeek on key benchmarks, particularly in coding, math, and science. Gemini 2.5 Pro boasts an impressive 1 million token context window, soon expanding to 2 million, enabling it to handle larger datasets and understand entire code repositories.

Gemini 2.5 Pro excels in advanced reasoning benchmark tests, achieving a state-of-the-art score on datasets designed to capture human knowledge and reasoning. Its enhanced coding performance allows for the creation of visually compelling web apps and agentic code applications, along with code transformation and editing. Google plans to release pricing for Gemini 2.5 models soon, marking a significant step in their goal of developing more capable and context-aware AI agents.


matthewthomas@Microsoft Industry Blogs //
Microsoft is emphasizing both AI security and advancements in quantum computing. The company is integrating AI features across its products and services, including Microsoft 365, while also highlighting the critical intersection of AI innovation and security. Microsoft will host Microsoft Secure on April 9th, an online event designed to help professionals discover AI innovations for the security lifecycle. Attendees can learn how to harden their defenses, secure AI investments, and discover AI-first tools and best practices.

Microsoft is also continuing its work in quantum computing, recently defending its topological qubit claims at the American Physical Society (APS) meeting. While Microsoft maintains confidence in its results, skepticism remains within the scientific community regarding the verification methods used, particularly the reliability of the topological gap protocol (TGP) in detecting Majorana quasiparticles. Chetan Nayak, a leading theoretical physicist at Microsoft, presented the company’s findings, acknowledging the skepticism but insisting that the team is confident.

Recommended read:
References:
  • Source: AI innovation requires AI security: Hear what’s new at Microsoft Secure
  • Microsoft defends topological qubit claims at APS Meeting Amid Skepticism

Tris Warkentin@The Official Google Blog //
Google AI has released Gemma 3, a new family of open-source AI models designed for efficient and on-device AI applications. Gemma 3 models are built with technology similar to Gemini 2.0, intended to run efficiently on a single GPU or TPU. The models are available in various sizes: 1B, 4B, 12B, and 27B parameters, with options for both pre-trained and instruction-tuned variants, allowing users to select the model that best fits their hardware and specific application needs.

Gemma 3 offers practical advantages in efficiency and portability. For example, the 27B version has demonstrated robust performance in evaluations while still being capable of running on a single GPU. The 4B, 12B, and 27B models can process both text and images and support more than 140 languages. The models have a context window of 128,000 tokens, making them well suited to tasks that require processing large amounts of information. Google has built safety protocols into Gemma 3, including an image safety checker called ShieldGemma 2.
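As a rough guide to the "runs on a single GPU" claim, bf16 weights take about two bytes per parameter (weights only; activations and the KV cache need additional headroom). The helper below is an illustrative sizing sketch under that assumption, not official deployment guidance:

```python
# Published Gemma 3 variants, in billions of parameters.
GEMMA3_SIZES_B = [1, 4, 12, 27]

def largest_fitting(vram_gb: float) -> int:
    """Largest Gemma 3 variant whose bf16 weights (~2 GB per billion
    parameters) fit in the given VRAM; returns 0 if none fit."""
    fitting = [b for b in GEMMA3_SIZES_B if b * 2 <= vram_gb]
    return max(fitting) if fitting else 0

print(largest_fitting(16), largest_fitting(80))  # 4 27
```

By this estimate, a 16 GB consumer card holds the 4B weights, while the 27B variant needs a data-center-class GPU, consistent with the single-accelerator positioning.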

Recommended read:
References:
  • MarkTechPost: Google AI Releases Gemma 3: Lightweight Multimodal Open Models for Efficient and On‑Device AI
  • The Official Google Blog: Introducing Gemma 3: The most capable model you can run on a single GPU or TPU
  • AI News | VentureBeat: Google unveils open source Gemma 3 model with 128k context window
  • AI News: Details on the launch of Gemma 3 open AI models by Google.
  • The Verge: Google calls Gemma 3 the most powerful AI model you can run on one GPU
  • Maginative: Google DeepMind’s Gemma 3 Brings Multimodal AI, 128K Context Window, and More
  • TestingCatalog: Gemma 3 sets new benchmarks for open compact models with top score on LMarena
  • AI & Machine Learning: Announcing Gemma 3 on Vertex AI
  • Analytics Vidhya: Gemma 3 vs DeepSeek-R1: Is Google’s New 27B Model a Tough Competition to the 671B Giant?
  • AI & Machine Learning: How to deploy serverless AI with Gemma 3 on Cloud Run
  • The Tech Portal: Google rolls outs Gemma 3, its latest collection of lightweight AI models
  • eWEEK: Google’s Gemma 3: Does the ‘World’s Best Single-Accelerator Model’ Outperform DeepSeek-V3?
  • The Tech Basic: Gemma 3 by Google: Multilingual AI with Image and Video Analysis
  • Analytics Vidhya: Google’s Gemma 3: Features, Benchmarks, Performance and Implementation
  • www.infoworld.com: Google unveils Gemma 3 multi-modal AI models
  • www.zdnet.com: Google claims Gemma 3 reaches 98% of DeepSeek's accuracy - using only one GPU
  • AIwire: Google unveiled open source Gemma 3, a multimodal model that comes in four sizes and can handle more information and instructions thanks to a larger context window.
  • Ars OpenForum: Google’s new Gemma 3 AI model is optimized to run on a single GPU
  • THE DECODER: Google DeepMind has unveiled Gemma 3, a new generation of open AI models designed to deliver high performance with a relatively small footprint, making them suitable for running on individual GPUs or TPUs.
  • Gradient Flow: Gemma 3: What You Need To Know
  • Interconnects: Gemma 3, OLMo 2 32B, and the growing potential of open-source AI
  • OODAloop: Gemma 3, Google's newest lightweight, open-source AI model, is designed for multimodal tasks and efficient deployment on various devices.
  • NVIDIA Technical Blog: Google has released lightweight, multimodal, multilingual models called Gemma 3. The models are designed to run efficiently on phones and laptops.
  • LessWrong: Google DeepMind has unveiled Gemma 3, a new generation of open AI models designed to deliver high performance with a relatively small footprint, making them suitable for running on individual GPUs or TPUs.

Matthew S.@IEEE Spectrum //
Recent research indicates that AI models, particularly large language models (LLMs), can struggle with overthinking and analysis paralysis, hurting their efficiency and success rates. One study found that reasoning LLMs sometimes overthink problems, driving up computational costs while reducing overall performance. Researchers are addressing the issue through optimization techniques, including scaling inference-time compute, reinforcement learning, and supervised fine-tuning, so that models use only as much reasoning as a task requires.

The size and training methods of these models play a crucial role in their reasoning abilities. For instance, Alibaba's Qwen team introduced QwQ-32B, a 32-billion-parameter model that outperforms much larger rivals in key problem-solving tasks. QwQ-32B achieves superior performance in math, coding, and scientific reasoning using multi-stage reinforcement learning, despite being significantly smaller than DeepSeek-R1. This advancement highlights the potential of reinforcement learning to unlock reasoning capabilities in smaller models, rivaling the performance of giant models while requiring less computational power.

Recommended read:
References:
  • IEEE Spectrum: It’s Not Just Us: AI Models Struggle With Overthinking
  • Sebastian Raschka, PhD: This article explores recent research advancements in reasoning-optimized LLMs, with a particular focus on inference-time compute scaling that have emerged since the release of DeepSeek R1.