News from the AI & ML world

DeeperML - #robotics

Sam Khosroshahi@lambdalabs.com //
References: Fello AI, lambdalabs.com
NVIDIA is pushing the boundaries of artificial intelligence in healthcare and robotics, introducing several groundbreaking advancements. One notable innovation is the DNA LLM, designed to decode the complex genetic information found in DNA, RNA, and proteins. This tool aims to transform genomic research, potentially leading to new understandings and treatments for various diseases.

The company's commitment to AI extends to robotics with the release of Isaac GR00T N1, an open-source platform for humanoid robots. This initiative is expected to accelerate innovation in the field, providing developers with the resources needed to create more advanced and capable robots. Additionally, an NVIDIA research team has developed Hymba, a family of small language models that combine transformer attention with state space models, surpassing the Llama-3.2-3B model in performance while significantly reducing cache size and increasing throughput.
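The attention-plus-state-space combination described for Hymba can be illustrated with a toy sketch: run an attention path and a recurrent state-space path over the same sequence, then merge their outputs. This scalar version is purely illustrative of the idea, not Hymba's actual layer design.

```python
import math

def attention_path(xs):
    # Softmax self-attention over scalar inputs: each position attends
    # to every input and returns a weighted average.
    out = []
    for q in xs:
        scores = [q * k for k in xs]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        out.append(sum(e / z * v for e, v in zip(exps, xs)))
    return out

def ssm_path(xs, decay=0.9):
    # Linear state-space recurrence: h_t = decay * h_{t-1} + x_t.
    h, out = 0.0, []
    for x in xs:
        h = decay * h + x
        out.append(h)
    return out

def hybrid_block(xs):
    # Merge both paths; real hybrid models combine them inside one layer.
    a, s = attention_path(xs), ssm_path(xs)
    return [(ai + si) / 2 for ai, si in zip(a, s)]

print(len(hybrid_block([0.1, -0.2, 0.3])))  # one output per position
```

The state-space path is what lets such models shrink the cache: it carries a fixed-size state instead of attending over the full history.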

Recommended read:
References :
  • Fello AI: NVIDIA DNA LLM: The Power To Curing All Diseases?
  • lambdalabs.com: Lambda Honored to Accelerate AI Innovation in Healthcare with NVIDIA
  • Synced: NVIDIA’s Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small Language Models

Dean Takahashi@AI News | VentureBeat //
NVIDIA, Google DeepMind, and Disney Research are collaborating on Newton, an open-source physics engine designed to advance robot learning, enhance simulation accuracy, and facilitate the development of next-generation robotic characters. Newton is built on NVIDIA’s Warp framework and aims to provide a scalable, high-performance simulation environment optimized for AI-driven humanoid robots. MuJoCo-Warp, a collaboration with Google DeepMind, accelerates robotics workloads by over 70x, while Disney plans to integrate Newton into its robotic character platform for expressive, interactive robots.

The engine's creation is intended to bridge the gap between simulation and real-world robotics. NVIDIA will also supercharge humanoid robot development with the Isaac GR00T N1 foundation model for human-like reasoning. Newton is built on NVIDIA Warp, a CUDA-based acceleration library that enables GPU-powered physics simulations. Newton is also optimized for robot learning frameworks, including MuJoCo Playground and NVIDIA Isaac Lab, making it an essential tool for developers working on generalist humanoid robots. This initiative is part of NVIDIA's broader effort to accelerate physical AI progress.

Recommended read:
References :
  • AI News | VentureBeat: Nvidia will supercharge humanoid robot development with Isaac GR00T N1 foundation model for human-like reasoning
  • Maginative: NVIDIA, Google DeepMind, and Disney Research Team Up for Open-Source Physics Engine
  • BigDATAwire: The Rise of Intelligent Machines: Nvidia Accelerates Physical AI Progress
  • LearnAI: From innovation to impact: How AWS and NVIDIA enable real-world generative AI success

Michael Nuñez@venturebeat.com //
References: venturebeat.com, AIwire
Nvidia has made significant strides in enhancing robot training and AI capabilities, unveiling innovative solutions at its GTC conference. A key announcement was Cosmos-Transfer1, a groundbreaking AI model designed to generate photorealistic simulations for training robots and autonomous vehicles. This model bridges the gap between virtual training environments and real-world applications by using multimodal inputs to create highly realistic simulations. This adaptive multimodal control system allows developers to weight different visual inputs, such as depth or object boundaries, to improve the realism and utility of the generated environments.
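The idea of weighting different visual inputs can be illustrated as a normalized blend of conditioning maps, where the developer dials up whichever modality should dominate. This is a toy sketch of the concept only, not Cosmos-Transfer1's actual mechanism.

```python
# Toy weighted blend of conditioning inputs (e.g. depth vs. object
# boundaries), in the spirit of the adaptive multimodal control
# described above. All names and values here are illustrative.
def blend_conditions(conditions, weights):
    total = sum(weights.values())
    norm = {k: w / total for k, w in weights.items()}  # weights sum to 1
    n = len(next(iter(conditions.values())))
    # Per-element weighted sum over the conditioning maps.
    return [sum(norm[k] * conditions[k][i] for k in conditions) for i in range(n)]

conds = {"depth": [1.0, 0.5], "edges": [0.0, 1.0]}
out = blend_conditions(conds, {"depth": 3.0, "edges": 1.0})
print(out)  # depth dominates: [0.75, 0.625]
```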

Nvidia also introduced its next-generation GPU superchips, including the second generation of the Grace Blackwell chip and the Vera Rubin, expected in the second half of 2026. The Vera Rubin will feature 288GB of HBM4 (fourth-generation high-bandwidth memory) and will be paired with CPUs boasting 88 custom Arm cores. These new chips promise substantial increases in compute capacity, with Rubin delivering a 900x speedup over the previous-generation Hopper chips. This positions Nvidia to tackle the increasing demands of generative AI workloads, including training massive models and running inference.

Recommended read:
References :
  • venturebeat.com: Nvidia’s Cosmos-Transfer1 makes robot training freakishly realistic—and that changes everything
  • AIwire: Nvidia Touts Next Generation GPU Superchip and New Photonic Switches
  • www.laptopmag.com: Blackwell Ultra and Rubin Ultra are Nvidia's newest additions to the growing list of AI superchips

Shelly Palmer@Shelly Palmer //
AI agents are rapidly evolving, becoming a focal point in the tech industry. OpenAI has released an API designed to make the deployment of AI agents significantly easier, opening up new possibilities for developers. These agents are capable of tasks like searching the web in real-time and analyzing extensive datasets, heralding a new era of automation and productivity. Experts predict substantial growth in the AI agent market, forecasting an increase from $5 billion in 2024 to over $47 billion by 2030.
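The growth forecast quoted above implies a steep compound annual growth rate, which a quick back-of-envelope check makes concrete:

```python
# Implied compound annual growth rate (CAGR) for the forecast above:
# $5B in 2024 growing to $47B by 2030, i.e. over 6 years.
start, end, years = 5.0, 47.0, 2030 - 2024
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # roughly 45% growth per year
```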

AI agents are being designed as smart software programs that can understand and act independently to achieve assigned goals. Unlike chatbots or rigid scripts, AI agents break down complex tasks into manageable steps, adapt their strategies as needed, and learn from their experiences. Core elements include tasks, models (like GPT or Claude), memory for storing context, and external tools such as APIs and web access. These components work together iteratively, planning, acting, and adjusting as necessary without constant human intervention.
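The plan-act-observe loop described above can be sketched in a few lines of Python. Everything here (`plan_step`, `call_tool`, the stopping rule) is an illustrative stand-in, not any vendor's actual agent API.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Minimal sketch of the agent loop described above: plan the next
    step, act through a tool, store the observation in memory, repeat
    until the goal is met. All names here are illustrative."""
    goal: str
    memory: list = field(default_factory=list)

    def plan_step(self):
        # Stand-in for a model call (e.g. GPT or Claude) that decides
        # the next action from the goal plus accumulated context.
        done = len(self.memory) >= 3  # toy stopping condition
        return ("finish", None) if done else ("search", f"query {len(self.memory)}")

    def call_tool(self, action, arg):
        # Stand-in for an external tool such as a web-search API.
        return f"result of {action}({arg})"

    def run(self, max_steps=10):
        for _ in range(max_steps):
            action, arg = self.plan_step()
            if action == "finish":
                break
            observation = self.call_tool(action, arg)
            self.memory.append(observation)  # context carried forward
        return self.memory

agent = Agent(goal="summarize the latest robotics news")
print(len(agent.run()))  # toy run stops after 3 stored observations
```

The key property the paragraph describes is the iteration: each observation lands in memory and shapes the next planning step, so no human intervenes between steps.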

Recommended read:
References :
  • Shelly Palmer: AI Agents Are Coming—and OpenAI Just Made Them Easier to Deploy
  • Upward Dynamism: AI Agents 101 – The Next Big Thing in AI You Shouldn’t Ignore

@Google DeepMind Blog //
Google is pushing the boundaries of AI and robotics with its Gemini AI models. Gemini Robotics, an advanced vision-language-action model, now enables robots to perform physical tasks with improved generalization, adaptability, and dexterity. This model interprets and acts on text, voice, and image data, showcasing Google's advancements in integrating AI for practical applications. Furthermore, the development of Gemini Robotics-ER, which incorporates embodied reasoning capabilities, signifies another step toward smarter, more adaptable robots.

Google's approach to robotics emphasizes safety, employing both physical and semantic safety systems. The company is also inviting filmmakers and creators to experiment with its Veo model to inform its design and development. Veo builds on years of generative video model work, including Generative Query Network (GQN), DVD-GAN, Imagen-Video, Phenaki, WALT, VideoPoet, and Lumiere, combining architecture, scaling laws, and other novel techniques to improve quality and output resolution.

Recommended read:
References :
  • Google DeepMind Blog: Gemini Robotics brings AI into the physical world
  • Maginative: Google DeepMind Unveils Gemini Robotics Models to Bridge AI and Physical World
  • IEEE Spectrum: With Gemini Robotics, Google Aims for Smarter Robots
  • The Official Google Blog: Take a closer look at our new Gemini models for robotics.
  • THE DECODER: Google Deepmind unveils new AI models for robotic control
  • www.tomsguide.com: Google is putting it's Gemini 2.0 AI into robots — here's how it's going
  • Verdict: Google DeepMind unveils Gemini AI models for robotics
  • MarkTechPost: Google DeepMind’s Gemini Robotics: Unleashing Embodied AI with Zero-Shot Control and Enhanced Spatial Reasoning
  • LearnAI: Introducing Gemini Robotics, our Gemini 2.0-based model designed for robotics (published 12 March 2025, Carolina Parada).
  • OODAloop: Google DeepMind unveils new AI models for robotic control.
  • www.producthunt.com: Gemini Robotics
  • Last Week in AI: Last Week in AI #303 - Gemini Robotics, Gemma 3, CSM-1B
  • Windows Copilot News: Google is prepping Gemini to take action inside of apps
  • Last Week in AI: Discusses Gemini Robotics in the context of general AI agents and robotics.
  • www.infoq.com: Google DeepMind unveils Gemini Robotics, an advanced AI model for enhancing robotics through vision, language, and action.
  • AI & Machine Learning: This article discusses how generative AI is poised to revolutionize multiplayer games, offering personalized experiences through dynamic narratives and environments. The article specifically mentions Google's Gemini AI model and its potential to enhance gameplay.
  • Gradient Flow: This podcast episode discusses various advancements in AI, including Google's Gemini Robotics and Gemma 3, as well as the evolving regulatory landscape across different countries.
  • Insight Partners: This article highlights Andrew Ng's keynote at ScaleUp:AI '24, where he discusses the exciting trends in AI agents and applications, mentioning Google's Gemini AI assistant and its role in driving innovation.
  • www.tomsguide.com: You can now use Google Gemini without an account — here's how to get started

Tris Warkentin@The Official Google Blog //
Google AI has released Gemma 3, a new family of open-source AI models designed for efficient and on-device AI applications. Gemma 3 models are built with technology similar to Gemini 2.0, intended to run efficiently on a single GPU or TPU. The models are available in various sizes: 1B, 4B, 12B, and 27B parameters, with options for both pre-trained and instruction-tuned variants, allowing users to select the model that best fits their hardware and specific application needs.

Gemma 3 offers practical advantages in efficiency and portability. For example, the 27B version has demonstrated robust performance in evaluations while still being capable of running on a single GPU. The 4B, 12B, and 27B models can process both text and images, and support more than 140 languages. The models have a context window of 128,000 tokens, making them well suited for tasks that require processing large amounts of information. Google has built safety protocols into Gemma 3, including ShieldGemma 2, a safety checker for images.
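A back-of-envelope estimate shows why the single-GPU claim is plausible. Using the standard bytes-per-parameter figures (2 bytes at bf16, about 0.5 bytes at 4-bit quantization), the weights alone come out roughly as follows; activation and KV-cache memory come on top, so these are lower bounds.

```python
# Rough VRAM needed just for the weights of each Gemma 3 size.
# billions of params × bytes per param ≈ gigabytes of weights.
sizes_b = {"1B": 1, "4B": 4, "12B": 12, "27B": 27}
for name, params_b in sizes_b.items():
    bf16_gb = params_b * 2     # 2 bytes per parameter at bf16
    int4_gb = params_b * 0.5   # ~0.5 bytes per parameter at 4-bit
    print(f"{name}: ~{bf16_gb:.0f} GB bf16, ~{int4_gb:.1f} GB 4-bit")
```

By this estimate the 27B model needs about 54 GB at bf16, within reach of a single 80 GB-class accelerator, and roughly 13.5 GB when quantized to 4-bit.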

Recommended read:
References :
  • MarkTechPost: Google AI Releases Gemma 3: Lightweight Multimodal Open Models for Efficient and On‑Device AI
  • The Official Google Blog: Introducing Gemma 3: The most capable model you can run on a single GPU or TPU
  • AI News | VentureBeat: Google unveils open source Gemma 3 model with 128k context window
  • AI News: Details on the launch of Gemma 3 open AI models by Google.
  • The Verge: Google calls Gemma 3 the most powerful AI model you can run on one GPU
  • Maginative: Google DeepMind’s Gemma 3 Brings Multimodal AI, 128K Context Window, and More
  • TestingCatalog: Gemma 3 sets new benchmarks for open compact models with top score on LMarena
  • AI & Machine Learning: Announcing Gemma 3 on Vertex AI
  • Analytics Vidhya: Gemma 3 vs DeepSeek-R1: Is Google’s New 27B Model a Tough Competition to the 671B Giant?
  • AI & Machine Learning: How to deploy serverless AI with Gemma 3 on Cloud Run
  • The Tech Portal: Google rolls outs Gemma 3, its latest collection of lightweight AI models
  • eWEEK: Google’s Gemma 3: Does the ‘World’s Best Single-Accelerator Model’ Outperform DeepSeek-V3?
  • The Tech Basic: Gemma 3 by Google: Multilingual AI with Image and Video Analysis
  • Analytics Vidhya: Google’s Gemma 3: Features, Benchmarks, Performance and Implementation
  • InfoWorld: Google unveils Gemma 3 multi-modal AI models
  • www.zdnet.com: Google claims Gemma 3 reaches 98% of DeepSeek's accuracy - using only one GPU
  • AIwire: Google unveiled open-source Gemma 3, which is multimodal, comes in four sizes, and can handle more information and instructions thanks to a larger context window.
  • Ars OpenForum: Google’s new Gemma 3 AI model is optimized to run on a single GPU
  • THE DECODER: Google DeepMind has unveiled Gemma 3, a new generation of open AI models designed to deliver high performance with a relatively small footprint, making them suitable for running on individual GPUs or TPUs.
  • Gradient Flow: Gemma 3: What You Need To Know
  • Interconnects: Gemma 3, OLMo 2 32B, and the growing potential of open-source AI
  • OODAloop: Gemma 3, Google's newest lightweight, open-source AI model, is designed for multimodal tasks and efficient deployment on various devices.
  • NVIDIA Technical Blog: Google has released lightweight, multimodal, multilingual models called Gemma 3. The models are designed to run efficiently on phones and laptops.
  • LessWrong: Google DeepMind has unveiled Gemma 3, a new generation of open AI models designed to deliver high performance with a relatively small footprint, making them suitable for running on individual GPUs or TPUs.

ChinaTechNews.com Staff@ChinaTechNews.com //
Clone Robotics, a Polish startup, has unveiled Protoclone, a human-like robot designed with synthetic muscles and a skeletal structure mimicking human anatomy. The Protoclone is described as a faceless android with over 200 degrees of freedom, more than 1,000 artificial muscle fibers (Myofibers), and 500 sensors. According to the company, Protoclone operates with systems that closely replicate human muscular, skeletal, vascular, and nervous systems, even featuring 206 bones made from polymers.

Reactions to the Protoclone have been mixed, with some social media users expressing unease over its lifelike movements and appearance. Clone Robotics plans to make the android available for preorder later this year, and envisions the robot performing household chores in the future. The current Protoclone uses a pneumatic system to control its artificial muscles, but the final product will use hydraulics. The robot navigates using a system of sensors linked to four cameras.

Recommended read:
References :
  • arstechnica.com: Dangling, twitching human robot with synthetic muscles makes its debut
  • readwrite.com: World’s first bipedal musculoskeletal android, Protoclone, unveiled by Clone Robotics
  • www.rdworldonline.com: Protoclone V1: 1000 artificial muscles power this sweating robot’s human-like moves
  • End Time Headlines: World-first humanoid robot ‘Protoclone’ with ‘muscles & bones’ twitches & spasms into life
  • www.computerworld.com: The ‘Protoclone’ robot has synthetic muscles — and moves like a human
  • Future Leap: World’s Most Human-Like Robot Is Here—And It’s Creeping Everyone Out
  • NextBigFuture.com: Poland Clone Robotics Make Liquid Muscled Synthetic Copy of the Human Body
  • poliverso.org: Poland Clone Robotics Make Liquid Muscled Synthetic Copy of the Human Body

@www.therobotreport.com //
Figure AI has unveiled Helix, a new AI system designed to power humanoid robots with advanced movement capabilities. Helix aims to enable robots to perform complex tasks and handle unfamiliar objects through voice commands. The system combines a 7-billion-parameter multimodal language model, which acts as the robot's "brain" by processing speech and visual data, with an 80-million-parameter AI that translates these instructions into precise movements.

Helix can control 35 degrees of freedom, from individual finger movements to torso control. Demonstrations have showcased robots responding to voice commands, identifying objects, and grasping them accurately, even collaborating to place food items into a refrigerator without specific prior training. The system required only 500 hours of training data and runs on embedded GPUs within the robots.
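The two-model split described above (a large, slow "brain" and a small, fast controller) can be sketched as a two-rate control loop. Nothing here is Figure's actual API; the functions and values are toy stand-ins.

```python
import math

# Illustrative two-system split: a slow, large model sets a high-level
# latent command; a small, fast policy converts it into per-joint
# targets at control rate. All names and numbers are toy values.
NUM_JOINTS = 35  # degrees of freedom quoted above

def slow_brain(instruction):
    # Stand-in for the 7B vision-language model: runs infrequently,
    # returns a compact latent command for the given instruction.
    return [0.1] * 8

def fast_policy(latent, t):
    # Stand-in for the 80M-parameter controller: runs every tick,
    # maps the latent into one target per joint.
    bias = sum(latent)
    return [math.sin(t) * 0.01 + bias for _ in range(NUM_JOINTS)]

latent = slow_brain("place the apple in the fridge")
targets = fast_policy(latent, t=0.0)
print(len(targets))  # one target per degree of freedom
```

The design point this illustrates is the division of labor: the expensive model runs rarely to interpret speech and vision, while the cheap model runs continuously to keep the joints moving smoothly.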

Figure AI is also reportedly seeking $1.5 billion in funding, potentially valuing the company at $39.5 billion. This comes after the company ended its collaboration with OpenAI on robot-specific AI models, though OpenAI remains a significant investor. Figure CEO Brett Adcock believes Helix is crucial for scaling robots in household settings, allowing them to adapt to new situations without requiring constant reprogramming.

@www.reuters.com //
Meta is expanding its artificial intelligence research into the realm of humanoid robotics, aiming to develop AI-driven software and sensors. This initiative focuses on creating intelligent machines that can interact with the physical world, potentially powering consumer robots. The company's efforts are concentrated on "embodied AI," which combines intelligence with real-world interactions, enabling robots to move, sense, and make decisions in three-dimensional environments.

Meta is not initially planning to release its own branded robots. Instead, the company is concentrating on developing AI-powered software and sensor technology that can be utilized by other robotics manufacturers. This strategy positions Meta alongside tech giants like Tesla, Apple, and Google, all of which are also investing in the robotics sector. Meta is also prioritizing user data protection by using source code analysis to detect and prevent unauthorized data scraping across its platforms, including Facebook, Instagram, and Reality Labs.


@oodaloop.com //
References: OODAloop, TechCrunch
Symbotic, a robotics firm, has announced it is acquiring Walmart's automation division for an initial payment of $200 million, with the potential for an additional $350 million based on the deal's performance. The acquisition solidifies Symbotic's role as a key technology supplier to Walmart and significantly expands its presence in the automation market. Under the agreement, Walmart will purchase automation systems from Symbotic, which avoids the prospect of Walmart developing rival automation solutions in-house. The partnership is not just an acquisition: it is expected to provide Symbotic with immediate revenue and to strengthen an already established relationship with the retail giant.

The deal will see Symbotic take control of automating Walmart's pickup and delivery centers, with Walmart funding a development program worth $520 million to support the technology; this includes the initial $200 million, with the remainder in future payments. Walmart aims to improve customer experience through quicker, more efficient service with the implementation of Symbotic's automated systems. The strategic move builds on a relationship that began in 2017 and positions Symbotic to enhance Walmart's in-store Accelerated Pickup and Delivery capabilities. The transaction is projected to close in the second quarter of 2025.

Recommended read:
References :
  • OODAloop: Walmart Sells Robotics Business to Symbotic in Play to Further Automate Systems
  • TechCrunch: Symbotic set to take over Walmart’s robotics business
  • Quartz: Symbotic, a robotics company, has acquired Walmart's automation division, signaling a significant investment in automation by Walmart and a major win for Symbotic.

@www.digitimes.com //
Nvidia is significantly advancing its role in humanoid robotics through its Jetson Thor platform, aiming to replicate its AI chip success. CEO Jensen Huang has identified humanoid robots as a key area for large-scale production. To further this goal, Nvidia is releasing the Isaac GR00T Blueprint, which helps generate training data for humanoid robots. The blueprint leverages workflows for synthetic data creation and simulation, and includes tools such as GR00T-Teleop, GR00T-Mimic, and GR00T-Gen to generate large training datasets from which robots learn imitation skills. The company is also releasing a collection of robot foundation models and data pipelines to accelerate development.

Foxconn is partnering with Nvidia to develop humanoid robot services, integrating Nvidia's advanced software and hardware in Kaohsiung City, Taiwan. This partnership marks a significant move for Foxconn to diversify beyond contract manufacturing in electronics, with plans to collaborate on smart city projects. Additionally, Taiwan is building a supercomputer based on NVIDIA’s Blackwell architecture, further emphasizing the island's commitment to AI advancement and robotic capabilities. NVIDIA's Jetson Thor computing system, debuting in early 2025, will be central to this endeavor. The company's actions highlight a broader trend towards embodied AI.

Recommended read:
References :
  • blogs.nvidia.com: Over the next two decades, the market for humanoid robots is expected to reach $38 billion.
  • analyticsindiamag.com: Foxconn, NVIDIA Partner to Develop Humanoid Robot Service
  • www.digitimes.com: Nvidia's robot computing platform Jetson Thor aims to replicate AI chip success story

@analyticsindiamag.com //
Nvidia is aggressively expanding its presence in the robotics sector, highlighted by the development of its Jetson Thor computing platform. This platform is designed to replicate Nvidia’s success in AI chips and is a core component of their strategy to develop advanced humanoid robots. Furthermore, Nvidia is not working alone in this endeavor. They have partnered with Foxconn to create humanoid robots, aiming to move beyond just manufacturing and integrate into new tech areas. This strategic move demonstrates Nvidia’s focus on becoming a dominant player in AI-driven robotics, specifically for humanoid technology.

Nvidia is also addressing the challenge of training these robots through its Isaac GR00T Blueprint, unveiled at CES. The blueprint uses synthetic data generation to create the extensive datasets needed for imitation learning, allowing robots to mimic human actions. A new workflow uses Apple Vision Pro to capture human actions in a digital twin; the data then feeds into the Isaac Lab framework, which teaches robots to move and interact safely. Nvidia's Cosmos platform also contributes by generating physics-aware videos used to train robots. CEO Jensen Huang emphasizes humanoid robots as the next big leap in AI innovation, aiming to establish Nvidia as a key player in the future of robotics and autonomous systems.

Recommended read:
References :
  • www.digitimes.com: Nvidia's robot computing platform Jetson Thor aims to replicate AI chip success story
  • analyticsindiamag.com: Foxconn, NVIDIA Partner to Develop Humanoid Robot Service
  • blogs.nvidia.com: NVIDIA Announces Isaac GR00T Blueprint to Accelerate Humanoid Robotics Development