Michael Nuñez@venturebeat.com
//
Nvidia unveiled new robot-training and AI products at its GTC conference. A key announcement was Cosmos-Transfer1, an AI model that generates photorealistic simulations for training robots and autonomous vehicles. The model narrows the gap between virtual training environments and real-world applications by using multimodal inputs to create highly realistic simulations. Its adaptive multimodal control lets developers weight different visual inputs, such as depth or object boundaries, to improve the realism and utility of the generated environments.
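To make the idea of weighting visual inputs concrete, here is a minimal sketch in plain Python. It is not Nvidia's API or the Cosmos-Transfer1 implementation: the function name `blend_control_maps`, the modality names, and the tiny grids are all hypothetical, and it assumes that "weighting" means a per-pixel weighted average in which the weights are normalized to sum to 1 at each location.

```python
from typing import Dict, List

Grid = List[List[float]]  # a tiny 2D map, one value per pixel


def blend_control_maps(maps: Dict[str, Grid], weights: Dict[str, Grid]) -> Grid:
    """Blend per-modality maps using per-pixel weights normalized to sum to 1.

    Hypothetical illustration only; the real model's conditioning is not
    described in the article.
    """
    names = list(maps)
    rows, cols = len(maps[names[0]]), len(maps[names[0]][0])
    blended = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = sum(weights[n][r][c] for n in names)
            if total == 0:
                continue  # no modality has influence at this pixel
            blended[r][c] = sum(
                weights[n][r][c] / total * maps[n][r][c] for n in names
            )
    return blended


# Hypothetical 1x2 inputs: the left pixel leans on depth, the right on edges.
maps = {"depth": [[0.2, 0.8]], "edge": [[1.0, 0.0]]}
weights = {"depth": [[3.0, 0.0]], "edge": [[1.0, 2.0]]}
result = blend_control_maps(maps, weights)
```

In this toy case the left pixel is a 3:1 mix of depth and edge values, while the right pixel is driven entirely by the edge map because the depth weight there is zero.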
Nvidia also introduced its next-generation GPU superchips, including the second generation of the Grace Blackwell chip and the Vera Rubin, which is expected in the second half of 2026. The Vera Rubin will feature 288GB of fourth-generation high-bandwidth memory (HBM4) and will be paired with CPUs that have 88 custom Arm cores. The new chips promise substantial increases in compute capacity, with Rubin said to deliver a 900x speedup over the earlier Hopper chips. That positions Nvidia to meet the growing demands of generative AI workloads, including training massive AI models and running inference.
@Google DeepMind Blog
//
Google is extending its Gemini AI models into robotics. Gemini Robotics, an advanced vision-language-action model, enables robots to perform physical tasks with improved generalization, adaptability, and dexterity. The model interprets and acts on text, voice, and image data. Gemini Robotics-ER adds embodied reasoning capabilities, a further step toward smarter, more adaptable robots.
Google's approach to robotics emphasizes safety, employing both physical and semantic safety systems. Separately, the company is inviting filmmakers and creators to experiment with its Veo video model to help improve its design and development. Veo builds on years of generative video model work, including Generative Query Network (GQN), DVD-GAN, Imagen-Video, Phenaki, WALT, VideoPoet, and Lumiere, combining architecture, scaling laws, and other novel techniques to improve quality and output resolution.