News from the AI & ML world

DeeperML - #nvidia

@blogs.nvidia.com //
References: NVIDIA Blog, TechNode
NVIDIA CEO Jensen Huang captivated an audience of over 4,000 at COMPUTEX 2025 in Taipei, outlining a vision of a technology revolution driven by artificial intelligence. Huang declared that AI is becoming fundamental infrastructure, akin to electricity and the internet, and envisioned an AI infrastructure industry worth trillions of dollars. To support this growing demand, he detailed NVIDIA's latest innovations, from Grace Blackwell NVL72 systems to advanced networking technology, describing them as "AI factories" that apply energy to produce valuable tokens.

Huang emphasized the collaborative nature of this AI revolution, highlighting NVIDIA's partners across Taiwan and the world. He showcased NVIDIA's CUDA-X platform and its widespread applications across industries, including building 6G networks with AI and advancing quantum supercomputing. He stressed that a larger installed base leads to more developers, more libraries, and ultimately more capable applications and benefits for users. Huang also presented concepts like agentic AI and physical AI, illustrating AI's growing ability to reason, perceive, and understand the world, ultimately leading to general robotics.

In a strategic move to maintain its dominance in the AI chip market, NVIDIA is reportedly planning to establish a new R&D center in Shanghai. This initiative aims to navigate growing sales challenges in China and tailor solutions for Chinese clients while also contributing to global research and development efforts. The center will focus on areas such as chip design verification, product optimization, and autonomous driving technologies. This plan was reportedly proposed by CEO Jensen Huang during a recent meeting with Shanghai Mayor Gong Zheng, and the Shanghai government has apparently expressed initial support for the project. NVIDIA currently employs about 2,000 people in Shanghai, primarily in sales and support roles.

Recommended read:
References :
  • NVIDIA Blog: NVIDIA CEO Jensen Huang took the stage at a packed Taipei Music Center Monday to kick off COMPUTEX 2025, captivating the audience of more than 4,000 with a vision for a technology revolution that will sweep every country
  • TechNode: NVIDIA is planning to establish a new R&D center in Shanghai in an effort to sustain its leadership in AI chips amid growing sales challenges in China.

@thetechbasic.com //
Nvidia CEO Jensen Huang has denied allegations of AI chip diversion to China, stating there is no evidence of any illegal flow of hardware. Speaking in Taipei, Huang addressed global concerns regarding the transfer of advanced technology to restricted markets. He emphasized the impracticality of covertly shipping Nvidia’s large and complex systems, such as the Grace Blackwell, which weighs nearly two tons and comprises numerous high-powered GPUs and processors.

Huang highlighted the strict oversight by Nvidia's customers, which include major tech giants like Microsoft, Amazon, Alphabet, and Meta Platforms. These customers are fully aware of and compliant with US export controls, as losing access to Nvidia's technology would be a significant setback for them. The CEO underscored the importance of trust and transparency in Nvidia's supply chain, emphasizing the value customers place on maintaining access to Nvidia's cutting-edge AI solutions.

In other news, Nvidia is reportedly planning to establish a new research and development center in Shanghai to further solidify its position in the AI chip market. This initiative aims to tailor solutions specifically for Chinese clients while also contributing to global R&D efforts in areas such as chip design verification, product optimization, and autonomous driving technologies. The Shanghai government has reportedly expressed initial support for this project, which would expand Nvidia's presence in the city beyond its current sales and support operations.

Recommended read:
References :
  • thetechbasic.com: No Signs of AI Chip Diversion to China Says Nvidia CEO Amid Global Concerns
  • Bloomberg Technology: Nvidia CEO Sees No Evidence of AI Chip Diversion Into China

staff@insideAI News //
Saudi Arabia is making major strides in artificial intelligence, unveiling deals with several leading U.S. technology firms including NVIDIA, AMD, Cisco, and Amazon Web Services. The partnerships run primarily through HUMAIN, the AI subsidiary of Saudi Arabia's Public Investment Fund (PIF), which controls about $940 billion in assets. Crown Prince Mohammed bin Salman launched HUMAIN to establish the kingdom as a global leader in artificial intelligence, in line with the Vision 2030 plan to diversify the economy and reduce dependence on oil revenues.

NVIDIA has partnered with HUMAIN to construct AI factories in Saudi Arabia with a projected capacity of up to 500 megawatts, underscoring HUMAIN's mission to position the kingdom as an international AI powerhouse. The initial phase includes the deployment of 18,000 NVIDIA GB300 Grace Blackwell chips with NVIDIA InfiniBand networking. AMD has also signed an agreement with HUMAIN under which the parties will invest up to $10 billion to deploy 500 megawatts of AI compute capacity over the next five years.

In addition to chip manufacturers, networking and cloud service providers are also involved. Cisco will partner with HUMAIN to power AI infrastructure and ecosystem growth, with new investments in research, talent, and digital skills. Amazon Web Services (AWS) and HUMAIN plan to invest over $5 billion to build an "AI Zone" in the kingdom, incorporating dedicated AWS AI infrastructure and services. These efforts are supported by the U.S. government easing AI chip export rules for Gulf states, which had previously limited those countries' access to high-end AI chips.

Recommended read:
References :
  • insideAI News: Saudi Arabia Unveils AI Deals with NVIDIA, AMD, Cisco, AWS
  • THE DECODER: Saudi Arabia founds AI company "Humain" - US relaxes chip export rules for Gulf states
  • the-decoder.com: Saudi Arabia founds AI company "Humain" - US relaxes chip export rules for Gulf states
  • www.theguardian.com: Reports on deals by US tech firms, including Nvidia and Cisco, to expand AI capabilities in Saudi Arabia and the UAE.
  • Maginative: Saudi Arabia’s Crown Prince Mohammed bin Salman has launched ‘Humain’, a state-backed AI company aimed at establishing the kingdom as a global leader in artificial intelligence, coinciding with a major investment forum attracting top U.S. tech executives.
  • Analytics India Magazine: NVIDIA to Deploy 18,000 Chips for AI Data Centres in Saudi Arabia.
  • insidehpc.com: NVIDIA announced a partnership with HUMAIN, the AI subsidiary of Saudi Arabia’s Public Investment Fund, to build AI factories in the kingdom. HUMAIN said the partnership will develop a projected capacity of up to 500 megawatts powered by several hundred thousand of ....
  • insidehpc.com: NVIDIA in Partnership to Build AI Factories in Saudi Arabia
  • www.nextplatform.com: Saudi Arabia Has The Wealth – And Desire – To Become An AI Player
  • THE DECODER: Nvidia will supply advanced chips for Saudi Arabia’s Humain AI project
  • MarkTechPost: NVIDIA AI Introduces Audio-SDS: A Unified Diffusion-Based Framework for Prompt-Guided Audio Synthesis and Source Separation without Specialized Datasets
  • www.artificialintelligence-news.com: Saudi Arabia’s new state subsidiary, HUMAIN, is collaborating with NVIDIA to build AI infrastructure, nurture talent, and launch large-scale digital systems.
  • techxplore.com: Saudi Arabia has big AI ambitions. They could come at the cost of human rights

@developer.nvidia.com //
NVIDIA is making strides in accelerating scientific research and adapting to changing global regulations. The company is focusing on battery innovation through the development of specialized Large Language Models (LLMs) with advanced reasoning capabilities. These models, exemplified by SES AI's Molecular Universe LLM, a 70B parameter model, are designed to overcome the limitations of general-purpose LLMs by incorporating domain-specific knowledge and terminology. This approach significantly enhances performance in specialized fields, enabling tasks such as hypothesis generation, chain-of-thought reasoning, and self-correction, which are critical for driving material exploration and boosting expert productivity.

NVIDIA is also navigating export control rules by preparing a cut-down version of its HGX H20 AI processor for the Chinese market. This strategic move aims to maintain access to this crucial market while adhering to updated U.S. export regulations that effectively barred the original version. The downgraded AI GPU will feature reduced HBM memory capacity to comply with the newly imposed technical limits. This adjustment ensures that NVIDIA remains within the permissible thresholds set by the U.S. government, reflecting the company's commitment to complying with international trade laws while continuing to serve its global customer base.

In addition to its work on battery research and regulatory compliance, NVIDIA has introduced Audio-SDS, a unified diffusion-based framework for prompt-guided audio synthesis and source separation. This innovative framework leverages a single pretrained model to perform various audio tasks without requiring specialized datasets. By adapting Score Distillation Sampling (SDS) to audio diffusion, NVIDIA is enabling the optimization of parametric audio representations, uniting signal-processing interpretability with the flexibility of modern diffusion-based generation. This technology promises to advance audio synthesis and source separation by integrating data-driven priors with explicit parameter control, producing perceptually compelling results.
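To make the Score Distillation Sampling adaptation concrete, here is a minimal sketch of an SDS-style update loop for a parametric audio representation. This is not NVIDIA's Audio-SDS code: the frozen DummyDenoiser is a stand-in for a pretrained, prompt-conditioned audio diffusion model, and the learnable raw waveform is a stand-in for the interpretable synthesizer parameters the framework optimizes.

```python
import torch

# Standard DDPM noise schedule (an assumption; Audio-SDS may differ).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_cum = torch.cumprod(1.0 - betas, dim=0)

class DummyDenoiser(torch.nn.Module):
    """Placeholder for a frozen, pretrained audio diffusion model
    that predicts the noise added at timestep t."""
    def __init__(self, n=16000):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(n, 256), torch.nn.SiLU(), torch.nn.Linear(256, n))
    def forward(self, x_t, t):
        return self.net(x_t)  # eps_hat; a real model also takes the prompt

denoiser = DummyDenoiser().eval()
for p in denoiser.parameters():
    p.requires_grad_(False)

# Parametric audio: a learnable 1-second waveform here; in Audio-SDS this
# would be synthesizer parameters rendered to audio each step.
theta = torch.nn.Parameter(0.01 * torch.randn(1, 16000))
opt = torch.optim.Adam([theta], lr=1e-2)

for step in range(100):
    x = theta                                  # "render" audio from params
    t = torch.randint(0, T, (1,))
    a = alphas_cum[t].sqrt().view(-1, 1)
    s = (1.0 - alphas_cum[t]).sqrt().view(-1, 1)
    eps = torch.randn_like(x)
    x_t = a * x + s * eps                      # diffuse the rendered audio
    eps_hat = denoiser(x_t, t)
    # SDS: push x toward the model's denoising direction; the residual is
    # detached so the gradient flows into theta only through x.
    loss = ((eps_hat - eps).detach() * x).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```

The key design point is that the residual (eps_hat minus eps) is detached, so gradients reach the parameters only through the rendered audio, which is what lets a single pretrained diffusion model steer any differentiable audio representation.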

Recommended read:
References :
  • developer.nvidia.com: Scientific research in complex fields like battery innovation is often slowed by manual evaluation of materials, limiting progress to just dozens of candidates...
  • www.tomshardware.com: Nvidia plans to launch a downgraded HGX H20 AI processor with reduced HBM memory capacity for China by July to comply with new U.S. export rules, if a new rumor is correct.
  • www.marktechpost.com: Audio diffusion models have achieved high-quality speech, music, and Foley sound synthesis, yet they predominantly excel at sample generation rather than parameter optimization.

@blogs.nvidia.com //
Cadence has unveiled the Millennium M2000 Supercomputer, a powerhouse featuring NVIDIA Blackwell systems, aimed at revolutionizing AI-driven engineering design and scientific simulations. This supercomputer integrates NVIDIA HGX B200 systems and NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, coupled with NVIDIA CUDA-X software libraries and Cadence's optimized software. The result is a system capable of delivering up to 80 times higher performance compared to its CPU-based predecessors, marking a significant leap forward in computational capability for electronic design automation, system design, and life sciences workloads.

This collaboration between Cadence and NVIDIA is set to enable engineers to conduct massive simulations, leading to breakthroughs in various fields, including the design and development of autonomous machines, drug molecules, semiconductors, and data centers. NVIDIA's founder and CEO, Jensen Huang, highlighted the transformative potential of AI, stating that it will infuse every aspect of business and product development. Huang also announced NVIDIA's plans to acquire ten Millennium Supercomputer systems based on the NVIDIA GB200 NVL72 platform to accelerate the company’s chip design workflows, emphasizing the importance of this technology for NVIDIA's future endeavors.

The Millennium M2000 harnesses accelerated software from NVIDIA and Cadence for applications including circuit simulation, computational fluid dynamics, data center design, and molecular design. In related news, the open-source OpenSearch project has released version 3.0, which adds GPU acceleration for AI workloads through its new OpenSearch Vector Engine. The update leverages NVIDIA GPUs to speed up large-scale vector search and cut index build times, addressing scalability issues common in vector databases. OpenSearch 3.0 also supports Anthropic PBC's Model Context Protocol, facilitating the integration of large language models with external data.
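For readers who want to kick the tires on the new vector engine, a minimal sketch with the opensearch-py client is below. The host, index name, and 384-dimensional embeddings are placeholders; the GPU-accelerated index builds in 3.0 happen server-side, so the client-side calls are the standard k-NN API.

```python
from opensearchpy import OpenSearch

# Placeholder connection details for a local OpenSearch 3.0 node.
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Create an index with a k-NN vector field (dimension is illustrative).
client.indices.create(
    index="docs",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {"type": "knn_vector", "dimension": 384},
                "text": {"type": "text"},
            }
        },
    },
)

# Index one document, then run an approximate nearest-neighbor query.
client.index(index="docs",
             body={"embedding": [0.1] * 384, "text": "hello world"},
             refresh=True)
hits = client.search(index="docs", body={
    "size": 5,
    "query": {"knn": {"embedding": {"vector": [0.1] * 384, "k": 5}}},
})
print(hits["hits"]["hits"][0]["_source"]["text"])
```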

Recommended read:
References :
  • NVIDIA Newsroom: Cadence Taps NVIDIA Blackwell to Accelerate AI-Driven Engineering Design and Scientific Simulation
  • www.networkworld.com: Cadence debuts Nvidia-powered supercomputer to accelerate enterprise engineering, biotech
  • insidehpc.com: Cadence Unveils Millennium M2000 Supercomputer with NVIDIA Blackwell Systems
  • Ken Yeung: ServiceNow and Nvidia Debut Apriel Nemotron 15B, an Open-Source Reasoning Model Built for Faster, Cheaper Agentic AI

Coen van@Techzine Global //
ServiceNow has announced the launch of AI Control Tower, a centralized control center designed to manage, secure, and optimize AI agents, models, and workflows across an organization. Unveiled at Knowledge 2025 in Las Vegas, this platform provides a holistic view of the entire AI ecosystem, enabling enterprises to monitor and manage both ServiceNow and third-party AI agents from a single location. The AI Control Tower aims to address the growing complexity of managing AI deployments, giving users a central point to see all AI systems, their deployment status, and ensuring governance and understanding of their activities.

The AI Control Tower offers enterprise-wide AI visibility, built-in compliance and AI governance, end-to-end lifecycle management of agentic processes, real-time reporting, and improved alignment. It is designed to help AI systems administrators and other stakeholders monitor and manage every AI agent, model, or workflow in their environment, and its view of systems by provider and type improves risk and compliance management.

In addition to the AI Control Tower, ServiceNow introduced AI Agent Fabric, facilitating communication between AI agents and partner integrations. ServiceNow has also partnered with NVIDIA to engineer an open-source model, Apriel Nemotron 15B, designed to drive advancements in enterprise large language models (LLMs) and power AI agents that support various enterprise workflows. The Apriel Nemotron 15B, developed using NVIDIA NeMo and ServiceNow domain-specific data, is engineered for reasoning, drawing inferences, weighing goals, and navigating rules in real time, making it efficient and scalable for concurrent enterprise workflows.
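Since Apriel Nemotron 15B is released openly, loading it should follow the generic Hugging Face transformers pattern. A hedged sketch, assuming the repository id matches the "Apriel-Nemotron-15b-Thinker" name in the references below; verify the exact id before use:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, inferred from the references below; not confirmed here.
model_id = "ServiceNow-AI/Apriel-Nemotron-15b-Thinker"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user",
             "content": "A customer's order is stuck in 'pending'. "
                        "Reason step by step about likely causes."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=256)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```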

Recommended read:
References :
  • thenewstack.io: Given that ServiceNow is, at its core, all about automating workflows for enterprises, it’s no surprise that
  • AI News | VentureBeat: ServiceNow also announced a way for agents to communicate with others along with its new observability platform.
  • Techzine Global: During Knowledge 2025 , ServiceNow launched AI Control Tower, a centralized control center for managing, securing, and optimizing AI agents, models, and workflows.
  • NVIDIA Blog: Your Service Teams Just Got a New Coworker — and It’s a 15B-Parameter Super Genius Built by ServiceNow and NVIDIA
  • www.zdnet.com: ServiceNow and Nvidia's new reasoning AI model raises the bar for enterprise AI agents
  • www.networkworld.com: ServiceNow unveiled a centralized command center the company says will enable enterprise customers to govern, manage, and secure AI agents from ServiceNow and other third-parties from a unified platform.
  • www.computerworld.com: Nvidia and ServiceNow have created an AI model that can help companies create learning AI agents to automate corporate workloads. The open-source Apriel model, available generally in the second quarter on HuggingFace, will help create AI agents that can make decisions around IT, human resources and customer-service functions.
  • blogs.nvidia.com: ServiceNow is accelerating enterprise AI with a new reasoning model built in partnership with NVIDIA — enabling AI agents that respond in real time, handle complex workflows and scale functions like IT, HR and customer service teams worldwide.
  • NVIDIA Newsroom: ServiceNow is accelerating enterprise AI with a new reasoning model built in partnership with NVIDIA — enabling AI agents that respond in real time, handle complex workflows and scale functions like IT, HR and customer service teams worldwide.
  • techstrong.ai: ServiceNow Inc. kicked off its annual artificial intelligence (AI) conference in Las Vegas Tuesday as it has in previous years -- with a fusillade of product announcements, partnerships and customer stories.
  • techstrong.ai: ServiceNow’s New AI Control Tower Commands AI Agents
  • Ken Yeung: ServiceNow Debuts AI Control Tower to Manage the Chaos of Enterprise AI Agents
  • Ken Yeung: ServiceNow and Nvidia have had a long-standing partnership building generative AI solutions for the enterprise. This week, at ServiceNow’s Knowledge customer conference, the two are introducing the latest fruits of their labor, a new large language model called Apriel Nemotron 15B with reasoning capabilities.
  • CIO Dive - Latest News: ServiceNow, Nvidia develop LLM to fuel enterprise agents
  • AI News: ServiceNow bets on unified AI to untangle enterprise complexity
  • www.artificialintelligence-news.com: ServiceNow bets on unified AI to untangle enterprise complexity
  • www.marktechpost.com: ServiceNow AI Released Apriel-Nemotron-15b-Thinker: A Compact Yet Powerful Reasoning Model Optimized for Enterprise-Scale Deployment and Efficiency

@venturebeat.com //
Nvidia has launched Parakeet-TDT-0.6B-V2, a fully open-source transcription AI model, on Hugging Face, setting a new standard for automatic speech recognition (ASR). The model, with 600 million parameters, has quickly topped the Hugging Face Open ASR Leaderboard with a word error rate of just 6.05%. That level of accuracy puts it near proprietary transcription models such as OpenAI's GPT-4o-transcribe and ElevenLabs Scribe, making it a significant advancement in open-source speech AI. Parakeet is released under the commercially permissive CC-BY-4.0 license.

The speed of Parakeet-TDT-0.6B-V2 is a standout feature. According to Hugging Face’s Vaibhav Srivastav, it can "transcribe 60 minutes of audio in 1 second." Nvidia reports this is achieved with a real-time factor of 3386, meaning it processes audio 3386 times faster than real-time when running on Nvidia's GPU-accelerated hardware. This speed is attributed to its transformer-based architecture, fine-tuned with high-quality transcription data and optimized for inference on NVIDIA hardware using TensorRT and FP8 quantization. The model also supports punctuation, capitalization, and detailed word-level timestamping.
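Because the model ships openly on Hugging Face, trying it locally is a few lines with the NVIDIA NeMo toolkit. A minimal sketch, assuming the Hugging Face id mirrors the model name above and that your NeMo build exposes the timestamps flag:

```python
# pip install -U "nemo_toolkit[asr]"  (a GPU is strongly recommended)
import nemo.collections.asr as nemo_asr

# Assumed HF id, based on the model name in this story.
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="nvidia/parakeet-tdt-0.6b-v2")

# Returns hypotheses with punctuated, capitalized text; timestamps=True
# additionally requests word-level timings in recent NeMo releases.
results = asr_model.transcribe(["meeting.wav"], timestamps=True)
print(results[0].text)
```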

Parakeet-TDT-0.6B-V2 is aimed at developers, researchers, and industry teams building various applications. This includes transcription services, voice assistants, subtitle generators, and conversational AI platforms. Its accessibility and performance make it an attractive option for commercial enterprises and indie developers looking to build speech recognition and transcription services into their applications. With its release on May 1, 2025, Parakeet is set to make a considerable impact on the field of speech AI.

Recommended read:
References :
  • Techmeme: Nvidia launches open-source transcription model Parakeet-TDT-0.6B-V2, topping the Hugging Face Open ASR Leaderboard with a word error rate of 6.05% (Carl Franzen/VentureBeat)
  • venturebeat.com: An attractive proposition for commercial enterprises and indie developers looking to build speech recognition and transcription services...
  • www.marktechpost.com: NVIDIA Open Sources Parakeet TDT 0.6B: Achieving a New Standard for Automatic Speech Recognition ASR and Transcribes an Hour of Audio in One Second
  • AI News | VentureBeat: Reports Nvidia launches fully open source transcription AI model Parakeet-TDT-0.6B-V2 on Hugging Face
  • www.eweek.com: NVIDIA’s AI Transcription Tool Produces 60 Minutes of Text in 1 Second
  • eWEEK: NVIDIA has released a new version of its Parakeet transcription tool, boasting the lowest error rate of any of its competitors. In addition, the company made the code public on GitHub. Parakeet TDT 0.6B is a 600-million-parameter automatic speech recognition model. It can transcribe 60 minutes of audio per second, Hugging Face data scientist Vaibhav […]

Anton Shilov@tomshardware.com //
Nvidia's CEO, Jensen Huang, has stated that China is rapidly catching up to the U.S. in artificial intelligence capabilities. Huang emphasized that China isn't far behind, particularly in AI hardware development, where companies like Huawei are making significant strides. Huawei's advancements, including its Ascend 900-series AI accelerators and CloudMatrix 384 systems, demonstrate China's growing competitiveness. The CloudMatrix 384, featuring 384 dual-chiplet HiSilicon Ascend 910C interconnected via an optical mesh network, offers impressive computing power and memory bandwidth, rivaling Nvidia's offerings, though with lower efficiency. Huang acknowledged Huawei as a formidable technology company with incredible computing, networking, and software capabilities essential for advancing AI.

New tools are emerging to empower artists to harness the power of AI in image generation. NVIDIA has introduced a 3D Guided Generative AI Blueprint that provides a workflow enabling artists to precisely control object placement and camera angles using a 3D scene in Blender. This tackles the common challenge of achieving the desired composition and layout in AI-generated images. This AI Blueprint is pre-optimized for NVIDIA and GeForce RTX GPUs, built on NVIDIA NIM microservices to maximize AI model performance. The process involves converting a 3D viewport to a depth map, which guides the image generator (FLUX.1-dev) along with a user prompt.

For those looking to enter the AI job market, NVIDIA experts share several key tips. A diverse educational and professional background can be a valuable asset, enabling adaptability in the rapidly evolving AI field. Integrating AI into daily workflows, regardless of one's background, can help individuals stand out. It's also crucial to identify your passions within AI and gain experience in relevant domains such as autonomous vehicles, robotics, gaming, or healthcare. By aligning skills with specific AI applications, candidates can better position themselves for success.

Recommended read:
References :
  • www.tomshardware.com: Nvidia's CEO says China is not far behind the U.S. in AI capabilities
  • blogs.nvidia.com: As AI use cases continue to expand — from document summarization to custom software agents — developers and enthusiasts are seeking faster, more flexible ways to run large language models (LLMs). Running models locally on PCs with NVIDIA GeForce RTX GPUs enables high-performance inference, enhanced data privacy and full control over AI deployment and integration.

Anton Shilov@tomshardware.com //
Nvidia CEO Jensen Huang has expressed concern about the rising competition from Huawei in the artificial intelligence hardware sector. Huang admitted that he is fearful of Huawei, acknowledging the company's significant progress in computing, networking technology, and software capabilities, all essential for advancing AI. He noted that China is not far behind the U.S. in AI capabilities, almost on par, particularly in AI hardware development. Huang's comments came during the Hill and Valley Forum, where business leaders and lawmakers discussed technology and national security.

China's advancements in AI hardware are driven by numerous companies, with Huawei leading the pack. Huawei's AI strategy encompasses everything from its Ascend 900-series AI accelerators to servers and rack-scale solutions for cloud data centers. The company recently unveiled CloudMatrix 384, a system packing 384 dual-chiplet HiSilicon Ascend 910C processors interconnected using a fully optical mesh network. Huawei has already sold over ten CloudMatrix 384 systems to Chinese customers, indicating a growing interest in domestic alternatives to Nvidia hardware.

The CloudMatrix 384 system spans 16 racks and achieves roughly 300 PFLOPs of dense BF16 compute, nearly double Nvidia's GB200 NVL72. While it offers superior memory bandwidth and HBM capacity, it consumes more power per FLOP. Despite these differences, Huang recognized Huawei as one of the most formidable technology companies in the world, highlighting their incredible progress in recent years and the potential threat they pose to Nvidia's dominance in the AI hardware market.
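Rough arithmetic implied by those figures (the per-chip numbers below are derived from the story's totals, not quoted specifications):

$$
\frac{300~\text{PFLOPS}}{384~\text{chips}} \approx 0.78~\text{PFLOPS per Ascend 910C},
\qquad
\frac{\approx 150~\text{PFLOPS (GB200 NVL72)}}{72~\text{GPUs}} \approx 2.1~\text{PFLOPS per Blackwell GPU}.
$$

In other words, Huawei reaches its aggregate lead by scaling out to more than five times as many accelerators, which is consistent with the system's larger footprint and higher power per FLOP.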

Recommended read:
References :
  • R. Scott Raynovich: Nvidia CEO Jensen Huang made a remarkable admission this week: He’s fearful of Huawei.
  • www.tomshardware.com: Jensen Huang states that China is nearly on par with the U.S. in AI hardware development, as Huawei begins shipping its CloudMatrix 384 systems.

@blogs.nvidia.com //
Nvidia is currently facing pressure from the U.S. government regarding AI GPU export rules. CEO Jensen Huang has been advocating for the Trump administration to relax these restrictions, arguing they hinder American companies' ability to compete in the global market. Huang stated at the Hill and Valley Forum that China is not far behind the U.S. in AI capabilities, emphasizing the need to accelerate the diffusion of American AI technology worldwide. He also acknowledged Huawei's progress in computing, networking, and software, noting their development of the CloudMatrix 384 system. This system, powered by Ascend 910C accelerators, is considered competitive with Nvidia's GB200 NVL72, signaling the emergence of domestic alternatives in China.

Despite Nvidia's pleas, the Trump administration is considering tighter controls on AI GPU exports. The administration plans to use chip access as leverage in trade negotiations with other nations. This approach contrasts with Nvidia's view that restricting exports will only fuel the development of competing hardware and software in countries like China. According to the AI Diffusion framework, access to advanced AI chips like Nvidia’s H100 is only unrestricted for companies based in the U.S. and "Tier 1" nations, while those in "Tier 2" nations face annual limits and "Tier 3" countries are effectively barred.

Adding to the complexity, Nvidia is also engaged in a public dispute with AI startup Anthropic over the export restrictions. Anthropic has endorsed the Biden-era "AI Diffusion Rule" and has claimed there has been chip smuggling to China. An Nvidia spokesperson dismissed Anthropic's claims about chip smuggling tactics as "tall tales," arguing that American firms should focus on innovation instead of trying to manipulate policy for competitive advantage. As the May 15th export controls deadline approaches, the tensions continue to rise within the AI industry over the balance between national security, economic prosperity, and global competitiveness.

Recommended read:
References :
  • AIwire: Huawei Challenges Nvidia’s AI Dominance with New Chip
  • R. Scott Raynovich: Huawei Moves Ratchet Up Nvidia’s Stakes In The AI Trade War
  • www.tomshardware.com: Nvidia asks US government to ease AI GPU export rules, but Trump administration plans tighter controls
  • blogs.nvidia.com: NVIDIA Experts Share Top 5 Tips for Standing Out in the AI Job Market
  • www.tomshardware.com: Nvidia's CEO says China is not far behind the U.S. in AI capabilities
  • NVIDIA Newsroom: NVIDIA Experts Share Top 5 Tips for Standing Out in the AI Job Market
  • Maginative: Nvidia and Anthropic have expressed conflicting views on the U.S. government's AI chip export controls, with Anthropic advocating for stricter rules to limit China's access to advanced GPUs.
  • The Register - Software: Anthropic calls for tougher GPU export controls as Nvidia's CEO implores Trump to spread the AI love

@cyberpress.org //
NVIDIA has issued a critical security update for its TensorRT-LLM framework to address a high-severity vulnerability, identified as CVE-2025-23254. This flaw poses significant risks, potentially leading to remote code execution, data tampering, and information disclosure. All platforms and versions of TensorRT-LLM prior to 0.18.2 are affected, making this update essential for users to safeguard their systems against potential attacks. The vulnerability resides in the Python executor component of TensorRT-LLM and stems from insecure handling of Inter-Process Communication (IPC).

The specific weakness lies in the use of Python's pickle module for serialization and deserialization within the socket-based IPC system. An attacker with local access to the TRTLLM server could exploit this to inject malicious code, gain unauthorized access to sensitive data, or manipulate existing data. NVIDIA has assigned the vulnerability a CVSS base score of 8.8, classifying it as high severity, with the underlying weakness categorized as "Deserialization of Untrusted Data" (CWE-502). Avi Lumelsky of Oligo Security is credited with responsibly reporting the issue.

To mitigate this threat, NVIDIA now enables HMAC (hash-based message authentication code) verification by default for all socket-based IPC operations in both the main and release branches of TensorRT-LLM. This ensures the integrity and authenticity of serialized data exchanged between processes, preventing unauthorized code execution. NVIDIA strongly advises users not to disable this feature, as doing so would reintroduce the vulnerability. Users are urged to update immediately to TensorRT-LLM version 0.18.2 or later to fully address the identified risks.
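The mitigation pattern is straightforward to illustrate. Below is a conceptual sketch of HMAC-authenticated pickling; it mirrors the class of fix described in the advisory rather than TensorRT-LLM's actual code, and the key handling is deliberately simplified.

```python
import hashlib
import hmac
import os
import pickle

SECRET = os.urandom(32)  # in practice, a key shared by both IPC endpoints

def pack(obj) -> bytes:
    """Serialize and prepend an HMAC-SHA256 tag over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return tag + payload

def unpack(blob: bytes):
    """Verify the tag in constant time before deserializing anything."""
    tag, payload = blob[:32], blob[32:]
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("IPC payload failed HMAC check; refusing to unpickle")
    return pickle.loads(payload)  # reached only for authenticated senders

print(unpack(pack({"request_id": 7, "prompt": "hello"})))
```

The point is that pickle.loads never runs on bytes an attacker could have forged, which is why NVIDIA warns against disabling the check.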

Recommended read:
References :
  • Cyber Security News: NVIDIA has released a crucial security update for its TensorRT-LLM Framework, addressing a high-severity vulnerability that could expose users to significant risks, including remote code execution, data tampering, and information disclosure. The vulnerability, tracked as CVE-2025-23254, affects all platforms and all versions of TensorRT-LLM before 0.18.2.
  • securityonline.info: NVIDIA has released a security update for its TensorRT-LLM Framework, addressing a high-severity vulnerability that could expose users to significant risks.
  • gbhackers.com: NVIDIA has issued an urgent security advisory after discovering a significant vulnerability (CVE-2025-23254) in its popular TensorRT-LLM framework, urging all users to update to the latest version (0.18.2) to safeguard their systems against potential attacks.

Ali Azhar@AIwire //
Nvidia CEO Jensen Huang has expressed concerns about the growing competition from Huawei in the AI chip market, a notable admission highlighting the shifting dynamics within the global tech landscape. Geopolitical tensions and tightening U.S. export controls are reshaping technology supply chains, creating both challenges and opportunities for companies worldwide. Huang has previously called for the Trump administration to relax AI GPU export restrictions to support U.S. industry; however, the administration is considering stricter controls as leverage in trade negotiations.

Huawei is actively developing its Ascend series to challenge Nvidia's dominance. The new Huawei Ascend 910D AI processor is designed to compete directly with Nvidia's Blackwell and Rubin GPUs, and Huawei plans to begin testing it with the goal of surpassing the performance of Nvidia's H100. To reach that performance level, Huawei will reportedly have to redesign the internal architecture of the Ascend 910D and possibly increase the number of compute chiplets. The company has approached local Chinese firms to evaluate the chip's performance, with initial samples expected by late May 2025.

In response to the evolving AI landscape, Nvidia is also strengthening its cloud infrastructure. NVIDIA Blackwell GPUs are now being deployed on NVIDIA DGX Cloud and Oracle Cloud Infrastructure (OCI) to develop and run reasoning models and AI agents. This move is intended to solidify Nvidia's position and provide its customers with advanced AI capabilities through cloud-based solutions.

Recommended read:
References :
  • R. Scott Raynovich: Nvidia CEO Jensen Huang made a remarkable admission this week: He’s fearful of Huawei.
  • AIwire: As geopolitical tensions reshape technology supply chains and U.S. export controls tighten, new challenges and opportunities arise that are transforming the global tech landscape.
  • thetechbasic.com: Huawei’s New AI Chip Challenges Nvidia Amid US Sanctions
  • www.tomshardware.com: Huawei Ascend AI 910D processor designed to take on Nvidia's Blackwell and Rubin GPUs

@blogs.nvidia.com //
Oracle Cloud Infrastructure (OCI) is now deploying thousands of NVIDIA Blackwell GPUs to power agentic AI and reasoning models. OCI has stood up and optimized its first wave of liquid-cooled NVIDIA GB200 NVL72 racks in its data centers, enabling customers to develop and run next-generation AI agents. The NVIDIA GB200 NVL72 platform is a rack-scale system combining 36 NVIDIA Grace CPUs and 72 NVIDIA Blackwell GPUs, delivering performance and energy efficiency for agentic AI powered by advanced AI reasoning models. Oracle aims to build one of the world's largest Blackwell clusters, with OCI Superclusters scaling beyond 100,000 NVIDIA Blackwell GPUs to meet the growing demand for accelerated computing.

This deployment includes high-speed NVIDIA Quantum-2 InfiniBand and NVIDIA Spectrum-X Ethernet networking for scalable, low-latency performance, along with software and database integrations from NVIDIA and OCI. OCI is among the first to deploy NVIDIA GB200 NVL72 systems, a deployment that marks the transformation of cloud data centers into AI factories designed to manufacture intelligence at scale. OCI offers flexible deployment options to bring Blackwell to customers across public, government, and sovereign clouds, as well as customer-owned data centers.

These new racks are the first systems available from NVIDIA DGX Cloud, an optimized platform with software, services, and technical support for developing and deploying AI workloads on clouds. NVIDIA will utilize these racks for various projects, including training reasoning models, autonomous vehicle development, accelerating chip design and manufacturing, and developing AI tools. In related cybersecurity news, Cisco Foundation AI has released its first open-source security model, Llama-3.1-FoundationAI-SecurityLLM-base-8B, designed to improve response time, expand capacity, and proactively reduce risk in security operations.

Recommended read:
References :
  • NVIDIA Newsroom: Oracle has stood up and optimized its first wave of liquid-cooled NVIDIA GB200 NVL72 racks in its data centers.
  • Security @ Cisco Blogs: Foundation AI's first release — Llama-3.1-FoundationAI-SecurityLLM-base-8B — is designed to improve response time, expand capacity, and proactively reduce risk.
  • insidehpc.com: Nvidia said Oracle has stood up its first wave of liquid-cooled NVIDIA GB200 NVL72 racks in its data centers.
  • www.networkworld.com: Palo Alto Networks unpacks security platform to protect AI resources

@developer.nvidia.com //
NVIDIA continues to advance the capabilities of AI across various sectors with its integrated hardware and software platforms. The NVIDIA Isaac GR00T N1 represents a significant leap in robotics, offering a hardware-software combination designed to provide humanoid robots with advanced cognitive abilities. This platform aims to bridge the gap between large language models and real-world dexterity, enabling robots to learn from single demonstrations and adapt to different tasks in diverse environments such as homes, factories, and disaster zones. Powered by the Jetson Thor chip and built on the Isaac robotics platform, GR00T N1 focuses on adaptability through foundation model reinforcement learning, allowing robots to reason and generalize insights, rather than simply reacting to pre-programmed instructions.

Agentic AI is transforming cybersecurity by introducing both new opportunities and challenges. These AI agents can autonomously interact with tools, environments, and sensitive data, enhancing threat detection, response, and overall security. Cybersecurity teams, often overwhelmed by talent shortages and high alert volumes, can leverage agentic AI to bolster their defenses. These systems can perceive, reason, and act autonomously to solve complex problems, serving as intelligent collaborators for cyber experts. Organizations such as Deloitte are utilizing NVIDIA AI Blueprints, NIM, and Morpheus to accelerate software patching and vulnerability management, while CrowdStrike and Trend Micro are leveraging NVIDIA AI software to improve security alert triaging and reduce alert fatigue.

NVIDIA NIM Operator 2.0 enhances AI deployment by supporting NVIDIA NeMo microservices, streamlining the management of inference pipelines for MLOps and LLMOps engineers. This new release simplifies the deployment, auto-scaling, and upgrading of NIM on Kubernetes clusters, building upon the capabilities of the initial NIM Operator. The NIM Operator 2.0 introduces the ability to deploy and manage the lifecycle of NVIDIA NeMo microservices, including NeMo Customizer for fine-tuning LLMs, NeMo Evaluator for comprehensive evaluation of LLMs, and NeMo Guardrails for adding safety checks and content moderation to LLM endpoints. These enhancements provide efficient model caching and boost overall operational efficiency for deploying and managing NIM on various infrastructures, as highlighted by Cisco Systems.

Recommended read:
References :
  • NVIDIA Newsroom: Agentic AI is redefining the cybersecurity landscape — introducing new opportunities that demand rethinking how to secure AI while offering the keys to addressing those challenges.
  • NVIDIA Technical Blog: The first release of NVIDIA NIM Operator simplified the deployment and lifecycle management of inference pipelines for NVIDIA NIM microservices, reducing the...
  • Sify: With the first hardware-software combo designed to give humanoid robots a brain worthy of the name, the AI revolution in robotics is officially out of beta.

Isha Salian@NVIDIA Blog //
Nvidia is pushing the boundaries of artificial intelligence with a focus on multimodal generative AI and tools to enhance AI model integration. Nvidia's research division is actively involved in advancing AI across various sectors, underscored by the presentation of over 70 research papers at the International Conference on Learning Representations (ICLR) in Singapore. These papers cover a diverse range of topics including generative AI, robotics, autonomous driving, and healthcare, demonstrating Nvidia's commitment to innovation across the AI spectrum. Bryan Catanzaro, vice president of applied deep learning research at NVIDIA, emphasized the company's aim to accelerate every level of the computing stack to amplify the impact and utility of AI across industries.

Research efforts at Nvidia are not limited to theoretical advancements. The company is also developing tools that streamline the integration of AI models into real-world applications. One notable example is the work being done with NVIDIA NIM microservices, which are being leveraged by researchers at the University College London (UCL) Deciding, Acting, and Reasoning with Knowledge (DARK) Lab to benchmark agentic LLM and VLM reasoning for gaming. These microservices simplify the deployment and scaling of AI models, enabling researchers to efficiently handle workloads of any size and customize models for specific needs.

Nvidia's NIM microservices are designed to redefine how researchers and developers deploy and scale AI models, offering a streamlined approach to harnessing the power of GPUs. These microservices simplify the process of running AI inference workloads by providing pre-optimized engines such as NVIDIA TensorRT and NVIDIA TensorRT-LLM, which deliver low-latency, high-throughput performance. The microservices also offer easy and fast API integration with standard frontends like the OpenAI API or LangChain for Python environments.
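As an illustration of that last point, a running NIM LLM microservice exposes an OpenAI-compatible endpoint, so the stock openai Python client can talk to it directly. The base URL and model id below are placeholders for a locally launched container:

```python
from openai import OpenAI

# Placeholder endpoint for a NIM container listening on localhost.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

resp = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",  # example NIM model id
    messages=[{"role": "user", "content": "In one sentence, what is a NIM?"}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```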

Recommended read:
References :
  • developer.nvidia.com: Researchers from the University College London (UCL) Deciding, Acting, and Reasoning with Knowledge (DARK) Lab leverage NVIDIA NIM microservices in their new research on benchmarking agentic LLM and VLM reasoning for gaming.
  • BigDATAwire: Nvidia is actively involved in research related to multimodal generative AI, including efforts to improve the reasoning capabilities of LLM and VLM models for use in gaming.

@developer.nvidia.com //
NVIDIA is significantly advancing the capabilities of AI development with the introduction of new tools and technologies. The company's latest innovations focus on enhancing the performance of AI agents, improving integration with various software and hardware platforms, and streamlining the development process for enterprises. These advancements include NVIDIA NeMo microservices for creating data-driven AI agents and a G-Assist plugin builder that enables users to customize AI functionalities on GeForce RTX AI PCs.

NVIDIA's NeMo microservices are designed to empower enterprises to build AI agents that can access and leverage data to enhance productivity and decision-making. These microservices provide a modular platform for building and customizing generative AI models, offering features such as prompt tuning, supervised fine-tuning, and knowledge retrieval tools. NVIDIA envisions these microservices as essential building blocks for creating data flywheels, enabling AI agents to continuously learn and improve from enterprise data, business intelligence, and user feedback. The initial use cases include AI agents used by AT&T to process nearly 10,000 documents and a coding assistant used by Cisco Systems.

The introduction of the G-Assist plugin builder marks a significant step forward in AI-assisted PC control. This tool allows developers to create custom commands to manage both software and hardware functions on GeForce RTX AI PCs. By enabling integration with large language models (LLMs) and other software applications, the plugin builder expands G-Assist's functionality beyond its initial gaming-focused applications. Users can now tailor AI functionalities to suit their specific needs, automating tasks and controlling various PC functions through voice or text commands. The G-Assist tool runs a lightweight language model locally on RTX GPUs, enabling inference without relying on a cloud connection.

Recommended read:
References :
  • developer.nvidia.com: Enhance Your AI Agent with Data Flywheels Using NVIDIA NeMo Microservices
  • www.tomshardware.com: NVIDIA introduces G-Assist plug-in builder, allowing its AI to integrate with LLMs and software
  • developer.nvidia.com: Benchmarking Agentic LLM and VLM Reasoning for Gaming with NVIDIA NIM
  • techstrong.ai: NVIDIA Corp. on Wednesday announced general availability of neural module (NeMo) microservices, the software tools behind artificial intelligence (AI) agents for enterprises.
  • the-decoder.com: With its G-Assist tool and a new plug-in builder, Nvidia introduces a system for AI-assisted PC control. Developers can create their own commands to manage both software and hardware functions.

@developer.nvidia.com //
NVIDIA Research is making significant strides in multimodal generative AI and robotics, as showcased at the International Conference on Learning Representations (ICLR) 2025 in Singapore. The company is focusing on a full-stack approach to AI development, optimizing everything from computing infrastructure to algorithms and applications. This approach supports various industries and tackles real-world challenges in areas like autonomous vehicles, healthcare, and robotics.

NVIDIA has introduced a new plug-in builder for G-Assist, which enables the integration of AI with large language models (LLMs) and various software programs. This allows users to customize NVIDIA's AI to fit their specific needs, expanding G-Assist's functionality by adding new commands and connecting external tools. These plug-ins can perform a wide range of functions, from connecting with LLMs to controlling music, and can be built using coding languages like JSON and Python. Developers can also submit their plug-ins for potential inclusion in the NVIDIA GitHub repository.

NVIDIA Research is also addressing the need for adaptable robotic arms in various industries with its R²D² (Robotics Research and Development Digest) workflows and models. These innovations aim to enable robots to make decisions and adjust their behavior based on real-time data, improving flexibility, safety, and collaboration in different environments. NVIDIA is developing models and workflows for dexterous grasping and manipulation, addressing challenges like handling reflective objects and generalizing to new objects and dynamic environments. DextrAH-RGB, for example, is a workflow that performs dexterous arm-hand grasping from stereo RGB input, trained at scale in simulation using NVIDIA Isaac Lab.

Recommended read:
References :
  • blogs.nvidia.com: Advancing AI requires a full-stack approach, with a powerful foundation of computing infrastructure — including accelerated processors and networking technologies — connected to optimized compilers, algorithms and applications.
  • developer.nvidia.com: Robotic arms are used today for assembly, packaging, inspection, and many more applications. However, they are still preprogrammed to perform specific and often...

@developer.nvidia.com //
NVIDIA is enhancing AI capabilities on RTX AI PCs with the introduction of a new plug-in builder for G-Assist. This innovative tool allows users to customize and expand G-Assist's functionality by integrating it with various Large Language Models (LLMs) and software applications. The plug-in builder is designed to enable users to generate AI-assisted functionalities through text and voice commands, effectively transforming G-Assist from a gaming-centric AI into a versatile tool adaptable to diverse applications, both gaming-related and otherwise.

The G-Assist plug-in builder facilitates the creation of custom commands and connections to external tools through APIs, allowing different software and services to communicate with each other. Developers can leverage coding languages like JSON and Python to create and integrate tools into G-Assist. NVIDIA has provided a GitHub repository with instructions and documentation for building and customizing these plug-ins, and users can even submit their creations for potential inclusion in the repository to share new capabilities with others. Examples of plug-in capabilities include seeking advice from AI assistants like Gemini on gaming strategies and using Twitch plug-ins to monitor streamer status via voice commands.
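To illustrate the command-to-function shape of such a plug-in, here is an entirely hypothetical sketch pairing a JSON manifest with a Python handler, riffing on the Twitch example above. The real plug-in contract lives in NVIDIA's GitHub repository; none of the field or function names below are taken from it.

```python
import json

# Hypothetical manifest: the real schema is defined in NVIDIA's repository.
MANIFEST = json.dumps({
    "name": "twitch_status",
    "description": "Check whether a Twitch streamer is currently live",
    "commands": ["is {streamer} live"],
})

def execute(command: str, params: dict) -> str:
    """Hypothetical entry point called when the voice/text command matches."""
    streamer = params.get("streamer", "unknown")
    # A real plug-in would query the Twitch API here; we stub the lookup.
    live = streamer.lower() in {"examplestreamer"}
    return f"{streamer} is {'live' if live else 'offline'} right now."

if __name__ == "__main__":
    print(MANIFEST)
    print(execute("is Ninja live", {"streamer": "Ninja"}))
```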

Furthermore, NVIDIA is advancing AI research and application across industries, demonstrated by their participation in the International Conference on Learning Representations (ICLR). NVIDIA Research presented over 70 papers at ICLR, showcasing developments in areas such as autonomous vehicles, healthcare, and multimodal content creation. Notably, researchers from University College London (UCL) are leveraging NVIDIA NIM microservices to benchmark agentic capabilities of AI models in gaming environments, highlighting the role of NIM in simplifying and accelerating the evaluation of AI reasoning in complex tasks. NIM microservices enable efficient deployment and scaling of AI models, supporting various platforms and workflows, making them a versatile solution for diverse research applications.

Recommended read:
References :
  • blogs.nvidia.com: NVIDIA Research at ICLR — Pioneering the Next Wave of Multimodal Generative AI
  • www.tomshardware.com: Nvidia introduces G-Assist plug-in builder, allowing its AI to integrate with LLMs and software

@blogs.nvidia.com //
NVIDIA's Blackwell platform is set to revolutionize data center cooling with a focus on water efficiency. The new platform introduces direct-to-chip liquid cooling, which dramatically reduces water consumption compared to traditional air-cooled systems. NVIDIA claims this innovative approach offers a 300x improvement in water efficiency. This is crucial as the increasing compute power required by AI and HPC applications is driving a global shift towards larger, more power-hungry data centers and AI factories.

This cooling solution addresses the escalating energy demands and environmental concerns associated with AI infrastructure. Historically, cooling has accounted for up to 40% of a data center's electricity consumption. The Blackwell platform's liquid cooling technology captures heat directly at the source, cycling it through a coolant distribution unit and liquid-to-liquid heat exchanger, before transferring it to a facility cooling loop. This method allows data centers to operate effectively at warmer water temperatures, reducing or eliminating the need for mechanical chillers in many climates, leading to significant cost savings and reduced energy consumption.

The NVIDIA GB200 NVL72 and GB300 NVL72 systems, built on the Blackwell platform, utilize this direct-to-chip liquid cooling technology. Unlike evaporative or immersion cooling, this is a closed-loop system, meaning the coolant doesn't evaporate or require replacement due to loss from phase change, further conserving water. These rack-scale systems are designed to handle the demanding tasks of trillion-parameter large language model inference, making them ideal for running AI reasoning models while efficiently managing energy costs and heat. This advancement not only ensures optimal performance of AI servers but also promotes a more sustainable and environmentally friendly AI infrastructure.

Recommended read:
References :
  • NVIDIA Newsroom: Chill Factor: NVIDIA Blackwell Platform Boosts Water Efficiency by Over 300x
  • blogs.nvidia.com: Chill Factor: NVIDIA Blackwell Platform Boosts Water Efficiency by Over 300x
  • www.tomshardware.com: Nvidia aims to solve AI's water consumption problems with direct-to-chip cooling — claims 300X improvement with closed-loop systems

@techstrong.ai //
Nvidia has unveiled new tools and capabilities designed to streamline AI deployment and enhance efficiency for enterprises. A key component of this release is the general availability of NVIDIA NeMo microservices, empowering companies to construct AI agents that leverage data flywheels to improve employee productivity. These microservices provide an end-to-end platform for developers, allowing them to create and continuously optimize state-of-the-art agentic AI systems through the integration of inference, business, and user feedback data. The company also highlighted the importance of maintaining a constant stream of high-quality inputs to ensure the accuracy, relevance, and timeliness of AI agents.

Nvidia is also introducing the G-Assist plug-in builder, a tool enabling customization of AI on GeForce RTX AI PCs. This plug-in builder expands the functionality of G-Assist by allowing users to add new commands and connect external tools, which can range from large language models to simple functions like controlling music. Developers can use coding languages like JSON and Python to create tools integrated into G-Assist, and they can submit plug-ins for review and potential inclusion in the NVIDIA GitHub repository, making new capabilities available to others. G-Assist can be modified to perform different actions with different LLMs and software, gaming-related or not.

Researchers at the University College London (UCL) Deciding, Acting, and Reasoning with Knowledge (DARK) Lab are leveraging NVIDIA NIM microservices in their new game-based benchmark suite, BALROG, designed to evaluate the agentic capabilities of models on challenging, long-horizon interactive tasks. By using NVIDIA NIM, the DARK Lab accelerated its benchmarking process, accessing a hosted DeepSeek-R1 endpoint instead of standing the model up locally. This showcases the flexibility and efficiency of NIM microservices, which can be deployed across cloud environments, data centers, and local workstations, enabling seamless integration into diverse workflows and handling workloads of any size.

Recommended read:
References :
  • techstrong.ai: NVIDIA Releases NeMo Microservices Bringing AI Agents to Enterprises
  • www.nextplatform.com: Nvidia NeMo Microservices For AI Agents Hits The Market
  • blogs.nvidia.com: Enterprises Onboard AI Teammates Faster With NVIDIA NeMo Tools to Scale Employee Productivity
  • developer.nvidia.com: Enhance Your AI Agent with Data Flywheels Using NVIDIA NeMo Microservices
  • NVIDIA Technical Blog: Enhance Your AI Agent with Data Flywheels Using NVIDIA NeMo Microservices
  • The Next Platform: Nvidia NeMo Microservices For AI Agents Hits The Market
  • The Register - Software: This article discusses NVIDIA's NeMo microservices, software tools for building AI agents, and their role in enterprise workflows.
  • MarkTechPost: Long-Context Multimodal Understanding No Longer Requires Massive Models: NVIDIA AI Introduces Eagle 2.5, a Generalist Vision-Language Model that Matches GPT-4o on Video Tasks Using Just 8B Parameters
  • NVIDIA Newsroom: Enterprises Onboard AI Teammates Faster With NVIDIA NeMo Tools to Scale Employee Productivity
  • cloudnativenow.com: NVIDIA Makes Microservices Framework for AI Apps Generally Available
  • www.tomshardware.com: Nvidia introduces G-Assist plug-in builder, allowing its AI to integrate with LLMs and software
  • techstrong.ai: NeMo microservices, the software tools behind AI agents for enterprises, are now available in general availability, working with partner platforms to offer prompt tuning, supervised fine-tuning, and knowledge retrieval.