News from the AI & ML world
@www.marktechpost.com
Nvidia is reportedly developing a new AI chip, the B30, tailored specifically for the Chinese market to comply with U.S. export controls. This Blackwell-based alternative aims to offer multi-GPU scaling, potentially through NVLink or ConnectX-8 SuperNICs. While earlier reports suggested different names, such as RTX Pro 6000D or B40, the B30 could be one variant within a broader BXX family. The design incorporates GB20X silicon, which also powers consumer-grade RTX 50 GPUs, but it may lack the NVLink support seen in prior generations, since consumer-grade GPU dies omit that interconnect.
Nvidia has also introduced Fast-dLLM, a training-free framework designed to enhance the inference speed of diffusion large language models (LLMs). Diffusion models, explored as an alternative to autoregressive models, promise faster decoding through simultaneous multi-token generation, enabled by bidirectional attention mechanisms. However, their practical application is limited by inefficient inference, largely due to the lack of key-value (KV) caching, which accelerates performance by reusing previously computed attention states. Fast-dLLM aims to address this by bringing KV caching and parallel decoding capabilities to diffusion LLMs, potentially surpassing autoregressive systems.
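The KV-caching idea referenced above can be sketched in a few lines: instead of re-encoding the entire prefix at every decoding step, each step appends only its own key/value projection to a cache and attends over the accumulated tensors. This is a minimal illustrative sketch, not Fast-dLLM's actual implementation; all names (`KVCache`, `step`, the toy dimensions) are hypothetical, and real systems operate on batched GPU tensors.

```python
# Minimal sketch of key-value (KV) caching in attention decoding.
# Illustrative only -- names and shapes are assumptions, not Fast-dLLM's API.
import numpy as np

def attention(q, K, V):
    # Scaled dot-product attention for a single query vector q
    # against all cached keys K and values V.
    scores = K @ q / np.sqrt(q.shape[0])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

class KVCache:
    """Stores keys/values from earlier steps so each new step only
    computes its own K/V projection instead of re-encoding the prefix."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        # Append this step's key/value; reuse everything cached before.
        self.keys.append(k)
        self.values.append(v)
        K = np.stack(self.keys)   # (t, d): grows by one row per step
        V = np.stack(self.values)
        return attention(q, K, V)

# Usage: three decoding steps, each reusing the cached prefix K/V.
cache = KVCache()
rng = np.random.default_rng(0)
d = 4
for _ in range(3):
    q, k, v = rng.normal(size=(3, d))
    out = cache.step(q, k, v)
print(len(cache.keys))  # one cached K/V pair per decoded step
```

Without the cache, every step would recompute keys and values for the whole prefix, which is the quadratic-work inefficiency that Fast-dLLM's approximate KV caching is designed to avoid in the bidirectional-attention setting of diffusion LLMs.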
During his keynote speech at GTC 2025, Nvidia CEO Jensen Huang emphasized the accelerating pace of artificial intelligence development and the critical need for optimized AI infrastructure. He stated that Nvidia would shift to the Blackwell architecture for future China-bound chips, discontinuing Hopper-based alternatives following the H20 ban. Huang's focus on AI infrastructure underscores the industry's need for robust, scalable systems to support the growing demands of AI workloads.
References:
- thenewstack.io: This article discusses Jensen Huang's keynote speech at GTC 2025, where he emphasized the acceleration of artificial intelligence development and outlined five key takeaways regarding optimizing AI infrastructure.
- MarkTechPost: This article discusses NVIDIA's Fast-dLLM, a training-free framework that brings KV caching and parallel decoding to diffusion LLMs. It aims to improve inference speed in diffusion models, potentially surpassing autoregressive systems.
- www.tomshardware.com: This article discusses the development of Nvidia's B30 AI chip specifically for the Chinese market. It highlights the potential inclusion of NVLink for multi-GPU scaling and the creation of high-performance clusters.
Classification: