News from the AI & ML world

DeeperML

Aaron Klotz@tomshardware.com //
NVIDIA has announced a new world record in AI inference, set with its DGX B200 Blackwell node. The node, equipped with eight Blackwell GPUs, broke the barrier of 1,000 tokens per second (TPS) per user while running Meta’s Llama 4 Maverick large language model. According to a report by Artificial Analysis, the DGX B200 node achieved 1,038 TPS/user, outperforming previous record holders such as SambaNova, which achieved 792 TPS/user. The result showcases the capabilities of the Blackwell architecture and sets a new standard for per-user inference speed.
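To put the two reported figures side by side, the short sketch below converts them into per-token latency and a relative lead. It uses only the 1,038 and 792 TPS/user numbers quoted above; the helper names are illustrative and not part of any benchmark tooling.

```python
# Figures reported in the article (tokens per second per user).
NVIDIA_TPS = 1038
SAMBANOVA_TPS = 792


def ms_per_token(tps: float) -> float:
    """Average time between tokens, in milliseconds, at a given TPS."""
    return 1000.0 / tps


def relative_lead_pct(new_tps: float, old_tps: float) -> float:
    """How much faster new_tps is than old_tps, as a percentage."""
    return (new_tps / old_tps - 1.0) * 100.0


lead_pct = relative_lead_pct(NVIDIA_TPS, SAMBANOVA_TPS)

print(f"DGX B200:  {ms_per_token(NVIDIA_TPS):.3f} ms per token")
print(f"SambaNova: {ms_per_token(SAMBANOVA_TPS):.3f} ms per token")
print(f"Relative lead: {lead_pct:.1f}%")
```

By this arithmetic, the new record is roughly 31% faster per user than the previous one, with each token arriving in just under a millisecond.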

NVIDIA achieved the record through extensive software optimizations, using TensorRT along with Eagle-3 techniques for speculative decoding. These optimizations delivered a 4x performance uplift over Blackwell's prior best results. Further gains came from applying FP8 data types to Attention operations and Mixture of Experts (MoE) layers. According to NVIDIA, these changes increased speed while maintaining response accuracy. In its highest-throughput configuration, the Blackwell system reached 72,000 TPS per server.
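For readers unfamiliar with speculative decoding, the toy sketch below illustrates the general draft-and-verify idea in its simplest greedy form. It is not NVIDIA's implementation and does not model Eagle-3 or TensorRT; the two "models" are hypothetical stand-in functions, and all names are assumptions made for illustration.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], target_next: NextToken,
                     draft_next: NextToken, k: int) -> List[int]:
    """Draft k tokens cheaply, then keep only what the target agrees with."""
    # Draft phase: the small model proposes k tokens one after another.
    draft, ctx = [], list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        draft.append(tok)
        ctx.append(tok)

    # Verify phase: real systems score all drafted positions in one batched
    # pass of the large model; a plain loop stands in for that here.
    accepted, ctx = [], list(prefix)
    for tok in draft:
        expected = target_next(ctx)
        if expected != tok:
            accepted.append(expected)  # first mismatch: use the target's token
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    accepted.append(target_next(ctx))  # all drafts accepted: one bonus token
    return accepted


def generate(prefix: List[int], target_next: NextToken,
             draft_next: NextToken, k: int, n_tokens: int) -> List[int]:
    out = list(prefix)
    while len(out) - len(prefix) < n_tokens:
        out.extend(speculative_step(out, target_next, draft_next, k))
    return out[:len(prefix) + n_tokens]


# Hypothetical toy models: the target counts upward modulo 10; the draft
# agrees except after a 2, where it guesses wrong.
def target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 2 else (ctx[-1] + 1) % 10


result = generate([0], target, draft, k=4, n_tokens=8)
print(result)
```

In this greedy variant the output always matches what the target model would have produced on its own, which fits the article's point that speed rose without a loss in response accuracy. The speedup comes from accepting several drafted tokens per verification pass.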

Beyond AI performance, NVIDIA is also reworking AI data center power infrastructure through a collaboration with Navitas Semiconductor. The companies are introducing a new 800V HVDC architecture designed to replace the aging 54V systems currently in use. The new architecture is expected to deliver up to 5% better power efficiency and 70% lower maintenance costs. The move to 800V power enables a 45% reduction in copper wire thickness, significantly lowering material use and weight while also reducing heat and power losses.
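The basic reason a higher distribution voltage needs less copper can be sketched with Ohm's-law arithmetic: for the same delivered power, current falls in proportion to voltage, and resistive loss in a given conductor falls with the square of the current. The 100 kW rack load below is a hypothetical figure chosen for illustration, and the sketch does not attempt to reproduce the 5% or 45% figures quoted above, which depend on the full system design.

```python
def current_amps(power_watts: float, volts: float) -> float:
    """Current needed to deliver a given power at a given DC voltage (I = P / V)."""
    return power_watts / volts


RACK_POWER_W = 100_000  # hypothetical 100 kW rack, for illustration only

i_54 = current_amps(RACK_POWER_W, 54)    # legacy 54V distribution
i_800 = current_amps(RACK_POWER_W, 800)  # 800V HVDC distribution

# Same power at a higher voltage means proportionally less current.
current_ratio = i_54 / i_800

# In an identical conductor, resistive loss scales with I^2 (P_loss = I^2 * R).
loss_ratio = (i_800 / i_54) ** 2

print(f"54V:  {i_54:,.0f} A")
print(f"800V: {i_800:,.0f} A")
print(f"Current is {current_ratio:.1f}x lower at 800V")
print(f"Resistive loss in the same conductor: {loss_ratio:.2%} of the 54V case")
```

Lower current is what allows thinner conductors while still cutting heat and losses.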
Original img attribution: https://cdn.mos.cms.futurecdn.net/ssmE9PuBthYdYtduciLnxg.jpg

References:
  • insideAI News: AI Inference: NVIDIA Reports Blackwell Surpasses 1000 TPS/User Barrier with Llama 4 Maverick
  • techvro.com: NVIDIA collaborates with Navitas to usher in a new era for AI data center infrastructure. Their joint 800V HVDC architecture replaces the...
  • www.tomshardware.com: Nvidia has broken another AI world record, breaking over 1,000 TPS/user with a DGX B200 node boasting eight Blackwell GPUs inside.
  • ServeTheHome: Nvidia has broken another AI world record, breaking over 1,000 TPS/user with a DGX B200 node boasting eight Blackwell GPUs inside.