staff@insideAI News
//
MLCommons has released the latest MLPerf Inference v5.0 benchmark results, highlighting the growing importance of generative AI in the machine learning landscape. The new benchmarks feature tests for large language models (LLMs) like Llama 3.1 405B and Llama 2 70B Interactive, designed to evaluate how well systems perform in real-world applications requiring agentic reasoning and low-latency responses. This shift reflects the industry's increasing focus on deploying generative AI and the need for hardware and software optimized for these demanding workloads.
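Interactive LLM serving is typically judged on two latency metrics: time-to-first-token (TTFT) and time-per-output-token (TPOT). As a rough illustration of how those metrics can be measured from a streaming generator (a generic sketch, not MLPerf's actual harness; all names are placeholders):

```python
import time

def measure_latency(stream):
    """Measure time-to-first-token (TTFT) and mean time-per-output-token
    (TPOT) for a token stream -- the two latency metrics interactive LLM
    serving benchmarks typically constrain."""
    start = time.perf_counter()
    first = None
    count = 0
    last = start
    for _ in stream:                  # stream yields tokens as generated
        now = time.perf_counter()
        if first is None:
            first = now - start       # TTFT: delay before the first token
        count += 1
        last = now
    # TPOT: average gap between tokens after the first one arrives
    tpot = (last - start - first) / max(count - 1, 1)
    return first, tpot

# Toy usage with a generator standing in for a model's token stream:
def fake_stream(n=8, delay=0.02):
    for _ in range(n):
        time.sleep(delay)
        yield "tok"

ttft, tpot = measure_latency(fake_stream())
print(f"TTFT: {ttft*1000:.1f} ms, TPOT: {tpot*1000:.1f} ms")
```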
The v5.0 results reveal significant performance improvements driven by advances in both hardware and software. The median submitted score for Llama 2 70B has doubled compared to a year ago, and the best score is 3.3 times faster than the best result from Inference v4.0. These gains are attributed to innovations such as support for lower-precision computation formats like FP4, which allow large models to be processed more efficiently. The MLPerf Inference benchmark suite evaluates machine learning performance in a way that is architecture-neutral, reproducible, and representative of real-world workloads.
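To make the FP4 idea concrete, here is a minimal Python sketch of round-to-nearest quantization onto the E2M1 value grid used by 4-bit floating point. The single per-tensor scale and the function name are illustrative assumptions, not any vendor's actual kernel:

```python
import numpy as np

# Representable magnitudes of the E2M1 (4-bit float) format; with a sign
# bit this yields the 16 codes of FP4.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x: np.ndarray) -> np.ndarray:
    """Round-to-nearest FP4 quantization with one per-tensor scale.
    Returns the dequantized tensor so the error can be inspected."""
    scale = max(np.abs(x).max(), 1e-12) / FP4_GRID[-1]  # largest |x| -> 6.0
    scaled = np.abs(x) / scale
    # Snap each magnitude to the nearest representable FP4 value.
    idx = np.abs(scaled[..., None] - FP4_GRID).argmin(axis=-1)
    return np.sign(x) * FP4_GRID[idx] * scale

weights = np.random.randn(4, 4).astype(np.float32)
q = quantize_fp4(weights)
print("max abs error:", np.abs(weights - q).max())
```

Storing weights on a grid this coarse roughly halves memory and bandwidth relative to FP8, which is where the inference-throughput gains come from.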
@www.intel.com
//
NVIDIA's Blackwell platform dominated the latest MLPerf Inference v5.0 benchmarks, showcasing significant performance improvements in AI reasoning. The NVIDIA GB200 NVL72 system, which connects 72 Blackwell GPUs, achieved up to 30x higher throughput on the Llama 3.1 405B benchmark than the NVIDIA H200 NVL8 submission. The gain was driven by more than triple the performance per GPU and a 9x larger NVIDIA NVLink interconnect domain. The latest MLPerf results reflect the industry's shift toward reasoning in AI inference.
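The two cited factors compose multiplicatively, which offers a quick sanity check of the headline number:

```python
# Back-of-the-envelope check of the ~30x figure from the two factors
# NVIDIA cites: per-GPU speedup and NVLink-domain size.
per_gpu_speedup = 3.3     # "more than triple the performance per GPU"
domain_growth = 72 / 8    # NVL72 (72 GPUs) vs. NVL8 (8 GPUs) = 9x
print(per_gpu_speedup * domain_growth)  # 29.7 -- roughly the reported 30x
```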
Alongside this achievement, NVIDIA is open-sourcing the KAI Scheduler, a Kubernetes GPU scheduling solution, as part of its commitment to open-source AI innovation. Previously a core component of the Run:ai platform, KAI Scheduler is now available under the Apache 2.0 license. The solution is designed to address the unique challenges of managing AI workloads that use both GPUs and CPUs; according to NVIDIA, it helps manage fluctuating GPU demands, which traditional resource schedulers struggle to handle.
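For context, a Kubernetes pod opts into an alternative scheduler by naming it in its spec. The sketch below uses the official Kubernetes Python client; the scheduler name and container image are illustrative assumptions (a real deployment may also require queue labels), so consult the KAI Scheduler documentation for the actual registered names:

```python
# Hypothetical sketch: submitting a GPU pod routed to a non-default
# scheduler. "kai-scheduler" and the image tag are assumptions for
# illustration, not verified deployment values.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="train-job-0"),
    spec=client.V1PodSpec(
        scheduler_name="kai-scheduler",   # route scheduling decisions to KAI
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="nvcr.io/nvidia/pytorch:24.01-py3",
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # request one GPU
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```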
@www.intel.com
//
NVIDIA is making strides in both agentic AI and open-source initiatives. Jacob Liberman, director of product management at NVIDIA, explains how agentic AI bridges the gap between powerful AI models and practical enterprise applications. Enterprises are now deploying AI agents to free human workers from time-consuming and error-prone tasks, allowing them to focus on high-value work that requires creativity and strategic thinking. NVIDIA AI Blueprints help enterprises build their own AI agents.
NVIDIA has announced the open-source release of the KAI Scheduler, a Kubernetes-native GPU scheduling solution, under the Apache 2.0 license. Originally developed within the Run:ai platform, the scheduler is now available to the community while continuing to be packaged and delivered as part of the NVIDIA Run:ai platform. It is designed to optimize the scheduling of GPU resources and to tackle the challenges of managing AI workloads that span GPUs and CPUs.
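One classic example of such a challenge is gang scheduling: a distributed job should start only when all of its workers can be placed at once, because partially started jobs hold GPUs without making progress. The toy Python check below illustrates the all-or-nothing idea; it is a generic sketch, not KAI Scheduler's actual algorithm:

```python
# Toy gang-scheduling admission check: a distributed job is admitted only
# if ALL of its workers fit in the currently free GPU pool; otherwise none
# start, so idle workers never pin GPUs while waiting for peers.
def admit(jobs, free_gpus):
    started = []
    for name, workers, gpus_per_worker in jobs:
        need = workers * gpus_per_worker
        if need <= free_gpus:          # all-or-nothing placement
            free_gpus -= need
            started.append(name)
    return started, free_gpus

jobs = [("llm-finetune", 8, 1), ("eval-sweep", 4, 1), ("notebook", 1, 1)]
started, left = admit(jobs, free_gpus=8)
print(started, "GPUs left:", left)   # ['llm-finetune'] GPUs left: 0
```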