Andrew Jolly@AIwire
References: AWS News Blog, AIwire
NVIDIA is making significant strides in AI computing with the release of RTX AI PCs and Workstations, designed to accelerate coding assistant performance. These AI-powered copilots are fundamentally changing software development, providing real-time assistance to both experienced and novice developers. Coding assistants, optimized for RTX AI PCs, offer suggestions, explanations, and debugging capabilities, streamlining tasks and enhancing productivity across various projects, from academic endeavors to production code. These assistants can run locally, eliminating the latency and subscription costs associated with cloud-based alternatives.
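To make the local-first claim concrete, here is a minimal sketch of talking to a coding assistant hosted on your own machine. It assumes a local runtime such as llama.cpp's server or Ollama is already serving an OpenAI-compatible endpoint; the URL, port, and model name below are placeholders, not an NVIDIA-specific API.

```python
# Query a locally hosted coding assistant over an OpenAI-compatible
# chat-completions endpoint. The endpoint and model name are assumptions:
# adjust them to whatever your local runtime actually exposes.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumed local server
    json={
        "model": "local-code-model",  # hypothetical model identifier
        "messages": [
            {"role": "system", "content": "You are a concise coding assistant."},
            {"role": "user",
             "content": "Explain this error: IndexError: list index out of range"},
        ],
        "temperature": 0.2,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the round trip never leaves the machine, there is no network latency and no per-token bill, which is precisely the appeal of running assistants on RTX-class local hardware.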
CoreWeave has emerged as the first cloud platform to offer NVIDIA RTX PRO 6000 Blackwell Server Edition instances at scale. The new instances deliver up to 5.6x faster LLM inference and 3.5x faster text-to-video generation than the previous GPU generation. The RTX PRO 6000 is tailored for inference of models up to 70B parameters, providing a cost-efficient alternative to larger GPU clusters while maintaining strong performance for teams developing and scaling AI applications. CoreWeave now offers one of the widest ranges of NVIDIA Blackwell infrastructure on the market, which also includes the NVIDIA GB200 NVL72 system and the NVIDIA HGX B200 platform.

Amazon Web Services (AWS) has also announced the general availability of EC2 P6e-GB200 UltraServers, powered by NVIDIA Grace Blackwell GB200 superchips and aimed at AI training and inference at the trillion-parameter scale. Each UltraServer delivers up to 72 GPUs with 360 petaflops of computing power. A Grace Blackwell superchip pairs two high-performance NVIDIA Blackwell Tensor Core GPUs with an NVIDIA Grace CPU over the NVLink-C2C interconnect, substantially boosting bandwidth between GPU and CPU. UltraServers are deployed in EC2 UltraClusters, which scale securely and reliably to tens of thousands of GPUs, making them well suited to compute-intensive workloads such as training frontier models and building generative AI applications.
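As a rough sanity check on those headline numbers, dividing the quoted aggregate compute by the GPU count gives the per-GPU figure below. The precision (FP8, FP4, etc.) behind the petaflops figure is not stated here, so treat this as a ratio of the article's numbers rather than a hardware spec.

```python
# Back-of-envelope arithmetic from the figures quoted above:
# 360 petaflops spread across up to 72 GPUs in one UltraServer.
total_pflops = 360
gpus = 72
print(f"~{total_pflops / gpus:.1f} petaflops per GPU")  # ~5.0
```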
Jim McGregor@Tirias Research
Advanced Micro Devices Inc. has launched its new AMD Instinct MI350 Series accelerators, designed to power AI data centers and to outperform Nvidia Corp.'s Blackwell B200 in specific tasks. The series includes the top-end, liquid-cooled MI355X alongside the fan-cooled MI350X. These flagship data center graphics cards pack 185 billion transistors and use a three-dimensional, 10-chiplet design to boost AI compute and inferencing capabilities.
The MI350 Series introduces significant performance improvements, with AMD claiming four times faster AI compute and 35 times faster inferencing than the previous generation. The accelerators ship with 288 gigabytes of HBM3E memory, a three-dimensional design in which layers of circuits are stacked atop one another; according to AMD, that is 60% more memory than Nvidia's flagship Blackwell B200 graphics cards. The MI350 chips also process 8-bit floating point numbers 10% faster than the B200, and 4-bit floating point numbers more than twice as fast. Alongside the hardware, AMD is rolling out its ROCm 7 software development platform for the Instinct accelerators and the Helios Rack AI platform. "With flexible air-cooled and direct liquid-cooled configurations, the Instinct MI350 Series is optimized for seamless deployment, supporting up to 64 GPUs in an air-cooled rack and up to 128 in a direct liquid-cooled [rack], and scaling up to 2.6 exaFLOPS of FP4 performance," stated Vamsi Boppana, senior vice president of AMD's Artificial Intelligence Group. The advancements aim to provide an open, scalable, rack-scale AI infrastructure built on industry standards, setting the stage for transformative AI solutions across industries.
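The quoted rack-scale figure can be sanity-checked the same way: 2.6 exaFLOPS of FP4 across a 128-GPU liquid-cooled rack implies roughly 20 petaflops of FP4 per accelerator. This is a ratio of the numbers above, not an official per-GPU specification.

```python
# Implied per-GPU FP4 throughput from the quoted rack numbers.
rack_exaflops_fp4 = 2.6
gpus_per_rack = 128
print(f"~{rack_exaflops_fp4 * 1000 / gpus_per_rack:.1f} PFLOPS FP4 per GPU")  # ~20.3
```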
@www.marktechpost.com
NVIDIA has recently unveiled several key initiatives aimed at expanding its reach in the AI landscape, particularly in compute capabilities and physical AI applications. The company announced DGX Cloud Lepton, an AI platform with a compute marketplace that connects developers building agentic and physical AI applications with tens of thousands of GPUs from a network of cloud providers. This platform will offer NVIDIA Blackwell and other NVIDIA architecture GPUs, allowing developers to access GPU compute capacity in specific regions for both on-demand and long-term computing, supporting strategic and sovereign AI operational requirements. NVIDIA emphasizes that DGX Cloud Lepton unifies access to cloud AI services and GPU capacity across its compute ecosystem, integrating with its software stack to accelerate and simplify AI application development and deployment.
NVIDIA is also making significant investments in Taiwan, establishing AI supercomputers and an overseas headquarters near Taipei. In partnership with Foxconn, NVIDIA is working with the Taiwanese government to build an "AI factory." It also announced an AI supercomputer for Taiwan's National Center for High-Performance Computing (NCHC) to replace the earlier Taiwania 2 system. Based on NVIDIA's HGX H200 platform with more than 1,700 GPUs, the new machine aims to give researchers substantially higher performance for AI workloads; academic institutions, government agencies, and small businesses in Taiwan will be able to apply for access to this resource.

Beyond hardware, NVIDIA has introduced Cosmos-Reason1, a suite of AI models designed to advance physical common sense and embodied reasoning in real-world environments, addressing current models' limitations in understanding and interacting with the physical world. NVIDIA also unveiled a new AI Blueprint for Video Search and Summarization (VSS), which lets developers build AI agents that analyze video streams for applications ranging from manufacturing to smart cities. Pegatron, an electronics manufacturer, reported significant reductions in labor costs and defect rates using agents built with the VSS blueprint, which leverages NVIDIA's language models and connects to enterprise data to provide accurate and efficient video analysis.
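To illustrate the kind of loop a video-analysis agent runs, here is a deliberately simple sketch: sample frames from a stream, describe each with a vision-language model, and aggregate the descriptions. The endpoint URL and payload schema are hypothetical placeholders for this sketch, not the actual VSS blueprint API.

```python
# Illustrative video-summarization loop: sample a frame every ~10 s,
# ask a VLM service to describe it, and collect the answers.
# The /describe endpoint and its JSON schema are invented for this sketch.
import base64

import cv2        # pip install opencv-python
import requests

cap = cv2.VideoCapture("factory_line.mp4")    # example input video
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
descriptions, frame_idx = [], 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % int(fps * 10) == 0:        # one frame every ~10 seconds
        _, jpg = cv2.imencode(".jpg", frame)
        resp = requests.post(
            "http://localhost:9000/describe",  # placeholder VLM service
            json={"image_b64": base64.b64encode(jpg.tobytes()).decode()},
            timeout=60,
        )
        descriptions.append(resp.json()["text"])
    frame_idx += 1
cap.release()

print("\n".join(descriptions))
```

A production agent would add object tracking, timestamps, and retrieval over the accumulated descriptions, but sample-describe-aggregate is the core pattern.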
@developer.nvidia.com
References: developer.nvidia.com, www.tomshardware.com
NVIDIA is making strides in accelerating scientific research and adapting to changing global regulations. The company is focusing on battery innovation through the development of specialized Large Language Models (LLMs) with advanced reasoning capabilities. These models, exemplified by SES AI's Molecular Universe LLM, a 70B parameter model, are designed to overcome the limitations of general-purpose LLMs by incorporating domain-specific knowledge and terminology. This approach significantly enhances performance in specialized fields, enabling tasks such as hypothesis generation, chain-of-thought reasoning, and self-correction, which are critical for driving material exploration and boosting expert productivity.
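The self-correction behavior described above is often implemented as a propose-critique-revise loop around the model. The sketch below shows only that generic pattern; the llm() helper is a stand-in for a call to any domain-tuned model, not SES AI's or NVIDIA's actual API.

```python
# Generic propose-critique-revise loop (hypothesis generation plus
# self-correction). llm() is a placeholder for your model endpoint.
def llm(prompt: str) -> str:
    raise NotImplementedError("wire this to a chat-completion endpoint")

def propose_and_refine(question: str, rounds: int = 2) -> str:
    answer = llm(f"Reason step by step, then answer:\n{question}")
    for _ in range(rounds):
        critique = llm(f"List factual or logical errors in:\n{answer}")
        answer = llm(
            f"Question:\n{question}\n\nDraft answer:\n{answer}\n\n"
            f"Critique:\n{critique}\n\nWrite a corrected answer."
        )
    return answer
```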
NVIDIA is also navigating export control rules by preparing a cut-down version of its HGX H20 AI processor for the Chinese market. This move aims to preserve access to a crucial market while adhering to updated U.S. export regulations that effectively barred the original version. The downgraded GPU will feature reduced HBM memory capacity so that it falls within the newly imposed technical limits, keeping NVIDIA inside the thresholds set by the U.S. government while it continues to serve its global customer base.

In addition to its work on battery research and regulatory compliance, NVIDIA has introduced Audio-SDS, a unified diffusion-based framework for prompt-guided audio synthesis and source separation. The framework uses a single pretrained model to perform multiple audio tasks without requiring specialized datasets. By adapting Score Distillation Sampling (SDS) to audio diffusion, it enables the optimization of parametric audio representations, uniting signal-processing interpretability with the flexibility of modern diffusion-based generation. The approach integrates data-driven priors with explicit parameter control and produces perceptually compelling results.
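For context, Score Distillation Sampling (introduced for images in DreamFusion) back-propagates a frozen diffusion model's denoising error into the parameters of a differentiable generator. The standard image-domain gradient is shown below; Audio-SDS applies the same distillation signal to audio rendered from parametric synthesizers, though its exact weighting and conditioning may differ.

\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta) = \mathbb{E}_{t,\epsilon}\left[ w(t)\,\big(\hat{\epsilon}_\phi(x_t;\, y, t) - \epsilon\big)\,\frac{\partial x}{\partial \theta} \right], \qquad x = g(\theta)

Here g(\theta) is the differentiable renderer (an audio synthesizer in Audio-SDS), x_t is the render noised to diffusion timestep t, \hat{\epsilon}_\phi is the pretrained model's noise prediction conditioned on prompt y, and w(t) is a timestep weighting.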
Alex Woodie@BigDATAwire
OpenSearch 3.0 has been released by the OpenSearch Software Foundation, marking its first major release under the Linux Foundation. This new version aims to compete with Elasticsearch by providing significant performance improvements, particularly in AI workloads. Organizations that adopt OpenSearch 3.0 for big data search, analytics, and AI are expected to see a 9.5x performance increase compared to previous versions. The release includes GPU acceleration that cuts costs and is designed to handle billions of vectors for AI applications like generative AI chatbots and retrieval-augmented generation, demonstrating its capabilities as a fully functional, scalable vector database.
The key highlight of OpenSearch 3.0 is GPU-accelerated vector search, an experimental feature that uses NVIDIA GPUs via the cuVS library to speed up the OpenSearch Vector Engine. The feature yields a 9.3x boost in vector database workload performance and cuts costs by a factor of 3.75 compared with CPU-only deployments. By offloading computationally intensive vector operations to the GPU, OpenSearch 3.0 dramatically reduces index build times and speeds up data-intensive workloads. The platform also supports the Model Context Protocol (MCP), enabling AI agents to communicate directly with OpenSearch. The release further adds the gRPC protocol for data transport, pull-based ingestion for more efficient streaming from sources such as Apache Kafka, and an upgrade to Apache Lucene 10 for indexing and search. Core upgrades to the Java codebase, which now requires Java 21, include the removal of legacy code and adoption of the Java Platform Module System, with the aim of refactoring the monolithic server module into libraries. Together these changes target fast AI search, scalable vector database operations, and improved overall performance relative to both predecessors and competitors.
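For readers who want to try the vector engine, here is a minimal sketch using the official opensearch-py client. The knn_vector mapping and knn query shown are the standard OpenSearch k-NN API; GPU-accelerated index builds via cuVS are an experimental server-side feature in 3.0, so client code like this is unchanged. The host, index name, and dimension are examples.

```python
# Create a k-NN index, ingest one vector, and run a nearest-neighbor
# query against a local OpenSearch cluster.
from opensearchpy import OpenSearch  # pip install opensearch-py

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

client.indices.create(
    index="docs",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {"properties": {"embedding": {
            "type": "knn_vector",
            "dimension": 4,  # toy size; real embeddings are far larger
            "method": {"name": "hnsw", "engine": "faiss",
                       "space_type": "l2"},
        }}},
    },
)

client.index(index="docs", id="1",
             body={"embedding": [0.1, 0.2, 0.3, 0.4]}, refresh=True)

hits = client.search(index="docs", body={
    "size": 1,
    "query": {"knn": {"embedding": {"vector": [0.1, 0.2, 0.3, 0.4],
                                    "k": 1}}},
})
print(hits["hits"]["hits"][0]["_id"])  # -> "1"
```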
Ian Buck@NVIDIA Blog
NVIDIA and CoreWeave have strengthened their partnership, bringing NVIDIA's Grace Blackwell GB200 NVL72 systems online at scale for CoreWeave customers. This collaboration aims to empower AI pioneers with cutting-edge technology for developing and deploying next-generation AI models and applications. CoreWeave is among the first cloud providers to offer this solution, demonstrating its commitment to providing the latest and most powerful AI infrastructure. The relationship between NVIDIA and CoreWeave is well-established, with NVIDIA having invested heavily in CoreWeave.
CoreWeave's cloud services are optimized for the GB200 NVL72, encompassing its Kubernetes Service, Slurm on Kubernetes (SUNK), Mission Control, and more. These Blackwell instances can scale to up to 110,000 Blackwell GPUs using NVIDIA Quantum-2 InfiniBand networking. Leading AI companies such as Cohere, IBM, and Mistral AI are already using these systems to train and run complex AI models. CoreWeave CEO Mike Intrator said the company works closely with NVIDIA to quickly deliver the most powerful solutions for training AI models and serving inference.

Several AI companies are reporting significant performance gains on the new Grace Blackwell systems. Cohere is using the Grace Blackwell Superchips to develop secure enterprise AI applications, reporting up to three times more training performance for 100-billion-parameter models compared to previous-generation NVIDIA Hopper GPUs. IBM is scaling its deployment to thousands of Blackwell GPUs on CoreWeave to train its Granite open-source AI models for IBM watsonx Orchestrate. Mistral AI is using its first thousand Blackwell GPUs to build the next generation of open-source AI models, noting a 2x improvement in performance for dense model training.