Andrew Jolly@AIwire
References: AWS News Blog, AIwire
NVIDIA is making significant strides in AI computing with the release of RTX AI PCs and Workstations, designed to accelerate coding assistant performance. These AI-powered copilots are fundamentally changing software development, providing real-time assistance to both experienced and novice developers. Coding assistants, optimized for RTX AI PCs, offer suggestions, explanations, and debugging capabilities, streamlining tasks and enhancing productivity across various projects, from academic endeavors to production code. These assistants can run locally, eliminating the latency and subscription costs associated with cloud-based alternatives.
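To make the local-first claim concrete, here is a minimal sketch of talking to a coding assistant hosted on your own machine. It assumes a local runtime such as llama.cpp's server or Ollama is already serving an OpenAI-compatible endpoint; the URL, port, and model name below are placeholders, not an NVIDIA-specific API.

```python
# Query a locally hosted coding assistant over an OpenAI-compatible
# chat-completions endpoint. The endpoint and model name are assumptions:
# adjust them to whatever your local runtime actually exposes.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumed local server
    json={
        "model": "local-code-model",  # hypothetical model identifier
        "messages": [
            {"role": "system", "content": "You are a concise coding assistant."},
            {"role": "user",
             "content": "Explain this error: IndexError: list index out of range"},
        ],
        "temperature": 0.2,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the round trip never leaves the machine, there is no network latency and no per-token bill, which is precisely the appeal of running assistants on RTX-class local hardware.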
CoreWeave has emerged as the first cloud platform to offer NVIDIA RTX PRO 6000 Blackwell Server Edition instances at scale. The new instances deliver up to 5.6x faster LLM inference and 3.5x faster text-to-video generation than the previous GPU generation. The RTX PRO 6000 is tailored for inference of models up to 70B parameters, providing a cost-efficient alternative to larger GPU clusters while maintaining strong performance for teams developing and scaling AI applications. CoreWeave now offers one of the widest ranges of NVIDIA Blackwell infrastructure on the market, which also includes the NVIDIA GB200 NVL72 system and the NVIDIA HGX B200 platform.

Amazon Web Services (AWS) has also announced the general availability of EC2 P6e-GB200 UltraServers, powered by NVIDIA Grace Blackwell GB200 superchips and aimed at AI training and inference at the trillion-parameter scale. Each UltraServer delivers up to 72 GPUs with 360 petaflops of computing power. A Grace Blackwell superchip pairs two high-performance NVIDIA Blackwell Tensor Core GPUs with an NVIDIA Grace CPU over the NVLink-C2C interconnect, substantially boosting bandwidth between GPU and CPU. UltraServers are deployed in EC2 UltraClusters, which scale securely and reliably to tens of thousands of GPUs, making them well suited to compute-intensive workloads such as training frontier models and building generative AI applications.
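As a rough sanity check on those headline numbers, dividing the quoted aggregate compute by the GPU count gives the per-GPU figure below. The precision (FP8, FP4, etc.) behind the petaflops figure is not stated here, so treat this as a ratio of the article's numbers rather than a hardware spec.

```python
# Back-of-envelope arithmetic from the figures quoted above:
# 360 petaflops spread across up to 72 GPUs in one UltraServer.
total_pflops = 360
gpus = 72
print(f"~{total_pflops / gpus:.1f} petaflops per GPU")  # ~5.0
```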
Jim McGregor@Tirias Research
Advanced Micro Devices Inc. has launched its new AMD Instinct MI350 Series accelerators, designed to power AI data centers and to outperform Nvidia Corp.'s Blackwell B200 in specific tasks. The series includes the top-end, liquid-cooled MI355X alongside the fan-cooled MI350X. These flagship data center graphics cards pack 185 billion transistors and use a three-dimensional, 10-chiplet design to boost AI compute and inferencing capabilities.
The MI350 Series introduces significant performance improvements, with AMD claiming four times faster AI compute and 35 times faster inferencing than the previous generation. The accelerators ship with 288 gigabytes of HBM3E memory, a three-dimensional design in which layers of circuits are stacked atop one another; according to AMD, that is 60% more memory than Nvidia's flagship Blackwell B200 graphics cards. The MI350 chips also process 8-bit floating point numbers 10% faster than the B200, and 4-bit floating point numbers more than twice as fast. Alongside the hardware, AMD is rolling out its ROCm 7 software development platform for the Instinct accelerators and the Helios Rack AI platform. "With flexible air-cooled and direct liquid-cooled configurations, the Instinct MI350 Series is optimized for seamless deployment, supporting up to 64 GPUs in an air-cooled rack and up to 128 in a direct liquid-cooled [rack], and scaling up to 2.6 exaFLOPS of FP4 performance," stated Vamsi Boppana, senior vice president of AMD's Artificial Intelligence Group. The advancements aim to provide an open, scalable, rack-scale AI infrastructure built on industry standards, setting the stage for transformative AI solutions across industries.
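The quoted rack-scale figure can be sanity-checked the same way: 2.6 exaFLOPS of FP4 across a 128-GPU liquid-cooled rack implies roughly 20 petaflops of FP4 per accelerator. This is a ratio of the numbers above, not an official per-GPU specification.

```python
# Implied per-GPU FP4 throughput from the quoted rack numbers.
rack_exaflops_fp4 = 2.6
gpus_per_rack = 128
print(f"~{rack_exaflops_fp4 * 1000 / gpus_per_rack:.1f} PFLOPS FP4 per GPU")  # ~20.3
```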
@www.marktechpost.com
NVIDIA has recently unveiled several key initiatives aimed at expanding its reach in the AI landscape, particularly in compute capabilities and physical AI applications. The company announced DGX Cloud Lepton, an AI platform with a compute marketplace that connects developers building agentic and physical AI applications with tens of thousands of GPUs from a network of cloud providers. This platform will offer NVIDIA Blackwell and other NVIDIA architecture GPUs, allowing developers to access GPU compute capacity in specific regions for both on-demand and long-term computing, supporting strategic and sovereign AI operational requirements. NVIDIA emphasizes that DGX Cloud Lepton unifies access to cloud AI services and GPU capacity across its compute ecosystem, integrating with its software stack to accelerate and simplify AI application development and deployment.
NVIDIA is also making significant investments in Taiwan, establishing AI supercomputers and an overseas headquarters near Taipei. In partnership with Foxconn, NVIDIA is working with the Taiwanese government to build an "AI factory." It also announced an AI supercomputer for Taiwan's National Center for High-Performance Computing (NCHC) to replace the earlier Taiwania 2 system. Based on NVIDIA's HGX H200 platform with more than 1,700 GPUs, the new machine aims to give researchers substantially higher performance for AI workloads; academic institutions, government agencies, and small businesses in Taiwan will be able to apply for access to this resource.

Beyond hardware, NVIDIA has introduced Cosmos-Reason1, a suite of AI models designed to advance physical common sense and embodied reasoning in real-world environments, addressing current models' limitations in understanding and interacting with the physical world. NVIDIA also unveiled a new AI Blueprint for Video Search and Summarization (VSS), which lets developers build AI agents that analyze video streams for applications ranging from manufacturing to smart cities. Pegatron, an electronics manufacturer, reported significant reductions in labor costs and defect rates using agents built with the VSS blueprint, which leverages NVIDIA's language models and connects to enterprise data to provide accurate and efficient video analysis.
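To illustrate the kind of loop a video-analysis agent runs, here is a deliberately simple sketch: sample frames from a stream, describe each with a vision-language model, and aggregate the descriptions. The endpoint URL and payload schema are hypothetical placeholders for this sketch, not the actual VSS blueprint API.

```python
# Illustrative video-summarization loop: sample a frame every ~10 s,
# ask a VLM service to describe it, and collect the answers.
# The /describe endpoint and its JSON schema are invented for this sketch.
import base64

import cv2        # pip install opencv-python
import requests

cap = cv2.VideoCapture("factory_line.mp4")    # example input video
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
descriptions, frame_idx = [], 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % int(fps * 10) == 0:        # one frame every ~10 seconds
        _, jpg = cv2.imencode(".jpg", frame)
        resp = requests.post(
            "http://localhost:9000/describe",  # placeholder VLM service
            json={"image_b64": base64.b64encode(jpg.tobytes()).decode()},
            timeout=60,
        )
        descriptions.append(resp.json()["text"])
    frame_idx += 1
cap.release()

print("\n".join(descriptions))
```

A production agent would add object tracking, timestamps, and retrieval over the accumulated descriptions, but sample-describe-aggregate is the core pattern.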
@developer.nvidia.com
References: developer.nvidia.com, www.tomshardware.com
NVIDIA is making strides in accelerating scientific research and adapting to changing global regulations. The company is focusing on battery innovation through the development of specialized Large Language Models (LLMs) with advanced reasoning capabilities. These models, exemplified by SES AI's Molecular Universe LLM, a 70B parameter model, are designed to overcome the limitations of general-purpose LLMs by incorporating domain-specific knowledge and terminology. This approach significantly enhances performance in specialized fields, enabling tasks such as hypothesis generation, chain-of-thought reasoning, and self-correction, which are critical for driving material exploration and boosting expert productivity.
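The self-correction behavior described above is often implemented as a propose-critique-revise loop around the model. The sketch below shows only that generic pattern; the llm() helper is a stand-in for a call to any domain-tuned model, not SES AI's or NVIDIA's actual API.

```python
# Generic propose-critique-revise loop (hypothesis generation plus
# self-correction). llm() is a placeholder for your model endpoint.
def llm(prompt: str) -> str:
    raise NotImplementedError("wire this to a chat-completion endpoint")

def propose_and_refine(question: str, rounds: int = 2) -> str:
    answer = llm(f"Reason step by step, then answer:\n{question}")
    for _ in range(rounds):
        critique = llm(f"List factual or logical errors in:\n{answer}")
        answer = llm(
            f"Question:\n{question}\n\nDraft answer:\n{answer}\n\n"
            f"Critique:\n{critique}\n\nWrite a corrected answer."
        )
    return answer
```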
NVIDIA is also navigating export control rules by preparing a cut-down version of its HGX H20 AI processor for the Chinese market. This move aims to preserve access to a crucial market while adhering to updated U.S. export regulations that effectively barred the original version. The downgraded GPU will feature reduced HBM memory capacity so that it falls within the newly imposed technical limits, keeping NVIDIA inside the thresholds set by the U.S. government while it continues to serve its global customer base.

In addition to its work on battery research and regulatory compliance, NVIDIA has introduced Audio-SDS, a unified diffusion-based framework for prompt-guided audio synthesis and source separation. The framework uses a single pretrained model to perform multiple audio tasks without requiring specialized datasets. By adapting Score Distillation Sampling (SDS) to audio diffusion, it enables the optimization of parametric audio representations, uniting signal-processing interpretability with the flexibility of modern diffusion-based generation. The approach integrates data-driven priors with explicit parameter control and produces perceptually compelling results.
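For context, Score Distillation Sampling (introduced for images in DreamFusion) back-propagates a frozen diffusion model's denoising error into the parameters of a differentiable generator. The standard image-domain gradient is shown below; Audio-SDS applies the same distillation signal to audio rendered from parametric synthesizers, though its exact weighting and conditioning may differ.

\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta) = \mathbb{E}_{t,\epsilon}\left[ w(t)\,\big(\hat{\epsilon}_\phi(x_t;\, y, t) - \epsilon\big)\,\frac{\partial x}{\partial \theta} \right], \qquad x = g(\theta)

Here g(\theta) is the differentiable renderer (an audio synthesizer in Audio-SDS), x_t is the render noised to diffusion timestep t, \hat{\epsilon}_\phi is the pretrained model's noise prediction conditioned on prompt y, and w(t) is a timestep weighting.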
Alex Woodie@BigDATAwire
OpenSearch 3.0 has been released by the OpenSearch Software Foundation, marking its first major release under the Linux Foundation. This new version aims to compete with Elasticsearch by providing significant performance improvements, particularly in AI workloads. Organizations that adopt OpenSearch 3.0 for big data search, analytics, and AI are expected to see a 9.5x performance increase compared to previous versions. The release includes GPU acceleration that cuts costs and is designed to handle billions of vectors for AI applications like generative AI chatbots and retrieval-augmented generation, demonstrating its capabilities as a fully functional, scalable vector database.
The key highlight of OpenSearch 3.0 is GPU-accelerated vector search, an experimental feature that uses NVIDIA GPUs via the cuVS library to speed up the OpenSearch Vector Engine. The feature yields a 9.3x boost in vector database workload performance and cuts costs by a factor of 3.75 compared with CPU-only deployments. By offloading computationally intensive vector operations to the GPU, OpenSearch 3.0 dramatically reduces index build times and speeds up data-intensive workloads. The platform also supports the Model Context Protocol (MCP), enabling AI agents to communicate directly with OpenSearch. The release further adds the gRPC protocol for data transport, pull-based ingestion for more efficient streaming from sources such as Apache Kafka, and an upgrade to Apache Lucene 10 for indexing and search. Core upgrades to the Java codebase, which now requires Java 21, include the removal of legacy code and adoption of the Java Platform Module System, with the aim of refactoring the monolithic server module into libraries. Together these changes target fast AI search, scalable vector database operations, and improved overall performance relative to both predecessors and competitors.
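For readers who want to try the vector engine, here is a minimal sketch using the official opensearch-py client. The knn_vector mapping and knn query shown are the standard OpenSearch k-NN API; GPU-accelerated index builds via cuVS are an experimental server-side feature in 3.0, so client code like this is unchanged. The host, index name, and dimension are examples.

```python
# Create a k-NN index, ingest one vector, and run a nearest-neighbor
# query against a local OpenSearch cluster.
from opensearchpy import OpenSearch  # pip install opensearch-py

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

client.indices.create(
    index="docs",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {"properties": {"embedding": {
            "type": "knn_vector",
            "dimension": 4,  # toy size; real embeddings are far larger
            "method": {"name": "hnsw", "engine": "faiss",
                       "space_type": "l2"},
        }}},
    },
)

client.index(index="docs", id="1",
             body={"embedding": [0.1, 0.2, 0.3, 0.4]}, refresh=True)

hits = client.search(index="docs", body={
    "size": 1,
    "query": {"knn": {"embedding": {"vector": [0.1, 0.2, 0.3, 0.4],
                                    "k": 1}}},
})
print(hits["hits"]["hits"][0]["_id"])  # -> "1"
```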
Ian Buck@NVIDIA Blog
NVIDIA and CoreWeave have strengthened their partnership, bringing NVIDIA's Grace Blackwell GB200 NVL72 systems online at scale for CoreWeave customers. This collaboration aims to empower AI pioneers with cutting-edge technology for developing and deploying next-generation AI models and applications. CoreWeave is among the first cloud providers to offer this solution, demonstrating its commitment to providing the latest and most powerful AI infrastructure. The relationship between NVIDIA and CoreWeave is well-established, with NVIDIA having invested heavily in CoreWeave.
CoreWeave's cloud services are optimized for the GB200 NVL72, encompassing its Kubernetes Service, Slurm on Kubernetes (SUNK), Mission Control, and more. These Blackwell instances can scale to up to 110,000 Blackwell GPUs using NVIDIA Quantum-2 InfiniBand networking. Leading AI companies such as Cohere, IBM, and Mistral AI are already using these systems to train and run complex AI models. CoreWeave CEO Mike Intrator said the company works closely with NVIDIA to quickly deliver the most powerful solutions for training AI models and serving inference.

Several AI companies are reporting significant performance gains on the new Grace Blackwell systems. Cohere is using the Grace Blackwell Superchips to develop secure enterprise AI applications, reporting up to three times more training performance for 100-billion-parameter models compared to previous-generation NVIDIA Hopper GPUs. IBM is scaling its deployment to thousands of Blackwell GPUs on CoreWeave to train its Granite open-source AI models for IBM watsonx Orchestrate. Mistral AI is using its first thousand Blackwell GPUs to build the next generation of open-source AI models, noting a 2x improvement in performance for dense model training.