News from the AI & ML world

DeeperML - #aiinfrastructure

Brian Wang@NextBigFuture.com //
xAI's latest artificial intelligence model, Grok 4, has been unveiled, showcasing significant advancements according to leaked benchmarks. Reports indicate Grok 4 scored 45% on Humanity's Last Exam (HLE) with reasoning enabled, a substantial leap that suggests the model could surpass current industry leaders. The development highlights the rapidly intensifying competition within the AI sector and has generated considerable excitement among AI enthusiasts and researchers anticipating the official release and further performance evaluations.

The release of Grok 4 follows recent controversies surrounding earlier versions of the chatbot, which exhibited problematic behavior, including the dissemination of antisemitic remarks and conspiracy theories. Elon Musk's xAI has issued apologies for these incidents, stating that a recent code update contributed to the offensive outputs. The company has committed to addressing these issues, including making system prompts public to ensure greater transparency and prevent future misconduct. Despite these past challenges, the focus now shifts to Grok 4's promised enhanced capabilities and its potential to set new standards in AI performance.

Alongside the base Grok 4 model, xAI has also introduced Grok 4 Heavy, a multi-agent system reportedly capable of scoring 50% on Humanity's Last Exam. The company has also announced new subscription plans, including a $300-per-month option for the "SuperGrok Heavy" tier. These tiered offerings suggest a strategy to cater to different user needs, from general consumers to power users and developers. New connectors for platforms such as Notion, Slack, and Gmail are also planned, aiming to broaden Grok's utility and integrate it more seamlessly into users' workflows.

Recommended read:
References:
  • NextBigFuture.com: xAI Grok 4 benchmarks show it is the leading model. Humanity's Last Exam at 35, and 45 with reasoning, is a big improvement over roughly 21 for other top models. If the leaked Grok 4 benchmarks are correct (95 AIME, 88 GPQA, 75 SWE-bench), then xAI has the most powerful model on the market. ...
  • TestingCatalog: Grok 4 will be SOTA, according to the leaked benchmarks; 35% on HLE, 45% with reasoning; 87-88% on GPQA; 72-75% on SWE Bench (for Grok 4 Code)
  • felloai.com: Elon Musk’s Grok 4 AI Just Leaked, and It’s Crushing All the Competitors
  • Fello AI: Elon Musk’s Grok 4 AI Just Leaked, and It’s Crushing All the Competitors
  • techxplore.com: Musk's AI company scrubs inappropriate posts after Grok chatbot makes antisemitic comments
  • NextBigFuture.com: XAI Grok 4 Releases Wednesday July 9 at 8pm PST
  • www.theguardian.com: Musk’s AI firm forced to delete posts praising Hitler from Grok chatbot
  • felloai.com: xAI Just Introduced Grok 4: Elon Musk’s AI Breaks Benchmarks and Beats Other LLMs
  • Fello AI: xAI Just Introduced Grok 4: Elon Musk’s AI Breaks Benchmarks and Beats Other LLMs
  • thezvi.substack.com: Last night, on the heels of some rather unfortunate incidents involving the Twitter version of Grok 3, xAI released Grok 4.
  • thezvi.wordpress.com: Last night, on the heels of some rather unfortunate incidents involving the Twitter version of Grok 3, xAI released Grok 4.
  • TestingCatalog: xAI plans expanded model lineup and Grok 4 set for July 9 debut.
  • TestingCatalog: xAI released Grok 4 and Grok 4 Heavy along with a new $300 subscription plan. Grok 4 Heavy is a multi-agent system able to achieve a 50% score on the HLE benchmark.
  • www.rdworldonline.com: xAI releases Grok 4, claiming Ph.D.-level smarts across all fields
  • NextBigFuture.com: Theo-gg, who has been critical of xAI in the past, confirms that xAI's Grok 4 is the top model.
  • TestingCatalog: New xAI connector will bring Notion support to Grok alongside Slack and Gmail
  • Interconnects: xAI's Grok 4: The tension of frontier performance with a side of Elon favoritism
  • NextBigFuture.com: XAI Grok 4 Revolution: AI Breakthroughs, Tesla’s Future, and Economic Shifts
  • www.tomsguide.com: Grok 4 is here — Elon Musk says it's the same model physicists use
  • Latest news: Musk claims new Grok 4 beats o3 and Gemini 2.5 Pro - how to try it

Jowi Morales@tomshardware.com //
NVIDIA is partnering with Germany and Deutsche Telekom to build Europe's first industrial AI cloud, a project hailed as one of the most ambitious tech endeavors on the continent. The initiative aims to establish Germany as a leader in AI manufacturing and innovation. NVIDIA CEO Jensen Huang met with Chancellor Friedrich Merz to discuss the new partnerships that will drive breakthroughs on this AI cloud.

This "AI factory," located in Germany, will provide European industrial leaders with the computational power needed to revolutionize manufacturing processes, from design and engineering to simulation and robotics. The goal is to empower European industrial players to lead in simulation-first, AI-driven manufacturing. Deutsche Telekom's CEO, Timotheus Höttges, emphasized the urgency of seizing AI opportunities to revolutionize the industry and secure a leading position in global technology competition.

The first phase of the project will involve deploying 10,000 NVIDIA Blackwell GPUs across various high-performance systems, making it Germany's largest AI deployment. This infrastructure will also feature NVIDIA networking and AI software. NEURA Robotics, a German firm specializing in cognitive robotics, plans to utilize these resources to power its Neuraverse, a network where robots can learn from each other. This partnership between NVIDIA and Germany signifies a critical step towards achieving technological sovereignty in Europe and accelerating AI development across industries.

Recommended read:
References:
  • NVIDIA Newsroom: NVIDIA and Deutsche Telekom Partner to Advance Germany’s Sovereign AI
  • www.artificialintelligence-news.com: NVIDIA helps Germany lead Europe’s AI manufacturing race
  • www.tomshardware.com: Nvidia is building the 'world's first' industrial AI cloud—German facility to leverage 10,000 GPUs, DGX B200, and RTX Pro servers
  • AI News: NVIDIA helps Germany lead Europe’s AI manufacturing race
  • blogs.nvidia.com: NVIDIA and Deutsche Telekom Partner to Advance Germany’s Sovereign AI
  • MSSP feed for Latest: CrowdStrike and Nvidia Add LLM Security, Offer New Service for MSSPs
  • www.verdict.co.uk: Nvidia to develop industrial AI cloud for manufacturers in Europe
  • Verdict: Nvidia to develop industrial AI cloud for manufacturers in Europe
  • insideAI News: AMD Announces New GPUs, Development Platform, Rack Scale Architecture
  • insidehpc.com: AMD Announces New GPUs, Development Platform, Rack Scale Architecture
  • www.itpro.com: Nvidia, Deutsche Telekom team up for "sovereign" industrial AI cloud

@techinformed.com //
NVIDIA CEO Jensen Huang and UK Prime Minister Keir Starmer recently joined forces at London Tech Week to cement the UK's position as a leader in AI. The collaboration aims to bolster the nation's digital infrastructure and promote AI development across sectors. Starmer has committed £1 billion of investment to supercharge the AI sector, emphasizing the UK's ambition to be at the forefront of AI innovation rather than merely a consumer of the technology. Huang highlighted the UK's rich AI community, world-class universities, and significant AI capital investment as key factors positioning it for success.

The partnership includes the establishment of a dedicated NVIDIA AI Technology Center in the UK. This center will provide hands-on training in crucial areas such as AI, data science, and accelerated computing, with a specific focus on nurturing talent in foundation model building, embodied AI, materials science, and earth systems modeling. The initiative aims to tackle the existing AI skills gap, ensuring that the UK has a workforce capable of leveraging the new infrastructure and technologies being developed. Cloud providers are also stepping up with significant GPU deployments, with Nscale planning 10,000 NVIDIA Blackwell GPUs by late 2026 and Nebius revealing plans for an AI factory with 4,000 NVIDIA Blackwell GPUs.

Furthermore, the UK's financial sector is set to benefit from this AI push. The Financial Conduct Authority (FCA) is launching a 'supercharged sandbox' scheme, allowing banks and other City firms to experiment safely with NVIDIA AI products. This initiative aims to speed up innovation and boost UK growth by integrating AI into the financial sector. Potential applications include intercepting authorized push payment fraud and identifying stock market manipulation, showcasing the potential of AI to enhance customer service and data analytics within the financial industry.

Recommended read:
References:

@www.marktechpost.com //
Nvidia is reportedly developing a new AI chip, the B30, tailored for the Chinese market to comply with U.S. export controls. This Blackwell-based alternative aims to offer multi-GPU scaling, potentially through NVLink or ConnectX-8 SuperNICs. While earlier reports suggested other names, such as RTX Pro 6000D or B40, the B30 could be one variant within a broader BXX family. The design reportedly incorporates GB20X silicon, which also powers consumer-grade RTX 50 GPUs, but may lack the NVLink support seen in prior generations, since consumer-grade GPU dies omit it.

Nvidia has also introduced Fast-dLLM, a training-free framework designed to enhance the inference speed of diffusion large language models (LLMs). Diffusion models, explored as an alternative to autoregressive models, promise faster decoding through simultaneous multi-token generation, enabled by bidirectional attention mechanisms. However, their practical application is limited by inefficient inference, largely due to the lack of key-value (KV) caching, which accelerates performance by reusing previously computed attention states. Fast-dLLM aims to address this by bringing KV caching and parallel decoding capabilities to diffusion LLMs, potentially surpassing autoregressive systems.
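The parallel-decoding idea described above can be illustrated with a toy sketch: instead of unmasking one token per forward pass, every masked position whose prediction clears a confidence threshold is filled in the same step. This is a simplified illustration under stated assumptions, not Fast-dLLM's actual implementation (which additionally adds block-wise KV caching); the stand-in model and its vocabulary are entirely hypothetical.

```python
MASK = "<mask>"

def toy_model(tokens):
    """Stand-in for a diffusion LLM: returns (best_token, confidence) for
    every masked position. A real model would run bidirectional attention
    over the whole sequence; these fixed predictions are hypothetical."""
    vocab = {0: ("the", 0.95), 1: ("cat", 0.60), 2: ("sat", 0.97)}
    return {i: vocab[i] for i, tok in enumerate(tokens) if tok == MASK}

def parallel_decode_step(tokens, threshold=0.9):
    """Unmask, in one pass, every position whose confidence clears the
    threshold -- several tokens per forward pass instead of one."""
    preds = toy_model(tokens)
    out = list(tokens)
    for pos, (tok, conf) in preds.items():
        if conf >= threshold:
            out[pos] = tok
    return out

# One step fills positions 0 and 2 together; position 1 (confidence 0.60)
# stays masked until a later step or a lower threshold.
step1 = parallel_decode_step([MASK, MASK, MASK])
```

Lowering the threshold trades accuracy for speed: at `threshold=0.5`, all three positions would be unmasked in a single pass.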

During his keynote speech at GTC 2025, Nvidia CEO Jensen Huang emphasized the accelerating pace of artificial intelligence development and the critical need for optimized AI infrastructure. He stated Nvidia would shift to the Blackwell architecture for future China-bound chips, discontinuing Hopper-based alternatives following the H20 ban. Huang's focus on AI infrastructure highlights the industry's recognition of the importance of robust and scalable systems to support the growing demands of AI applications.

Recommended read:
References:
  • thenewstack.io: This article discusses Jensen Huang's keynote speech at GTC 2025, where he emphasized the acceleration of artificial intelligence development and outlined five key takeaways regarding optimizing AI infrastructure.
  • MarkTechPost: This article discusses NVIDIA's Fast-dLLM, a training-free framework that brings KV caching and parallel decoding to diffusion LLMs. It aims to improve inference speed in diffusion models, potentially surpassing autoregressive systems.
  • www.tomshardware.com: This article discusses the development of Nvidia's B30 AI chip specifically for the Chinese market. It highlights the potential inclusion of NVLink for multi-GPU scaling and the creation of high-performance clusters.
  • www.marktechpost.com: NVIDIA has introduced Llama Nemotron Nano VL, a vision-language model (VLM) designed to address document-level understanding tasks with efficiency and precision.

@futurumgroup.com //
NVIDIA reported a significant jump in Q1 FY 2026 revenue, up 69% year-over-year to $44.1 billion. Growth was fueled by strong demand in both the data center and gaming segments, driven by anticipation and initial deployments of the Blackwell architecture. Despite export restrictions affecting its H20 chip in China, NVIDIA's performance reflects sustained global demand for AI computing. The company is actively scaling Blackwell deployments while navigating these export-related challenges, with traction in sovereign AI initiatives helping offset the headwinds in China.

NVIDIA CEO Jensen Huang highlighted the full-scale production of the Blackwell NVL72 AI supercomputer, describing it as a "thinking machine" for reasoning. He emphasized the incredibly strong global demand for NVIDIA's AI infrastructure, noting a tenfold surge in AI inference token generation within a year. Huang anticipates that as AI agents become more mainstream, demand for AI computing will accelerate further. The company's data center revenue reached $39.1 billion, a 73% increase year-over-year, showcasing the impact of the Blackwell ramp-up and the adoption of accelerated AI inference.

Beyond infrastructure, NVIDIA is also expanding its reach through strategic partnerships. NVIDIA and MediaTek are collaborating to develop an ARM-based mobile APU specifically designed for gaming laptops. This collaboration aims to combine NVIDIA’s graphics expertise with MediaTek’s compute capabilities to create a product that could rival AMD’s Strix Halo. The planned APU will focus on power efficiency and thermal performance, which are crucial for modern gaming laptops with thinner chassis.

Recommended read:
References:
  • blogs.nvidia.com: Since a 7.8-magnitude earthquake hit Syria and Türkiye two years ago — leaving 55,000 people dead, 130,000 injured and millions displaced from their homes — students, researchers and developers have been harnessing the latest AI robotics technologies to increase disaster preparedness in the region.
  • futurumgroup.com: Olivier Blanchard and Daniel Newman at Futurum analyse NVIDIA’s Q1 FY 2026 results.
  • www.club386.com: NVIDIA and MediaTek join forces to build an ARM-based mobile APU combining their compute and graphics expertise.
  • NVIDIA Newsroom: ‘AI Maker, Not an AI Taker’:
  • www.artificialintelligence-news.com: UK tackles AI skills gap through NVIDIA partnership
  • techinformed.com: Nvidia can boost UK’s digital infrastructure, says Huang as Starmer promises £1bn for AI

Heng Chi@AI Accelerator Institute //
AI is revolutionizing data management and analytics across various platforms. Amazon Web Services (AWS) is facilitating the development of high-performance data pipelines for AI and Natural Language Processing (NLP) applications, utilizing services like Amazon S3, AWS Lambda, AWS Glue, and Amazon SageMaker. These pipelines are essential for ingesting, processing, and providing output for training, inference, and decision-making at a large scale, leveraging AWS's scalability, flexibility, and cost-efficiency. AWS's auto-scaling options, seamless integration with ML and NLP workflows, and pay-as-you-go pricing model make it a preferred choice for businesses of all sizes.
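One stage of such a pipeline can be sketched as an AWS Lambda handler triggered by an S3 upload, which would hand each new object off to the next stage (for example, a Glue job or a SageMaker endpoint). This is a minimal sketch, not AWS's reference implementation: the event shape follows S3's standard notification format, but the job name and bucket are hypothetical, and the actual Glue call is indicated in a comment rather than executed.

```python
def handler(event, context=None):
    """Lambda entry point for an S3-triggered ingestion step."""
    ingested = []
    for rec in event.get("Records", []):
        # S3 notification events carry bucket and key under rec["s3"].
        bucket = rec["s3"]["bucket"]["name"]
        key = rec["s3"]["object"]["key"]
        # In a real pipeline, kick off the processing stage, e.g.:
        # boto3.client("glue").start_job_run(
        #     JobName="nlp-preprocess",                       # hypothetical job
        #     Arguments={"--input": f"s3://{bucket}/{key}"})
        ingested.append({"bucket": bucket, "key": key})
    return {"ingested": ingested}

# A sample S3 event, as Lambda would receive it on upload:
sample_event = {"Records": [{"s3": {"bucket": {"name": "raw-docs"},
                                    "object": {"key": "2025/report.txt"}}}]}
result = handler(sample_event)
```

Keeping each stage a small, stateless handler like this is what lets the pipeline scale out automatically under AWS's pay-as-you-go model.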

Microsoft is simplifying data visualization with its new AI-powered tool, Data Formulator. This open-source application, developed by Microsoft Research, uses Large Language Models (LLMs) to transform data into interesting charts and graphs, even for users without extensive data manipulation and visualization knowledge. Data Formulator differentiates itself with its intuitive user interface and hybrid interactions, bridging the gap between visualization ideas and their actual creation. By supplementing natural language inputs with drag-and-drop interactions, it allows users to express visualization intent, with the AI handling the complex transformations in the background.

Yandex has released Yambda, the world's largest publicly available event dataset, to accelerate recommender systems research and development. This dataset contains nearly 5 billion anonymized user interaction events from Yandex Music, offering a valuable resource for bridging the gap between academic research and industry-scale applications. Yambda addresses the scarcity of large, openly accessible datasets in the field of recommender systems, which has traditionally lagged behind other AI domains due to the sensitive nature and commercial value of behavioral data. Additionally, Dremio is collaborating with Confluent’s TableFlow to provide real-time analytics on Apache Iceberg data, enabling users to stream data from Kafka into queryable tables without manual pipelines, accelerating insights and reducing ETL complexity.

Recommended read:
References:
  • insideAI News: NVIDIA and AMD Devising Export Rules-Compliant Chips for China AI Market
  • futurumgroup.com: Can Dell and NVIDIA’s AI Factory 2.0 Solve Enterprise-Scale AI Infrastructure Gaps?
  • TechHQ: Dell to build Nvidia Vera Rubin supercomputer for US Energy Department
  • techhq.com: Dell to build Nvidia Vera Rubin supercomputer for US Energy Department
  • futurumgroup.com: Can Dell Challenge Public Cloud AI with Its Expanded AI Factory?
  • insidehpc.com: DOE Announces “Doudna” Dell-NVIDIA Supercomputer at NERSC
  • techxplore.com: US supercomputer named after Nobel laureate Jennifer Doudna to power AI and scientific research
  • AI Accelerator Institute: Building efficient data pipelines for AI and NLP applications in AWS
  • www.dremio.com: Using Dremio with Confluent’s TableFlow for Real-Time Apache Iceberg Analytics
  • www.marktechpost.com: Yandex Releases Yambda: The World’s Largest Event Dataset to Accelerate Recommender Systems

@insidehpc.com //
MiTAC Computing Technology and AMD are strengthening their partnership to deliver cutting-edge solutions for AI, HPC, cloud-native, and enterprise applications. MiTAC will showcase this collaboration at COMPUTEX 2025, highlighting their shared vision for scalable and energy-efficient technologies. The partnership, which began in 2002, leverages AMD EPYC processors and Instinct GPUs to meet the evolving demands of modern data centers. Rick Hwang, President of MiTAC Computing Technology, expressed enthusiasm about advancing server solutions powered by AMD's latest processors and GPUs, calling the collaboration key to unlocking new capabilities for the company's global customer base in AI and HPC infrastructure.

Specifically, MiTAC and AMD are developing next-generation server platforms. One notable product is an 8U server equipped with dual AMD EPYC 9005 Series processors and support for up to eight AMD Instinct MI325X GPUs, offering exceptional compute density and up to 6TB of DDR5-6400 memory, ideal for large-scale AI model training and scientific applications. They are also offering a 2U dual-socket GPU server that supports up to four dual-slot GPUs, with 24 DDR5-6400 RDIMM slots and tool-less NVMe storage carriers, delivering high-speed throughput and flexibility for deep learning and HPC environments.

Meanwhile, Nvidia is preparing to compete with Huawei in the Chinese AI chip market by releasing a budget-friendly AI chip. The strategy is driven by the need to stay relevant amid growing domestic competition while navigating export restrictions. The new chip, priced between $6,500 and $8,000, represents a significant cost reduction compared to the previously banned H20 model. The reduction involves trade-offs, such as building on Nvidia's RTX Pro 6000D with standard GDDR7 memory and forgoing Taiwan Semiconductor's advanced CoWoS packaging technology.

Recommended read:
References:
  • insideAI News: MiTAC Computing Technology Corporation, a server platform designer and manufacturer, will showcase its strategic collaboration with AMD at COMPUTEX 2025 (Booth M1110).
  • www.artificialintelligence-news.com: Nvidia is preparing to go head-to-head with Huawei to maintain its relevance in the booming AI chip market of China.
  • insidehpc.com: MiTAC Computing Technology Corporation, a server platform designer and manufacturer, will showcase its strategic collaboration with AMD at COMPUTEX 2025 (Booth M1110).

Stephen Warwick@tomshardware.com //
OpenAI is significantly expanding its AI infrastructure, with the launch of Stargate UAE marking the first international deployment of its Stargate AI platform. This expansion begins with a 1GW cluster in Abu Dhabi and is the first partnership under the OpenAI for Countries initiative, aimed at helping governments build sovereign AI capabilities. OpenAI says that coordination with the U.S. government was vital in making the expansion possible, highlighting the importance of democratic values, open markets, and trusted partnerships in this endeavor. The partnership includes reciprocal UAE investment into the U.S. Stargate infrastructure.

This ambitious project also promises new opportunities for developers. OpenAI describes the Responses API as its first truly agentic API, allowing developers to combine MCP servers, code interpreter, reasoning, web search, and retrieval-augmented generation (RAG) within a single API call. This unified approach is set to enable a new generation of AI agents, streamlining development and expanding the capabilities of AI applications.
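A single-call request of the kind described above might look roughly like the following. This is a hedged sketch, not OpenAI's canonical example: the tool type names and fields follow OpenAI's published Responses API documentation at the time of writing but may change, the model name and vector store ID are placeholders, and the client call itself is shown commented out since it requires an API key.

```python
# Build one Responses API request that combines several tools at once.
request = {
    "model": "gpt-4.1",                                    # assumed model name
    "input": "Summarize our design doc and check recent press coverage.",
    "tools": [
        {"type": "web_search_preview"},                    # web search
        {"type": "code_interpreter",
         "container": {"type": "auto"}},                   # sandboxed code
        {"type": "file_search",                            # RAG over files
         "vector_store_ids": ["vs_example"]},              # hypothetical ID
        {"type": "mcp",                                    # remote MCP server
         "server_label": "internal_docs",                  # hypothetical label
         "server_url": "https://example.com/mcp"},         # hypothetical URL
    ],
}

# With credentials configured, a single call would dispatch all of these:
# from openai import OpenAI
# response = OpenAI().responses.create(**request)

tool_types = [t["type"] for t in request["tools"]]
```

The point of the design is that orchestration lives server-side: the model decides which of the attached tools to invoke, so the developer issues one request instead of wiring up a multi-step agent loop.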

Recent details have emerged about Jony Ive and Sam Altman's collaboration on an AI device, codenamed "io," which OpenAI has acquired for $6.5 billion. The device is envisioned as a "central facet of using OpenAI," with Altman suggesting that subscribers to ChatGPT could receive new computers directly from the company. The aim is to create an AI "companion" that is entirely aware of a user’s surroundings and life, potentially evolving into a family of devices.

Recommended read:
References: