News from the AI & ML world

DeeperML - #aiinfrastructure

Joe DeLaere@NVIDIA Technical Blog //
NVIDIA has unveiled NVLink Fusion, a technology that expands the capabilities of its high-speed NVLink interconnect to custom CPUs and ASICs. This move allows customers to integrate non-NVIDIA CPUs or accelerators with NVIDIA's GPUs within their rack-scale setups, fostering the creation of heterogeneous computing environments tailored for diverse AI workloads. This technology opens up the possibility of designing semi-custom AI infrastructure with NVIDIA's NVLink ecosystem, allowing hyperscalers to leverage the innovations in NVLink, NVIDIA NVLink-C2C, NVIDIA Grace CPU, NVIDIA GPUs, NVIDIA Co-Packaged Optics networking, rack scale architecture, and NVIDIA Mission Control software.

NVLink Fusion enables users to deliver top performance scaling with semi-custom ASICs or CPUs. As hyperscalers are already deploying full NVIDIA rack solutions, this expansion caters to the increasing demand for specialized AI factories, where diverse accelerators work together at rack scale with maximal bandwidth and minimal latency to support the largest number of users in the most power-efficient way. The advantage of using NVLink for CPU-to-GPU communication is that it offers 14x the bandwidth of PCIe 5.0 (128 GB/s). The technology will be offered in two configurations: the first connects custom CPUs to NVIDIA GPUs, and the second pairs custom accelerators with NVIDIA CPUs.
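The bandwidth comparison above can be sanity-checked with quick arithmetic. The 1.8 TB/s result below is our own multiplication of the figures quoted in the text, not a number stated in the source:

```python
# PCIe 5.0 x16 offers roughly 128 GB/s of bandwidth, as cited above.
pcie5_gbps = 128

# NVIDIA quotes NVLink as 14x that figure for CPU-to-GPU traffic.
nvlink_gbps = 14 * pcie5_gbps

print(f"Implied NVLink bandwidth: {nvlink_gbps} GB/s (~{nvlink_gbps / 1000:.1f} TB/s)")
# Implied NVLink bandwidth: 1792 GB/s (~1.8 TB/s)
```

That ~1.8 TB/s figure lines up with the per-GPU bandwidth NVIDIA cites for fifth-generation NVLink.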

NVIDIA CEO Jensen Huang emphasized that AI is becoming a fundamental infrastructure, akin to the internet and electricity. He envisions an AI infrastructure industry worth trillions of dollars, powered by AI factories that produce valuable tokens. NVIDIA's approach involves expanding its ecosystem through partnerships and platforms like CUDA-X, which is used across a range of applications. NVLink Fusion is a crucial part of this vision, enabling the construction of semi-custom AI systems and solidifying NVIDIA's role at the center of AI development.

Recommended read:
References :
  • The Register - Software: Nvidia opens up speedy NVLink interconnect to custom CPUs, ASICs
  • www.techmeme.com: Nvidia unveils NVLink Fusion, letting customers use its NVLink to pair non-Nvidia CPUs or accelerators with Nvidia's products in their own rack-scale setups (Bloomberg)
  • NVIDIA Technical Blog: Integrating Semi-Custom Compute into Rack-Scale Architecture with NVIDIA NVLink Fusion
  • Tom's Hardware: Nvidia announces NVLink Fusion to allow custom CPUs and AI accelerators to work with its products. The program allows customers to use the company's key NVLink tech for their own custom rack-scale designs, pairing non-Nvidia CPUs or accelerators with Nvidia's products.

@blogs.nvidia.com //
NVIDIA's CEO, Jensen Huang, has presented a bold vision for the future of technology, forecasting that the artificial intelligence infrastructure industry will soon be worth trillions of dollars. Huang emphasized AI's transformative potential across all sectors globally during his Computex 2025 keynote in Taipei. He envisions AI becoming as essential as electricity and the internet, necessitating "AI factories" to produce valuable tokens by applying energy. These factories are not simply data centers but sophisticated environments that will drive innovation and growth.

NVIDIA is actively working to solidify its position as a leader in this burgeoning AI landscape. A key strategy involves expanding its research and development footprint, with plans to establish a new R&D center in Shanghai. This initiative, proposed during a meeting with Shanghai Mayor Gong Zheng, includes leasing additional office space to accommodate current staff and future expansion. The Shanghai center will focus on tailoring AI solutions for Chinese clients and contributing to global R&D efforts in areas such as chip design verification, product optimization, and autonomous driving technologies, with the Shanghai government expressing initial support for the project.

Furthermore, NVIDIA is collaborating with Foxconn and the Taiwan government to construct an AI factory supercomputer, equipped with 10,000 NVIDIA Blackwell GPUs. This AI factory will provide state-of-the-art infrastructure to researchers, startups, and industries, significantly expanding AI computing availability and fueling innovation within Taiwan's technology ecosystem. Huang highlighted the importance of Taiwan in the global technology ecosystem, noting that NVIDIA is helping build AI not only for the world but also for Taiwan, emphasizing the strategic partnerships and investments crucial for realizing the AI-powered future.

Recommended read:
References :
  • NVIDIA Blog: NVIDIA CEO Jensen Huang took the stage at a packed Taipei Music Center Monday to kick off COMPUTEX 2025, captivating the audience of more than 4,000 with a vision for a technology revolution that will sweep every country
  • TechNode: NVIDIA reportedly plans to establish research center in Shanghai
  • SiliconANGLE: At Computex, Nvidia debuts AI GPU compute marketplace, NVLink Fusion and the future of humanoid AI
  • AI News | VentureBeat: Foxconn builds AI factory in partnership with Taiwan and Nvidia

Pomi Lee@NVIDIA Technical Blog //
NVIDIA CEO Jensen Huang unveiled an ambitious vision for the future of AI at COMPUTEX 2025, declaring AI as the next major technology poised to transform every industry and country. He emphasized the need for "AI factories," describing them as specialized data centers that produce valuable "tokens" by applying energy. Huang highlighted NVIDIA's CUDA-X platform and its partnerships, showcasing how these are driving advancements in areas such as 6G development and quantum supercomputing. He stressed the importance of Taiwan in the global technology ecosystem.

NVIDIA is expanding its AI ecosystem by opening up its high-speed NVLink interconnect technology to custom CPUs and ASICs via NVLink Fusion. This move allows for greater integration of custom compute solutions into rack-scale architectures. The NVLink fabric, known for its high bandwidth, facilitates seamless communication between GPUs and CPUs, offering a significant advantage over PCIe 5.0. NVIDIA is allowing semi-custom accelerator designs to take advantage of the high-speed interconnect, even for non-NVIDIA-designed accelerators.

NVIDIA and Foxconn are partnering with the Taiwan government to construct an AI factory supercomputer, equipped with 10,000 NVIDIA Blackwell GPUs, to support local researchers and enterprises. This supercomputer, facilitated by Foxconn's Big Innovation Company, will provide AI cloud computing resources to the Taiwan technology ecosystem. This collaboration aims to accelerate AI development and adoption across various sectors, reinforcing Taiwan's position as a key player in the global AI landscape.

@Dataconomy //
Databricks has announced its acquisition of Neon, an open-source database startup specializing in serverless Postgres, in a deal reportedly valued at $1 billion. This strategic move is aimed at enhancing Databricks' AI infrastructure, specifically addressing the database bottleneck that often hampers the performance of AI agents. Neon's technology allows for the rapid creation and deployment of database instances, spinning up new databases in milliseconds, which is critical for the speed and scalability required by AI-driven applications. The integration of Neon's serverless Postgres architecture will enable Databricks to provide a more streamlined and efficient environment for building and running AI agents.

Databricks plans to incorporate Neon's scalable Postgres offering into its existing big data platform, eliminating the need to scale separate server and storage components in tandem when responding to AI workload spikes. This resolves a common issue in modern cloud architectures where users are forced to over-provision either compute or storage to meet the demands of the other. With Neon's serverless architecture, Databricks aims to provide instant provisioning, separation of compute and storage, and API-first management, enabling a more flexible and cost-effective solution for managing AI workloads. According to Databricks, Neon reports that 80% of its database instances are provisioned by software rather than humans.
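The programmatic, software-driven provisioning described above can be illustrated with a toy model: an agent asks for a fresh database endpoint and gets one back almost instantly, because storage is shared and only a lightweight compute endpoint is created. This is a sketch of the serverless-Postgres idea, not Neon's or Databricks' actual API; all names here are invented:

```python
import time
import uuid

class ServerlessPostgresPool:
    """Toy stand-in for a serverless Postgres control plane (hypothetical)."""

    def __init__(self):
        self.instances = {}

    def provision(self, owner: str) -> str:
        """Create a new logical database endpoint; no servers are sized or booted."""
        instance_id = f"db-{uuid.uuid4().hex[:8]}"
        self.instances[instance_id] = {"owner": owner, "created": time.time()}
        return instance_id

pool = ServerlessPostgresPool()
start = time.perf_counter()
db = pool.provision(owner="agent-42")  # an AI agent, not a human, requests a DB
elapsed_ms = (time.perf_counter() - start) * 1000
print(db, f"provisioned in {elapsed_ms:.2f} ms")
```

The point of the sketch is the shape of the interaction: because provisioning is an API call rather than a capacity-planning exercise, software (including AI agents) can create databases on demand, which is consistent with Neon's report that 80% of its instances are provisioned by software.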

The acquisition of Neon is expected to give Databricks a competitive edge, particularly against competitors like Snowflake. While Snowflake currently lacks similar AI-driven database provisioning capabilities, Databricks' integration of Neon's technology positions it as a leader in the next generation of AI application building. The combination of Databricks' existing data intelligence platform with Neon's serverless Postgres database will allow for the programmatic provisioning of databases in response to the needs of AI agents, overcoming the limitations of traditional, manually provisioned databases.

Recommended read:
References :
  • Databricks: Today, we are excited to announce that we have agreed to acquire Neon, a developer-first, serverless Postgres company.
  • www.infoworld.com: Databricks to acquire open-source database startup Neon to build the next wave of AI agents
  • www.bigdatawire.com: Databricks Nabs Neon to Solve AI Database Bottleneck
  • Dataconomy: Databricks has agreed to acquire Neon, an open-source database startup, for approximately $1 billion.
  • BigDATAwire: Databricks today announced its intent to buy Neon, a database startup founded by Nikita Shamgunov that develops a serverless and infinitely scalable version of the open source Postgres database.
  • Techzine Global: Neon’s technology can spin up a Postgres instance in less than 500 milliseconds, which is crucial for AI agents’ fast working methods.
  • AI News | VentureBeat: The $1 Billion database bet: What Databricks’ Neon acquisition means for your AI strategy
  • analyticsindiamag.com: Databricks to Acquire Database Startup Neon for $1 Billion

Evan Ackerman@IEEE Spectrum //
Amazon has unveiled Vulcan, an AI-powered robot with a sense of touch, designed for use in its fulfillment centers. This groundbreaking robot represents a "fundamental leap forward in robotics," according to Amazon's director of applied science, Aaron Parness. Vulcan is equipped with sensors that allow it to "feel" the objects it is handling, enabling capabilities previously unattainable for Amazon robots. This sense of touch allows Vulcan to manipulate objects with greater dexterity and avoid damaging them or other items nearby.

Vulcan operates using "end of arm tooling" that includes force feedback sensors. These sensors enable the robot to understand how hard it is pushing or holding an object, ensuring it remains below the damage threshold. Amazon says that Vulcan can easily manipulate objects to make room for whatever it’s stowing, because it knows when it makes contact and how much force it’s applying. Vulcan helps to bridge the gap between humans and robots, bringing greater dexterity to the devices.
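The force-feedback behavior described above can be sketched as a simple control loop: ramp grip force until contact is sensed, and never command more than a safety margin below the damage threshold. Amazon has not published Vulcan's control code; the structure and every number here are illustrative assumptions:

```python
# Hypothetical constants -- not Amazon's actual values.
DAMAGE_THRESHOLD_N = 15.0   # assumed max safe force, in newtons
CONTACT_FORCE_N = 0.5       # assumed sensor reading that indicates contact
STEP_N = 0.25               # force increment per control cycle

def close_gripper(read_force, max_force=DAMAGE_THRESHOLD_N * 0.6):
    """Ramp commanded force until the sensor reports firm contact, staying
    below a safety margin under the damage threshold."""
    commanded = 0.0
    while commanded < max_force:
        commanded += STEP_N
        if read_force(commanded) >= CONTACT_FORCE_N and commanded >= 2.0:
            return commanded  # firm contact achieved within the safe range
    return commanded

# Fake sensor: the "object" pushes back once we squeeze past 1.5 N.
measured = close_gripper(lambda f: max(0.0, f - 1.5))
print(f"Grip settled at {measured:.2f} N (limit {DAMAGE_THRESHOLD_N} N)")
```

The key idea, per Amazon's description, is that knowing *when* contact happens and *how much* force is applied lets the robot stop below the damage threshold rather than squeezing blindly.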

The introduction of Vulcan addresses a significant challenge in Amazon's fulfillment centers, where the company handles a vast number of stock-keeping units (SKUs). While robots already play a crucial role in completing 75% of Amazon orders, Vulcan fills a capability gap left by previous generations of robots. According to Amazon, one business per second is adopting AI, and Vulcan demonstrates the potential for AI and robotics to revolutionize warehouse operations. Amazon did not specify how many jobs the Vulcan model may create or displace.

Recommended read:
References :
  • betanews.com: Amazon unveils Vulcan, a package sorting, AI-powered robot with a sense of touch
  • IEEE Spectrum: Amazon’s Vulcan Robots Now Stow Items Faster Than Humans
  • www.linkedin.com: Amazon’s Vulcan Robots Are Mastering Picking Packages
  • BetaNews: Amazon has unveiled Vulcan, a package sorting, AI-powered robot with a sense of touch
  • techstrong.ai: Amazon’s Vulcan Has the ‘Touch’ to Handle Most Packages
  • eWEEK: Amazon’s Vulcan Robot with Sense of Touch: ‘Fundamental Leap Forward in Robotics’
  • Dataconomy: This Amazon robot has a sense of feel
  • The Register: Amazon touts Vulcan – its first robot with a sense of 'touch'

@blogs.microsoft.com //
Microsoft is aggressively pursuing an AI-first strategy, aiming to transform business operations for its customers. A key component of this initiative is the development and deployment of agentic AI solutions. According to CEO Satya Nadella, a significant portion of Microsoft's code, specifically 20% to 30%, is now generated by AI, showcasing the company's deep integration of AI into its core development processes. This AI-driven approach promises to accelerate innovation and enable businesses to achieve more through autonomous capabilities.

Microsoft has officially launched Recall AI for Windows 11, an AI-powered search feature that captures periodic screenshots of user activity. This feature is available on Copilot+ PCs through the April 2025 non-security preview update. Recall aims to provide AI-driven memory search. Addressing earlier privacy concerns, Microsoft has ensured that Recall is disabled by default, requires opt-in, and encrypts all data locally. Access to this data requires Windows Hello authentication, and users can delete snapshots or block specific apps and websites from being recorded.

To further solidify its commitment to AI, Microsoft is expanding its cloud and AI infrastructure in Europe as part of five digital commitments. The company is also focused on helping organizations modernize their technology stacks to leverage AI effectively. According to a 2024 Forrester study, continuous modernization, including the incorporation of generative AI, is critical for driving competitive advantage. By setting a strong cloud foundation and embracing continuous migration and modernization, businesses can unlock the full potential of AI and remain competitive in a rapidly evolving technological landscape.

Recommended read:
References :
  • blogs.microsoft.com: How agentic AI is driving AI-first business transformation for customers to achieve more
  • www.microsoft.com: Accelerate AI innovation and business transformation: Scaling AI transformation with strategic cloud partnership

Hassam@tomshardware.com //
Microsoft CEO Satya Nadella has revealed that Artificial Intelligence is playing an increasingly significant role in the company's software development. Speaking at Meta's LlamaCon conference, Nadella stated that AI now writes between 20% and 30% of the code in Microsoft's repositories and projects. This underscores the growing influence of AI in revolutionizing software creation, especially for repetitive and data-heavy tasks, leading to efficiency gains. Nadella mentioned that AI is showing more promise in generating Python code compared to C++, due to Python's simpler syntax and better memory management.

Microsoft's embrace of AI in coding aligns with similar trends observed at other tech giants like Google, where AI is reported to generate over 30% of new code. The use of AI in code generation also brings forth concerns about job displacement for new programmers. Despite these anxieties, industry experts highlight the importance of software developers adapting to and leveraging AI tools, rather than ignoring them. Nadella emphasized that while AI can produce code, senior developer oversight remains critical to ensure the stability and reliability of the production environment.

Beyond its internal use, Microsoft is also making strategic moves to expand its cloud and AI infrastructure in Europe. This commitment to the European market includes pledges to fight for its European customers in U.S. courts if necessary, highlighting the importance of trans-Atlantic ties and digital resilience. Microsoft is dedicated to ensuring open access to its AI and cloud platform across Europe, and will be enhancing its AI Access Principles in the coming months. Furthermore, Microsoft is releasing the 2025 Work Trend Index, designed to help leaders and employees navigate the shifting landscape brought about by AI.

Recommended read:
References :
  • news.microsoft.com: Microsoft Releases 2025 Work Trend Index: The Frontier Firm Emerges in Singapore
  • The Microsoft Cloud Blog: Accelerate AI innovation and business transformation: Scaling AI transformation with strategic cloud partnership
  • www.tomshardware.com: Satya Nadella revealed that AI writes as much as 20% to 30% of the code in Microsoft's repositories and projects.
  • TechCrunch: Microsoft CEO says up to 30% of the company’s code was written by AI.
  • Entrepreneur: AI is already writing about 30% of code at Microsoft, Google, and Meta.
  • PCWorld: Microsoft's CEO claims 30% of its new code is written by AI.
  • blogs.microsoft.com: Microsoft is announcing five digital commitments to Europe, starting with an expansion of our cloud and AI infrastructure in Europe.
  • CIO Dive - Latest News: Microsoft expands European footprint amid global trade tensions
  • PCMag Middle East ai: Microsoft Says Up to 30% of Its Code Now Written by AI, Meta Aims For 50% in 2026
  • SiliconANGLE: Satya Nadella says AI is now writing 30% of Microsoft’s code but real change is still many years away
  • The Register - Software: 30 percent of some Microsoft code now written by AI - especially the new stuff
  • siliconangle.com: Satya Nadella says AI is now writing 30% of Microsoft’s code but real change is still many years away
  • MarkTechPost: Microsoft AI Released Phi-4-Reasoning: A 14B Parameter Open-Weight Reasoning Model that Achieves Strong Performance on Complex Reasoning Tasks
  • Analytics Vidhya: Microsoft Launches Two Powerful Phi-4 Reasoning Models
  • www.windowscentral.com: Satya Nadella says AI already writes 30% of Microsoft's code — but Bill Gates claims software development is too complex to be fully automated
  • The Next Platform: AI Steady, Cloud Accelerating Gives Microsoft A Big Datacenter Boost

NVIDIA Newsroom@NVIDIA Blog //
Nvidia has announced a major initiative to manufacture its AI supercomputers entirely within the United States. The company aims to produce up to $500 billion worth of AI infrastructure in the U.S. over the next four years, partnering with major manufacturing firms like Taiwan Semiconductor Manufacturing Co (TSMC), Foxconn, Wistron, Amkor, and SPIL. This move marks the first time Nvidia will carry out chip packaging and supercomputer assembly entirely within the United States. The company sees this effort as a way to meet the increasing demand for AI chips, strengthen its supply chain, and boost resilience.

Nvidia is commissioning over a million square feet of manufacturing space to build and test Blackwell chips in Arizona and assemble AI supercomputers in Texas. Production of Blackwell chips has already begun at TSMC’s chip plants in Phoenix, Arizona. The company is also constructing supercomputer manufacturing plants in Texas, partnering with Foxconn in Houston and Wistron in Dallas, with mass production expected to ramp up within the next 12-15 months. These facilities are designed to support the deployment of "gigawatt AI factories", data centers specifically built for processing artificial intelligence.

CEO Jensen Huang emphasized the significance of bringing AI infrastructure manufacturing to the U.S., stating that "The engines of the world’s AI infrastructure are being built in the United States for the first time." Nvidia also plans to deploy its own technologies to optimize the design and operation of the new facilities, utilizing platforms like Omniverse to simulate factory operations and Isaac GR00T to develop automated robotics systems. The company said domestic production could help drive long-term economic growth and job creation.

Recommended read:
References :
  • Reid Burke: NVIDIA is working with its manufacturing partners to design and build factories that, for the first time, will produce NVIDIA AI supercomputers entirely in the U.S.
  • The Register - Software: Nvidia wants to build and sell up to half a trillion US dollars of American-made AI supercomputer equipment over the next four years, with the help of Taiwan Semiconductor Manufacturing Co, aka TSMC, and its partners.
  • TechInformed: Nvidia has announced plans to manufacture AI supercomputers in the United States for the first time.
  • AIwire: Nvidia Begins US Production of Blackwell Chips, AI Systems to Follow
  • www.tomshardware.com: Nvidia aims to build $500 billion worth of AI servers in the USA by 2029
  • www.techrepublic.com: NVIDIA’s Vision For AI Factories – ‘Major Trend in the Data Center World’
  • www.tomshardware.com: Made in the USA: Inside Nvidia's $500 billion server gambit
  • www.theguardian.com: Jensen Huang causes stir on social media and is reported to have met founder of AI company DeepSeek. The chief executive of the American chip maker Nvidia visited Beijing on Thursday, days after the US imposed restrictions on sales of the only AI chip it was still allowed to sell to China.

NVIDIA Newsroom@NVIDIA Blog //
Nvidia has announced plans to manufacture its AI supercomputers entirely within the United States, marking the first time the company will conduct chip packaging and supercomputer assembly domestically. The move, driven by increasing global demand for AI chips and the potential impact of tariffs, aims to establish a resilient supply chain and bolster the American AI ecosystem. Nvidia is partnering with major manufacturing firms including TSMC, Foxconn, and Wistron to construct and operate these facilities.

Mass production of Blackwell chips has already commenced at TSMC's Phoenix, Arizona plant. Nvidia is constructing supercomputer manufacturing plants in Texas, partnering with Foxconn in Houston and Wistron in Dallas. These facilities are expected to ramp up production within the next 12-15 months. More than a million square feet of manufacturing space has been commissioned to build and test NVIDIA Blackwell chips in Arizona and AI supercomputers in Texas.

The company anticipates producing up to $500 billion worth of AI infrastructure in the U.S. over the next four years through these partnerships. This includes designing and building "gigawatt AI factories" to produce NVIDIA AI supercomputers completely within the US. CEO Jensen Huang stated that American manufacturing will help meet the growing demand for AI chips and supercomputers, strengthen the supply chain and improve resiliency. The White House has lauded Nvidia's decision as "the Trump Effect in action".

Recommended read:
References :
  • Reid Burke: NVIDIA to Manufacture American-Made AI Supercomputers in US for First Time
  • insideAI News: NVIDIA today said it is working with manufacturing partners to design and build factories that will produce NVIDIA AI supercomputers — i.e., "AI factories" — entirely in the United States… NVIDIA said that within four years, it plans to produce up to half a trillion dollars worth of AI infrastructure in the U.S. through partnerships...
  • AIwire: Nvidia Begins US Production of Blackwell Chips, AI Systems to Follow
  • www.theguardian.com: Nvidia says it will build up to $500bn of US AI infrastructure as chip tariff looms
  • www.tomshardware.com: Nvidia aims to build $500 billion worth of AI servers in the USA by 2029
  • Analytics India Magazine: NVIDIA to Manufacture First American-Made AI Supercomputers
  • THE DECODER: Nvidia shifts AI production to US amid changing trade landscape
  • AI News | VentureBeat: Nvidia pledges to build its own factories in the U.S. for the first time to make AI supercomputers
  • www.cnbc.com: Nvidia to mass produce AI supercomputers in Texas as part of $500 billion U.S. push
  • NVIDIA Newsroom: Everywhere, All at Once: NVIDIA Drives the Next Phase of AI Growth
  • blogs.nvidia.com: NVIDIA to Manufacture American-Made AI Supercomputers in US for First Time
  • CIO Dive - Latest News: Nvidia pledges to invest up to $500B in US chip manufacturing

NVIDIA Newsroom@NVIDIA Blog //
Nvidia has announced plans to manufacture AI supercomputers entirely within the United States for the first time. The company is working with manufacturing partners to design and build factories that will produce NVIDIA AI supercomputers – or "AI factories" – on U.S. soil. This initiative includes a projected investment of up to $500 billion over the next four years and aims to establish a comprehensive domestic supply chain for AI infrastructure. Nvidia's partners include industry giants such as TSMC, Foxconn, Wistron, Amkor, and SPIL, deepening their ties with NVIDIA while expanding their global footprint and enhancing supply chain resilience.

The company has already commissioned over a million square feet of manufacturing space in Arizona and Texas to build and test NVIDIA Blackwell chips and assemble AI supercomputers. NVIDIA Blackwell chips have started production at TSMC’s chip plants in Phoenix, Arizona. NVIDIA is building supercomputer manufacturing plants in Texas, with Foxconn in Houston and with Wistron in Dallas. Mass production at both plants is expected to ramp up in the next 12-15 months. Amkor and SPIL will handle packaging and testing operations in Arizona. This move marks the first time Nvidia will be building AI supercomputers entirely in the US.

Jensen Huang, founder and CEO of NVIDIA, emphasized the strategic importance of this initiative. "The engines of the world's AI infrastructure are being built in the United States for the first time," Huang stated. "Adding American manufacturing helps us better meet the incredible and growing demand for AI chips and supercomputers, strengthens our supply chain, and boosts our resiliency." NVIDIA also intends to deploy its own technologies, such as Omniverse and Isaac GR00T, to optimize factory operations and automate manufacturing processes. Manufacturing NVIDIA AI chips and supercomputers for American AI factories is expected to create hundreds of thousands of jobs and drive trillions of dollars in economic security over the coming decades.

Recommended read:
References :
  • Reid Burke: NVIDIA to Manufacture American-Made AI Supercomputers in US for First Time
  • www.tomshardware.com: Nvidia aims to build $500 billion worth of AI servers in the USA by 2029
  • insidehpc.com: NVIDIA to Manufacture AI Supercomputers in U.S.
  • PCMag Middle East ai: To Avoid Tariffs, Nvidia to Build Supercomputer Factories in Texas
  • AI News | VentureBeat: Nvidia pledges to build its own factories in the U.S. for the first time to make AI supercomputers
  • Maginative: NVIDIA has announced plans to manufacture AI supercomputers entirely within the United States for the first time, committing up to half a trillion dollars to domestic production through partnerships with TSMC, Foxconn, and others.
  • insideAI News: NVIDIA today said it is working with manufacturing partners to design and build factories that will produce NVIDIA AI supercomputers — i.e., "AI factories" — entirely in the United States
  • Flipboard Tech Desk: Nvidia says it has commissioned more than a million square feet of manufacturing space to build and test AI chips in Arizona and Texas as part of an effort to move a portion of production to the U.S.
  • Ars Technica (Policy): Amid Trump tariff chaos, Nvidia launches AI chip production on US soil
  • www.cnbc.com: Nvidia to mass produce AI supercomputers in Texas as part of $500 billion U.S. push
  • analyticsindiamag.com: NVIDIA to Manufacture First American-Made AI Supercomputers
  • The Register - Software: Nvidia joins made-in-America party, hopes to flog $500B in homegrown AI supers by 2029
  • AIwire: Nvidia Begins US Production of Blackwell Chips, AI Systems to Follow
  • www.theguardian.com: The Guardian article on Nvidia's plan to build AI infrastructure in the US.
  • TechInformed: Nvidia has announced plans to manufacture AI supercomputers in the United States for the first time.
  • the-decoder.com: Nvidia shifts AI production to US amid changing trade landscape
  • NVIDIA Newsroom: Everywhere, All at Once: NVIDIA Drives the Next Phase of AI Growth
  • blogs.nvidia.com: NVIDIA to Manufacture American-Made AI Supercomputers in US for First Time
  • CIO Dive - Latest News: Article on Nvidia's investment of up to $500 billion in US chip manufacturing.
  • blogs.nvidia.com: Every company and country wants to grow and create economic opportunity — but they need virtually limitless intelligence to do so.
  • insidehpc.com: Nvidia's plans to manufacture AI supercomputers in the US is a move to avoid tariffs and strengthen its supply chain.
  • www.tomshardware.com: Nvidia's ambitious plan to establish a domestic AI supply chain in the US looks challenging but could reshape the AI industry.
  • techdator.net: NVIDIA Teams Up with U.S. Companies to Build AI Supercomputers at Home
  • www.techrepublic.com: During a keynote at Data Center World 2025, a NVIDIA exec spoke about AI factories, purpose-built for gigawatt-scale data centers.
  • techinformed.com: Nvidia to manufacture AI supercomputers in the US

@cloud.google.com //
Google Cloud is advancing its AI Hypercomputer with the introduction of Ironwood TPUs, the seventh generation of Tensor Processing Units, designed specifically for AI inference workloads. This integrated supercomputing system combines optimized hardware, open software, and flexible consumption models to deliver high intelligence per dollar for AI workloads. Google Cloud CEO Thomas Kurian highlights that AI has driven adoption of different parts of the platform, enabling companies to perform super-scaled training or inference of their own models. The AI Hypercomputer underpins nearly every AI workload running on Google Cloud, from Vertex AI to direct access for fine-grained control.

Advances in performance-optimized hardware are central to this innovation. Ironwood boasts 5x more peak compute capacity and 6x the high-bandwidth memory (HBM) capacity of the prior generation, Trillium. It comes in two configurations: 256 chips or 9,216 chips, with the larger pod delivering 42.5 exaFLOPS of compute. Moreover, Ironwood is twice as power-efficient as Trillium, offering significantly more value per watt. Alongside Ironwood, Google Cloud offers A4 and A4X VMs, featuring NVIDIA B200 and GB200 NVL72 GPUs, respectively. These advancements are supported by enhanced networking, including 400G Cloud Interconnect and Cross-Cloud Interconnect, providing up to 4x more bandwidth than the previous 100G offering.

The new Ironwood TPUs are purpose-built for the age of inference, reflecting the increasing focus on deploying AI models. Ironwood incorporates an enhanced SparseCore, which accelerates sparse operations common in ranking and retrieval-based workloads, improving both latency and power consumption. As AI workloads shift from training to inference, Ironwood's design meets the demands of low-latency and high-throughput performance. This new TPU is integrated into Google's AI Hypercomputer, offering developers access through optimized stacks across PyTorch and JAX.

Recommended read:
References :
  • Compute: Introducing Ironwood TPUs and new innovations in AI Hypercomputer
  • www.marktechpost.com: Google AI Introduces Ironwood: A Google TPU Purpose-Built for the Age of Inference
  • BigDATAwire: Google Cloud Preps for Agentic AI Era with ‘Ironwood’ TPU, New Models and Software
  • cloud.google.com: Today's innovation isn't born in a lab or at a drafting board; it's built on the bedrock of AI infrastructure.

staff@insideAI News //
Google Cloud has unveiled its seventh-generation Tensor Processing Unit (TPU), named Ironwood, at the recent Google Cloud Next 2025 conference. This new custom AI accelerator is designed specifically for inference workloads, marking a shift in Google's AI chip development strategy. Ironwood aims to meet the growing demands of "thinking models" like Gemini 2.5, addressing the industry-wide shift from model training to inference. According to Amin Vahdat, Google's Vice President and General Manager of ML, Systems, and Cloud AI, the aim is to usher in the "age of inference," in which AI agents proactively retrieve and generate data to deliver insights.

Ironwood's technical specifications are impressive, offering substantial computational power and efficiency. When scaled to a pod of 9,216 chips, it can deliver 42.5 exaflops of compute, more than 24 times the figure reported for the world's fastest supercomputer, El Capitan (though the two numbers are measured at different numerical precisions). Each individual Ironwood chip delivers a peak compute of 4,614 teraflops. To manage the communication demands of modern AI, each pod features Inter-Chip Interconnect (ICI) networking, with the full-scale system drawing nearly 10 MW of power, and each chip is equipped with 192GB of High Bandwidth Memory (HBM) and memory bandwidth reaching 7.2 terabytes per second.
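The pod-scale arithmetic is easy to verify from the per-chip figure quoted above. A minimal sanity check, assuming El Capitan's ~1.74 exaFLOPS (an FP64 Linpack result, so the "24x" comparison mixes precisions):

```python
# Sanity-check the pod-scale figures quoted above.
per_chip_tflops = 4_614            # peak TFLOPS per Ironwood chip
chips_per_pod = 9_216
pod_exaflops = per_chip_tflops * chips_per_pod / 1_000_000  # TFLOPS -> EFLOPS
print(f"{pod_exaflops:.1f} exaFLOPS")  # 42.5

# Assumption: El Capitan at ~1.74 EFLOPS FP64 (Top500 figure, not from this article).
el_capitan_ef = 1.74
print(f"{pod_exaflops / el_capitan_ef:.0f}x El Capitan")  # ~24x
```

The 4,614 TFLOPS x 9,216 chips product reproduces the quoted 42.5 exaFLOPS, and dividing by the assumed El Capitan figure gives roughly the 24x claim.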

This focus on inference is a response to the evolving AI landscape where proactive AI agents are becoming more prevalent. Ironwood is engineered to minimize data movement and latency on-chip while executing massive tensor manipulations, crucial for handling large language models and advanced reasoning tasks. Google emphasizes that Ironwood offers twice the performance per watt compared to its predecessor, Trillium, and is nearly 30 times more power efficient than Google’s first Cloud TPU from 2018, addressing the critical need for power efficiency in modern data centers.

Recommended read:
References :
  • insideAI News: Google Launches ‘Ironwood’ 7th Gen TPU for Inference
  • venturebeat.com: Google's new Ironwood chip is 24x more powerful than the world’s fastest supercomputer
  • www.bigdatawire.com: Google Cloud Preps for Agentic AI Era with ‘Ironwood’ TPU, New Models and Software
  • The Next Platform: With “Ironwood” TPU, Google Pushes The AI Accelerator To The Floor
  • www.itpro.com: Google Cloud Next 2025: Targeting easy AI

staff@insideAI News //
Google Cloud has unveiled its seventh-generation Tensor Processing Unit (TPU), named Ironwood. This custom AI accelerator is purpose-built for inference, marking a shift in Google's AI chip development strategy. While previous TPUs handled both training and inference, Ironwood is designed to optimize the deployment of trained AI models for making predictions and generating responses. According to Google, Ironwood will allow for a new "age of inference" where AI agents proactively retrieve and generate data, delivering insights and answers rather than just raw data.

Ironwood boasts impressive technical specifications. When scaled to 9,216 chips per pod, it delivers 42.5 exaflops of computing power. Each chip has a peak compute of 4,614 teraflops, accompanied by 192GB of High Bandwidth Memory. The memory bandwidth reaches 7.2 terabytes per second per chip. Google highlights that Ironwood delivers twice the performance per watt compared to its predecessor and is nearly 30 times more power-efficient than Google's first Cloud TPU from 2018.

The focus on inference highlights a pivotal shift in the AI landscape. The industry has seen extensive development of large foundation models, and Ironwood is designed to manage the computational demands of these complex "thinking models," including large language models and Mixture of Experts (MoEs). Its architecture includes a low-latency, high-bandwidth Inter-Chip Interconnect (ICI) network to support coordinated communication at full TPU pod scale. The new TPU scales up to 9,216 liquid-cooled chips. This innovation is aimed at applications requiring real-time processing and predictions, and promises higher intelligence at lower costs.

Recommended read:
References :
  • insidehpc.com: Google Cloud today introduced its seventh-generation Tensor Processing Unit, "Ironwood," which the company said is its most performant and scalable custom AI accelerator and the first designed specifically for inference.
  • www.bigdatawire.com: Google Cloud Preps for Agentic AI Era with ‘Ironwood’ TPU, New Models and Software
  • www.nextplatform.com: With “Ironwood” TPU, Google Pushes The AI Accelerator To The Floor
  • insideAI News: Google today introduced its seventh-generation Tensor Processing Unit, “Ironwood,” which the company said is its most performant and scalable custom AI accelerator and the first designed specifically for inference.
  • venturebeat.com: Google's new Ironwood chip is 24x more powerful than the world's fastest supercomputer.
  • BigDATAwire: Google Cloud Preps for Agentic AI Era with ‘Ironwood’ TPU, New Models and Software
  • the-decoder.com: Google unveils new AI models, infrastructure, and agent protocol at Cloud Next
  • AI News | VentureBeat: Google’s new Agent Development Kit lets enterprises rapidly prototype and deploy AI agents without recoding
  • Compute: Introducing Ironwood TPUs and new innovations in AI Hypercomputer
  • The Next Platform: With “Ironwood” TPU, Google Pushes The AI Accelerator To The Floor
  • Ken Yeung: Google Pushes Agent Interoperability With New Dev Kit and Agent2Agent Standard
  • The Tech Basic: Details Google Cloud's New AI Chip.
  • venturebeat.com: Google unveils Ironwood TPUs, Gemini 2.5 "thinking models," and Agent2Agent protocol at Cloud Next '25, challenging Microsoft and Amazon with a comprehensive AI strategy that enables multiple AI systems to work together across platforms.
  • www.marktechpost.com: Google AI Introduces Ironwood: A Google TPU Purpose-Built for the Age of Inference
  • cloud.google.com: Introducing Ironwood TPUs and new innovations in AI Hypercomputer
  • Kyle Wiggers: Ironwood is Google’s newest AI accelerator chip

staff@insideAI News //
AMD has announced its acquisition of ZT Systems for $4.9 billion, marking a significant move into the AI infrastructure arena. The deal was first disclosed last August. AMD anticipates that the transaction will be accretive on a non-GAAP basis by the end of 2025, and the acquisition fits a broader trend of chip makers integrating toward full AI-factory offerings.

AMD's acquisition of ZT Systems will enable “a new class of end-to-end AI solutions based on the combination of AMD CPU, GPU and networking silicon, open-source AMD ROCm software and rack-scale systems capabilities.” Forrest Norrod, EVP/GM of AMD's Data Center Solutions business unit, stated that reducing the design and deployment time of cluster-level data center AI systems will be a significant competitive advantage. In addition, prior to this announcement, AMD secured a 30,000 GPU cluster deal with Oracle.

staff@insideAI News //
Fluidstack announced on March 25, 2025, its collaboration with Borealis Data Center, Dell Technologies, and NVIDIA to deploy and manage exascale GPU clusters across Iceland and Europe. Fluidstack aims to support AI labs, researchers, and enterprises by rapidly deploying high-density GPU supercomputers powered by 100% renewable energy. Borealis Data Center will provide facilities powered by renewable energy in Iceland and the Nordics, leveraging the region's cold climate and geothermal power. Dell PowerEdge XE9680 servers, optimized for AI workloads with NVIDIA HGX H200 and Quantum-2 InfiniBand networking, will be utilized to ensure performance and reliability.

Reports indicate that China's AI data center boom has lost momentum, leaving billions of dollars in idle infrastructure. Triggered by the rise of generative AI applications, China rapidly expanded its AI infrastructure in 2023-2024, constructing hundreds of new data centers with state and private funding. However, many facilities are now underused, returns are falling, and the market for GPU rentals has collapsed. Some data centers became outdated before they were fully operational due to changing market conditions and poor planning.

Recommended read:
References :
  • insideAI News: Fluidstack to Deploy Exascale GPU Clusters in Europe with NVIDIA, Borealis Data Center and Dell
  • insidehpc.com: Fluidstack to Deploy Exascale GPU Clusters in Europe with NVIDIA, Borealis Data Center and Dell
  • www.tomshardware.com: China's AI data center boom goes bust: Rush leaves billions of dollars in idle infrastructure

Jaime Hampton@AIwire //
China's multi-billion-dollar AI infrastructure boom is now facing a significant downturn, according to a new report. The rush to build AI datacenters, fueled by the rise of generative AI and encouraged by government incentives, has resulted in billions of dollars in idle infrastructure. Many newly built facilities are now sitting empty, with some reports indicating that up to 80% of China’s new computing resources remain unused.

The "DeepSeek Effect" is a major factor in this reversal. DeepSeek's AI models, particularly DeepSeek-V3, have demonstrated impressive training efficiency, reducing the demand for large-scale datacenter deployments. Smaller players are abandoning plans to pretrain large models because DeepSeek's open-source models match ChatGPT-level performance at a fraction of the cost, leading to a collapse in demand for training infrastructure just as new facilities were ready to come online.

Recommended read:
References :
  • AIwire: Report: China’s Race to Build AI Datacenters Has Hit a Wall
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide
  • Sify: DeepSeek’s AI Revolution: Creating an Entire AI Ecosystem
  • www.tomshardware.com: China's AI data center boom goes bust: Rush leaves billions of dollars in idle infrastructure