Custom inference chips could help drive down soaring costs — but Nvidia isn’t going anywhere soon
While AMD and Nvidia battle for AI accelerator supremacy, hyperscalers and other major players are increasingly turning to custom-designed silicon. Google has its Trillium TPU, Amazon has its Inferentia and Trainium, and Meta has its MTIA chips. Now, another custom chip is on the way — and it’s coming from what may be the most influential AI company of all.
OpenAI and Broadcom have inked a deal to deploy 10 gigawatts of AI accelerators. OpenAI will design the chips and systems, while Broadcom will handle development and deployment, according to a joint press release.
To be clear, it’s difficult to predict the compute needs of OpenAI’s infrastructure over the long term. The Broadcom agreement follows two other deals OpenAI has struck. The first is with Nvidia, which will provide OpenAI with 10 gigawatts’ worth of AI accelerators. The second is with AMD, which will provide an additional 6GW, starting with its Instinct MI450 accelerators, which will be among the first built on TSMC’s 2nm process.
These deals span multiple years, of course, and their full timelines haven’t been made public, beyond the fact that deployments will start next year. But the news does represent a broader trend in the industry as hyperscalers and major players seek to reduce their dependence on what could be considered the traditional AI accelerator providers.
The terms of the deal
So what will this deal actually entail? As mentioned, OpenAI will design the chips and systems, while Broadcom will develop and deploy them. In the press release, OpenAI said it was targeting an initial deployment in the second half of 2026, with the rollout slated to be complete by the end of 2029.
“Our collaboration with Broadcom will power breakthroughs in AI and bring the technology’s full potential closer to reality,” said OpenAI co-founder and President, Greg Brockman, in OpenAI’s press release. “By building our own chip, we can embed what we’ve learned from creating frontier models and products directly into the hardware, unlocking new levels of capability and intelligence.”
The choice of Broadcom makes sense, though. Unlike pure chipmakers, Broadcom has expertise in complete end-to-end system deployment, from ASIC development to Ethernet switching, NICs, and optical interconnects. Broadcom has already been a central partner to the likes of Google, Meta, and Amazon in building custom application-specific integrated circuits (ASICs) for AI acceleration.
But that’s about all we know about the deal so far. We don’t really know anything about the hardware itself, including whether or how the custom silicon will compete with hardware from AMD and Nvidia.
Inference at scale
The chips don’t necessarily need to compete with the likes of AMD and Nvidia, though. A custom inference accelerator opens the door to significantly lower hardware costs compared to high-end training GPUs like Nvidia’s H100 or AMD’s MI450. Those GPUs are built for flexibility, with oversized compute arrays and massive memory bandwidth, features that make them powerful but extremely expensive.
Inference silicon, by contrast, can strip out much of that overhead. It can use lower-precision math (such as INT8 or FP8), smaller on-chip caches, and simpler interconnects, all tailored to predictable inference workloads. That means lower power draw, reduced cooling requirements, and a smaller bill of materials per rack — translating to lower cost per token at the massive scale OpenAI operates.
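To make the lower-precision point concrete, here is a minimal Python sketch of symmetric INT8 quantization, one common way of squeezing floating-point weights into 8-bit integers. It is purely illustrative: nothing is known about how OpenAI’s chip will handle precision, and the function names and sample weights below are made up for this example.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Largest error introduced by rounding; it is at most half a step (scale / 2).
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Each INT8 value needs 1 byte, versus 4 bytes for a 32-bit float.
bytes_fp32 = 4 * len(weights)
bytes_int8 = 1 * len(q)
```

Each value takes a quarter of the storage of a 32-bit float, at the cost of a small rounding error. Less data to store and move is a large part of why lower-precision inference hardware can get by with less memory and less power.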
The huge demand for Nvidia and AMD GPUs will likely take a hit as more companies design their own silicon, but training is set to remain dominated by the likes of Nvidia for the foreseeable future. Inference is where the risk lies: if OpenAI’s custom accelerators can handle it at scale, Nvidia could lose some of the high-margin, high-volume workloads that make up the bulk of its data-center revenue.
In short, OpenAI’s trio of deals — with Nvidia, AMD, and now Broadcom — marks a new phase in AI infrastructure strategy. The GPU giants remain essential, but OpenAI is clearly working to rebalance power across its supply chain and bring more of its compute under its own control.