
OpenAI Has Four Chip Vendors. You Probably Have One.

February 21, 2026 by Asif Waliuddin

NVIDIA · OpenAI · Semiconductors

The $10 billion Cerebras deal got covered as a spending story. "OpenAI buys more compute." That framing is wrong, and it obscures the most important infrastructure decision any AI lab has made this cycle.

OpenAI now has chip partnerships with four separate vendors: Nvidia, AMD, Broadcom, and Cerebras. Four architecturally distinct silicon providers for a single AI system. This is not supply chain diversification. It is an architectural statement about how inference and training workloads should be matched to hardware.

The Hype

The coverage of OpenAI's compute deals follows a predictable script: "AI company raises/spends enormous sum, demonstrating the scale of AI ambition." The Cerebras deal was reported through that lens. Ten billion dollars. 750 megawatts of capacity through 2028. Impressive numbers. Moving on.

The same pattern applied to every prior compute announcement. Nvidia partnership? Scale. AMD deal? Diversification from Nvidia. Broadcom? Custom chip ambitions. Each deal got its own news cycle, its own analysis, and its own framing as an isolated procurement decision.

Nobody synthesized the four into a single architecture story.

The Reality

OpenAI is building a tiered compute stack where different workloads route to different hardware based on their computational profile. This is not speculation -- it is the logical conclusion of their vendor choices.

Cerebras makes wafer-scale engines. Their architecture is fundamentally different from GPU clusters: a single massive chip with no interconnect bottleneck. That design is purpose-built for low-latency inference, not training. When you need to serve 200 million weekly active users with sub-200ms response times, the interconnect overhead of GPU clusters becomes a real engineering constraint. Cerebras solves that by eliminating it. The $10 billion deal and the 750MW capacity allocation amount to OpenAI saying: our inference tier is a separate hardware category from our training tier.

Nvidia remains the training workhorse. The CUDA ecosystem, the networking stack (NVLink, InfiniBand), and the sheer installed base make Nvidia GPUs the default for large-scale training runs. Nobody is training GPT-5 on Cerebras wafer-scale engines. That is not what they are for.

AMD provides cost-efficient scale for workloads where Nvidia's premium pricing does not justify the performance delta. Batch inference, fine-tuning, and mid-tier compute tasks where the ROI math favors cheaper silicon over peak performance.

Broadcom signals custom silicon ambitions -- ASICs designed for specific workload patterns that general-purpose GPUs handle inefficiently.

Put the four together and you get a tiered architecture:

  • Training tier: Nvidia (performance-optimized, CUDA ecosystem)
  • Low-latency inference tier: Cerebras (wafer-scale, no interconnect bottleneck)
  • Cost-optimized inference tier: AMD (price-performance for batch and mid-tier tasks)
  • Specialized workload tier: Broadcom custom silicon (workload-specific ASICs)

This is not four ways of buying the same thing. It is four categories of compute matched to four categories of work.
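The tiered routing logic described above can be sketched in code. This is a minimal illustration, not anything OpenAI has published: the `Workload` fields, the 200ms threshold, and the tier-to-vendor mapping are all assumptions drawn from the article's framing.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Tier(Enum):
    TRAINING = auto()        # Nvidia: CUDA ecosystem, NVLink/InfiniBand
    LOW_LATENCY = auto()     # Cerebras: wafer-scale, no interconnect bottleneck
    COST_OPTIMIZED = auto()  # AMD: price-performance for batch and mid-tier work
    SPECIALIZED = auto()     # Broadcom: workload-specific ASICs

@dataclass
class Workload:
    is_training: bool
    latency_budget_ms: float  # target end-to-end response time
    asic_friendly: bool       # fits a pattern worth custom silicon

def route(w: Workload) -> Tier:
    """Map a workload to a hardware tier by its computational profile."""
    if w.is_training:
        return Tier.TRAINING
    if w.asic_friendly:
        return Tier.SPECIALIZED
    if w.latency_budget_ms < 200:  # interactive, user-facing serving
        return Tier.LOW_LATENCY
    return Tier.COST_OPTIMIZED     # batch inference, fine-tuning, mid-tier
```

An interactive chat request (`is_training=False, latency_budget_ms=150, asic_friendly=False`) routes to the Cerebras-style low-latency tier, while the same model running batch evaluation with a relaxed latency budget lands on the cost-optimized tier: same model, different hardware.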

Why This Matters for Your Stack

OpenAI is projecting $20 billion in annual recurring revenue by 2026. At that revenue scale, inference reliability and cost are existential. Every percentage point of inference cost reduction at 200 million users translates to hundreds of millions in margin. Every millisecond of latency improvement at scale affects user retention.

The four-vendor strategy is the infrastructure answer to that economic reality. Training is a periodic, batch-mode cost. Inference is a continuous, revenue-correlated cost. They require different optimization strategies and, increasingly, different hardware.
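The margin arithmetic is easy to make concrete. The revenue figure below comes from the article; the inference cost share is a pure assumption for illustration, not a reported number.

```python
# Illustrative only: inference_cost_share is an assumed figure,
# not anything OpenAI has disclosed.
annual_revenue = 20e9          # projected ARR cited above
inference_cost_share = 0.40    # ASSUMPTION: fraction of revenue spent on inference
inference_cost = annual_revenue * inference_cost_share  # $8B under this assumption

# Each percentage point of inference cost reduction:
savings_per_point = inference_cost * 0.01
print(f"${savings_per_point / 1e6:.0f}M saved per percentage point")
```

Under these assumed numbers, a single percentage point of inference efficiency is worth $80M a year, which is why routing inference to purpose-built hardware is a board-level decision rather than an ops detail.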

Here is the question that should make technical leaders uncomfortable: if OpenAI -- with the best AI infrastructure team on the planet -- concluded that a single GPU vendor cannot optimally serve all their compute workloads, why are you running everything on one hardware type?

Most organizations running AI inference at any meaningful scale are still in a "Nvidia for everything" configuration. That made sense when inference volume was low and the engineering overhead of multi-vendor stacks outweighed the efficiency gains. It stops making sense when inference becomes your primary compute cost.

The transition looks like this:

  1. Single vendor, single tier: All workloads on Nvidia GPUs. Where most teams are today.
  2. Single vendor, multiple tiers: Different Nvidia GPU classes for different workloads (H100 for training, L40 for inference). A reasonable intermediate step.
  3. Multi-vendor, tiered: Purpose-matched hardware for each workload category. Where OpenAI is going.

The jump from stage 2 to stage 3 requires engineering investment in hardware abstraction, workload routing, and multi-vendor orchestration. That investment is only justified at sufficient scale. But "sufficient scale" is arriving faster than most teams expect as inference volumes grow.
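The hardware-abstraction investment that stage 3 requires can be sketched as an interface plus a dispatcher. The class and method names here are hypothetical; real systems would sit on top of vendor runtimes (CUDA, ROCm, Cerebras SDK), but the shape of the abstraction is the point.

```python
from abc import ABC, abstractmethod

class Backend(ABC):
    """Vendor-agnostic execution interface (names are illustrative)."""
    @abstractmethod
    def run(self, request: dict) -> dict: ...

class NvidiaBackend(Backend):
    def run(self, request: dict) -> dict:
        # Real code would call into the CUDA-based serving stack here.
        return {"backend": "nvidia", **request}

class CerebrasBackend(Backend):
    def run(self, request: dict) -> dict:
        # Real code would call into the wafer-scale inference runtime here.
        return {"backend": "cerebras", **request}

class Orchestrator:
    """Routes requests by tier name; callers never see vendor details."""
    def __init__(self) -> None:
        self.backends: dict[str, Backend] = {}

    def register(self, tier: str, backend: Backend) -> None:
        self.backends[tier] = backend

    def dispatch(self, tier: str, request: dict) -> dict:
        return self.backends[tier].run(request)
```

The design choice that matters: application code calls `dispatch("low_latency", ...)` and never imports a vendor SDK, so swapping or adding silicon behind a tier is a registration change, not a rewrite. That decoupling is the engineering cost of stage 3, and it is what makes a four-vendor stack operable.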

What This Means

The four-chip strategy tells you three things about where AI infrastructure is heading:

First, inference and training are diverging as hardware categories. The assumption that the same GPU serves both is an artifact of low inference volume, not an architectural truth. As inference scales, the hardware requirements diverge.

Second, Nvidia's dominance is real but no longer sufficient at the frontier. Being the best training GPU does not automatically make you the best inference chip, the best cost-optimized option, or the best fit for specialized workloads. The market is segmenting.

Third, compute architecture is becoming a competitive advantage, not just a cost center. The team that matches workloads to hardware most efficiently will serve users faster and cheaper than the team that treats all compute as interchangeable. That gap compounds.

The Bottom Line

The Cerebras deal was not a spending story. It was an architecture story. OpenAI has concluded that serving AI at commercial scale requires purpose-built hardware for distinct workload categories, not more of the same GPU. Four vendors, four tiers, four optimization targets.

If you are running an AI inference workload of any meaningful scale and have not yet asked "should different parts of my compute stack be on different hardware?" -- OpenAI just answered that question for you.

The answer is yes.