Enterprise Adoption Is Slower and Stickier Than the Demos Suggest
Why Memory Bandwidth Is the New Constraint
The headline number everyone fixates on is training compute, but the margin story is increasingly about inference. As model providers push cheaper, faster variants, the cost of serving a query has collapsed by roughly an order of magnitude in eighteen months, and that changes which products are viable to build on top of these models.
When a lab ships a new frontier model, the interesting question is rarely whether the benchmark went up. It is whether the price-performance curve shifted enough to unlock a category of application that was previously uneconomical. Watch the pricing page, not the leaderboard.
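To make the "pricing page, not the leaderboard" point concrete, here is a rough sketch of the per-query arithmetic. Every number in it is an assumption chosen for illustration: the token counts, the per-million-token prices, and the one-cent revenue figure are hypothetical and are not taken from any provider's price list.

```python
# Illustrative unit economics; every price and token count below is a made-up assumption.

def cost_per_query(input_tokens: int, output_tokens: int,
                   input_price_per_m: float, output_price_per_m: float) -> float:
    """Serving cost in dollars for one query, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


def gross_margin(revenue_per_query: float, cost: float) -> float:
    """Fraction of per-query revenue left after paying for inference."""
    return (revenue_per_query - cost) / revenue_per_query


# Hypothetical product: 2,000 input tokens and 500 output tokens per query,
# monetized at one cent per query.
REVENUE = 0.01
old_cost = cost_per_query(2_000, 500, input_price_per_m=10.0, output_price_per_m=30.0)
new_cost = cost_per_query(2_000, 500, input_price_per_m=1.0, output_price_per_m=3.0)

print(f"old: cost ${old_cost:.4f}, margin {gross_margin(REVENUE, old_cost):.0%}")
print(f"new: cost ${new_cost:.4f}, margin {gross_margin(REVENUE, new_cost):.0%}")
```

Under these assumed inputs, a tenfold price cut moves the same product from losing money on every query to a gross margin of about 65%. The benchmark score appears nowhere in the calculation, which is the point.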
The bull case for the capex supercycle rests on durable demand and expanding use cases; the bear case rests on the possibility that a great deal of this spending is defensive, undertaken because no incumbent can afford to be the one that under-invested. Both can be true at once, and the timing of the reckoning is the whole game.
Reading an AI earnings call is an exercise in separating booked revenue from backlog from ambition. Signed contracts and committed capacity are real; framework agreements and letters of intent are options on the future. The market routinely conflates the two, and that is where the mispricings live.
Power, not silicon, is emerging as the binding constraint on datacenter expansion. Grid interconnection queues stretch years out in the key regions, and the operators who locked in generation capacity early now hold a structural advantage that will show up in gross margins long before it shows up in the narrative.
Open-weight models keep closing the distance to the closed frontier, and each release compresses the premium that proprietary providers can charge. That does not erase the moat — the frontier still leads on the hardest tasks — but it caps how much of the market the leaders can defend at the low and middle tiers.
Advanced packaging is the constraint hiding behind the constraint. Even when a foundry can print the logic, the number of chips that ship is gated by the capacity to bond memory stacks onto the compute die — and that line is booked out quarters in advance. The packaging vendors quietly set the ceiling on how fast supply can grow.
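One way to picture the gating effect is as a minimum over the inputs to a finished unit. The sketch below is a deliberate simplification: the function name, the three-input model, and all the quantities are hypothetical, and real supply chains have yield losses and lead times that it ignores.

```python
def shippable_accelerators(logic_dies: int, packaging_slots: int,
                           hbm_stacks: int, stacks_per_package: int) -> int:
    """Finished units are capped by the scarcest input, not the most visible one."""
    # Each finished package consumes a fixed number of memory stacks.
    packages_from_memory = hbm_stacks // stacks_per_package
    return min(logic_dies, packaging_slots, packages_from_memory)


# Hypothetical quarter: plenty of logic dies, limited packaging capacity.
baseline = shippable_accelerators(100_000, 60_000, 560_000, stacks_per_package=8)

# Printing more logic changes nothing while packaging is the binding limit.
more_logic = shippable_accelerators(150_000, 60_000, 560_000, stacks_per_package=8)

# Adding packaging capacity helps, but only until memory supply binds instead.
more_packaging = shippable_accelerators(100_000, 80_000, 560_000, stacks_per_package=8)

print(baseline, more_logic, more_packaging)
```

In this toy setup, relieving the packaging limit immediately exposes the memory limit sitting behind it, which is why it pays to track each stage of the chain rather than only the foundry.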
Memory bandwidth has quietly become the constraint that dictates real-world throughput. You can stack more accelerators, but if the model cannot be fed fast enough, the extra compute sits idle. This is why the high-bandwidth memory roadmap is worth tracking as closely as the flagship chip roadmap.
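The standard back-of-envelope tool for this is the roofline model: attainable throughput is the lower of peak compute and memory bandwidth multiplied by arithmetic intensity (operations performed per byte moved). The sketch below applies it to a single accelerator. The peak, bandwidth, and intensity figures are hypothetical round numbers, not the specifications of any real chip or model.

```python
def attainable_tflops(peak_tflops: float, bandwidth_tb_s: float,
                      flops_per_byte: float) -> float:
    """Roofline bound: capped by compute or by memory traffic, whichever is lower.

    TB/s multiplied by FLOP/byte gives TFLOP/s, so the units line up.
    """
    return min(peak_tflops, bandwidth_tb_s * flops_per_byte)


def utilization(peak_tflops: float, bandwidth_tb_s: float,
                flops_per_byte: float) -> float:
    """Share of peak compute that can actually be kept busy."""
    return attainable_tflops(peak_tflops, bandwidth_tb_s, flops_per_byte) / peak_tflops


# Hypothetical accelerator: 1,000 TFLOP/s peak, 3 TB/s of memory bandwidth.
# Hypothetical workload: low-intensity inference at 2 FLOP per byte moved.
PEAK, BANDWIDTH, INTENSITY = 1000.0, 3.0, 2.0

base = attainable_tflops(PEAK, BANDWIDTH, INTENSITY)
double_compute = attainable_tflops(2 * PEAK, BANDWIDTH, INTENSITY)
double_bandwidth = attainable_tflops(PEAK, 2 * BANDWIDTH, INTENSITY)

print(base, double_compute, double_bandwidth)
print(f"utilization: {utilization(PEAK, BANDWIDTH, INTENSITY):.1%}")
```

With these assumed inputs, doubling peak compute leaves attainable throughput unchanged, while doubling bandwidth doubles it. A workload with much higher arithmetic intensity would flip the result and become compute-bound, so the conclusion depends on the workload as much as on the hardware.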
Sovereign and regional buildouts are becoming a demand source in their own right, driven less by economics than by the desire not to depend on someone else's cloud. That demand is price-insensitive and politically durable, which makes it a floor under accelerator orders that a pure ROI model would miss entirely.