Somewhere between the GPU orders and the board-level mandates, the math stopped working. Enterprises worldwide are spending more on AI than they are recovering in documented savings, and the gap is widening every quarter. This is not a fringe take from AI skeptics. It is the conclusion quietly emerging from Goldman Sachs Research, Sequoia Capital, and MIT economists. The question is no longer whether AI can create value. It is whether the infrastructure bill arrives long before the revenue does.

The $600 billion question Sequoia raised

In mid-2024, Sequoia Capital partner David Cahn published an analysis that reframed the entire AI spending debate. He took NVIDIA's data center revenue run-rate, roughly $150 billion annually, doubled it to cover the full cost of the data centers those GPUs sit in (GPUs being about half of the total), and doubled it again to leave the buyers of that compute a 50% gross margin. The answer: approximately $600 billion per year in AI-generated end-user revenue is required to justify that level of infrastructure investment. The actual figure at the time was a fraction of that number.
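
The arithmetic behind the $600 billion figure can be sketched in a few lines. This is an illustration under stated assumptions, not Cahn's own model: the $150 billion run-rate comes from the text above, while the two multipliers (GPUs as roughly half of total data center cost, and a 50% gross margin for whoever buys the compute) reflect how his method is usually summarised. The function name is hypothetical.

```python
# Sketch of the revenue-gap arithmetic. The multipliers are assumptions.
GPU_REVENUE_RUN_RATE_BN = 150.0  # NVIDIA data center run-rate from the text, in $bn

# Assumption: GPUs are about half of total data center cost of ownership
# (the rest being energy, buildings, cooling and so on).
DATACENTER_COST_MULTIPLIER = 2.0

# Assumption: the buyer of that compute needs a 50% gross margin.
TARGET_GROSS_MARGIN = 0.5


def required_end_user_revenue(gpu_revenue_bn: float,
                              datacenter_multiplier: float,
                              gross_margin: float) -> float:
    """Return the end-user revenue (in $bn) needed to cover the build-out."""
    if not 0 <= gross_margin < 1:
        raise ValueError("gross_margin must be in [0, 1)")
    total_datacenter_cost = gpu_revenue_bn * datacenter_multiplier
    # Revenue R must satisfy R * (1 - margin) = cost, so R = cost / (1 - margin).
    return total_datacenter_cost / (1.0 - gross_margin)


required = required_end_user_revenue(
    GPU_REVENUE_RUN_RATE_BN, DATACENTER_COST_MULTIPLIER, TARGET_GROSS_MARGIN
)
print(f"Required end-user revenue: ${required:.0f}bn per year")
```

Changing either assumption moves the answer considerably, which is part of why the estimate is better read as an order of magnitude than as a forecast.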

Cahn was not predicting a collapse. He was identifying a structural gap. Someone in the chain — hyperscalers, enterprises, or VC-backed startups — is currently carrying costs that have not yet been matched by returns. As he put it: the only way to make the numbers work is to assume AI will be transformative at a scale never seen before. That may be true eventually. It is not the reality in most enterprise deployments today.

Goldman Sachs: too much spend, too little benefit

Goldman Sachs Research published a report with a title that did the work for them: "Gen AI: Too Much Spend, Too Little Benefit?" Jim Covello, Goldman's Head of Global Equity Research, made the central challenge explicit: AI is an extremely expensive technology being deployed to solve problems that did not require that level of cost to fix in the first place.

Covello's argument centres on task economics. The most valuable problems businesses face — strategic judgment, regulatory navigation, high-stakes relationship management — are the ones AI cannot yet reliably handle. The tasks AI does handle well — document summarisation, customer service deflection, code autocomplete — were already addressable with cheaper software. The cost-to-value ratio is inverted at both ends of the spectrum.

MIT economist Daron Acemoglu, interviewed in the same Goldman report, added structural context. Historical technology cycles — electrification, the internet — delivered broad productivity gains because they reduced the cost of doing things that were already valuable. Most current AI deployments do not meet that bar. They automate tasks that were already fast, or they attempt tasks they perform unreliably.

What the actual enterprise numbers show

The gap between AI spend and AI returns is not theoretical. It is showing up in operational data:

The arms race nobody can opt out of

Here is what makes this situation genuinely hard to navigate: the companies doing the spending know the ROI is uncertain, and they are spending anyway. That is a rational response to the competitive pressure they face, not recklessness.

As Goldman Sachs Asset Management portfolio managers noted after meeting with 20 leading technology executives: the hyperscalers doing these calculations are not reckless. They see incremental returns. But they are also in an arms race where being the fourth-best frontier model — or the enterprise that skipped infrastructure investment for two years — may be competitively fatal. The spend is partly a bet on the future and partly a defensive posture.

Goldman's Brook Dane described it plainly: "You can't fall off the front end of the wave. There's a bit of an arms race here, and there's a little bit of a leap of faith embedded in that." That leap of faith is being taken with capital budgets that are, in many cases, 10x what they were in 2021.

Why owning infrastructure becomes the only rational answer

The cost spiral has a predictable exit point: the companies that will survive it are the ones that stop renting compute and start owning it.

Every dollar spent on API tokens is a dollar that funds the infrastructure of your vendor — infrastructure that will be used to compete with you, or to raise prices once switching costs are high enough. The token price you pay today is subsidised. NVIDIA, Anthropic, and OpenAI are not running charitable operations; they are building the dependency first and monetising it second.

On-premise or private-cloud LLM infrastructure breaks this dynamic. The capital expenditure is front-loaded and visible. Once the hardware is paid for, the marginal cost of each inference falls toward the cost of the electricity it consumes. There is no vendor repricing risk, no data leaving your perimeter, and no structural dependency on a supplier that is simultaneously your competitor.
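
To make the front-loaded versus marginal distinction concrete, here is a minimal sketch that computes two numbers for a single self-hosted inference server: the electricity-only marginal cost per million tokens, and the fully amortised cost once the hardware purchase is spread over its write-off period. Every figure in the usage loop (hardware price, power draw, throughput, electricity rate) is a hypothetical placeholder, and the model deliberately ignores staff, hosting, cooling and networking costs.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # average hours in a calendar month


@dataclass
class SelfHostedNode:
    """Hypothetical description of one inference server; all fields are assumptions."""
    hardware_cost_usd: float        # upfront purchase price
    amortisation_months: int        # period over which the hardware is written off
    power_kw: float                 # average power draw
    electricity_usd_per_kwh: float  # electricity price
    tokens_per_second: float        # sustained throughput when busy
    utilisation: float              # fraction of time the node is busy, in (0, 1]


def marginal_cost_per_million_tokens(node: SelfHostedNode) -> float:
    """Electricity-only cost of generating one million tokens while busy."""
    tokens_per_hour = node.tokens_per_second * 3600
    cost_per_hour = node.power_kw * node.electricity_usd_per_kwh
    return cost_per_hour / tokens_per_hour * 1_000_000


def amortised_cost_per_million_tokens(node: SelfHostedNode) -> float:
    """Hardware write-off plus electricity, spread over the tokens actually served.

    Simplification: the node is assumed to draw power all month, busy or not.
    """
    if not 0 < node.utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    monthly_amortisation = node.hardware_cost_usd / node.amortisation_months
    monthly_power = node.power_kw * HOURS_PER_MONTH * node.electricity_usd_per_kwh
    monthly_tokens = (node.tokens_per_second * 3600 * HOURS_PER_MONTH
                      * node.utilisation)
    return (monthly_amortisation + monthly_power) / monthly_tokens * 1_000_000


# Hypothetical figures, chosen only to show how utilisation drives the result.
for utilisation in (0.1, 0.5, 0.9):
    node = SelfHostedNode(
        hardware_cost_usd=60_000, amortisation_months=36, power_kw=3.0,
        electricity_usd_per_kwh=0.15, tokens_per_second=1_500,
        utilisation=utilisation,
    )
    print(f"utilisation {utilisation:.0%}: "
          f"marginal ${marginal_cost_per_million_tokens(node):.3f}, "
          f"amortised ${amortised_cost_per_million_tokens(node):.2f} per million tokens")
```

The marginal figure stays the same at every utilisation level, while the amortised figure falls as the hardware is kept busier. That is why sustained volume matters so much to the ownership case.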

This is not an option available to every company today — the upfront investment is real, and the operational expertise required is non-trivial. But the trajectory is clear. As open-weight models continue to close the gap with frontier closed models, and as inference hardware becomes more accessible, building your own stack will shift from an enterprise luxury to a competitive necessity.

The timeline that matters

Goldman Sachs Asset Management's Sung Cho framed the ROI debate correctly: over one to two years, the returns may not justify the investment. Over twenty years, they almost certainly will. The problem is that most enterprise budgets operate on a one-to-two-year horizon, and most of the current AI spending is being evaluated against that shorter window.

The companies that will look prescient in 2030 are not the ones that spent the most on API credits in 2025. They are the ones that used 2025 and 2026 to build infrastructure they will own — training pipelines, fine-tuned models, private inference clusters — while their competitors accumulated recurring vendor bills with no equity in the underlying technology.

AI is not getting cheaper for the enterprises consuming it as a service. It is getting more expensive relative to the value they can extract, because the value extraction requires the kind of deep customisation and data integration that API products are structurally incapable of delivering. The companies that understand this early will not just reduce costs. They will build a moat that API-dependent competitors cannot cross.

What this means practically

How to start building your own AI infrastructure

The shift from API consumer to infrastructure owner does not require a hyperscaler budget. The path is incremental, and most enterprises can begin within a single quarter. Here is a practical starting sequence:

A realistic timeline: a mid-size enterprise with one internal technical hire (or an infrastructure partner) can have a private inference cluster running a fine-tuned model in production within six to eight weeks. The payback period on hardware versus equivalent API spend is typically under eighteen months for any team consuming more than 50 million tokens per month.
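
A payback estimate like the one above is easy to check against your own figures. The sketch below is a simplified model with hypothetical function names and placeholder inputs: it divides the upfront hardware cost by the monthly saving, meaning API spend avoided minus the running cost of the self-hosted system. The result is sensitive to the blended API price per million tokens and to monthly volume, so run it with real invoices rather than relying on the example values.

```python
import math


def monthly_api_spend(tokens_per_month: float, usd_per_million_tokens: float) -> float:
    """API bill for a given monthly volume at a blended per-million-token price."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens


def payback_months(hardware_cost_usd: float,
                   api_spend_usd_per_month: float,
                   operating_cost_usd_per_month: float) -> float:
    """Months until avoided API spend covers the upfront hardware cost.

    Returns math.inf when self-hosting costs as much or more per month
    than the API, because the hardware is then never paid back.
    """
    monthly_saving = api_spend_usd_per_month - operating_cost_usd_per_month
    if monthly_saving <= 0:
        return math.inf
    return hardware_cost_usd / monthly_saving


# Hypothetical inputs: replace every value with your own figures.
api_bill = monthly_api_spend(tokens_per_month=500_000_000, usd_per_million_tokens=15.0)
months = payback_months(
    hardware_cost_usd=60_000,
    api_spend_usd_per_month=api_bill,
    operating_cost_usd_per_month=2_500,  # power, hosting, maintenance (assumed)
)
print(f"API bill: ${api_bill:,.0f} per month, payback in {months:.1f} months")
```

With these placeholder inputs the API bill is $7,500 per month and the payback period is 12 months. Lower volumes or cheaper API pricing lengthen it quickly, and the function returns infinity when there is no monthly saving at all.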

Where to start if you do not have an internal AI team

The operational expertise gap is the most cited reason enterprises stay on APIs longer than their economics justify. Setting up GPU servers, managing model weights, handling inference optimisation, and maintaining uptime are genuinely non-trivial tasks if you are doing them for the first time.

This is precisely the gap that specialist AI infrastructure partners exist to close. Rather than hiring a full ML engineering team before you have validated the use case, the right starting point is a scoped engagement: an infrastructure partner deploys your first private model, documents the stack, trains your team, and hands over a system you own and operate independently.

The economics shift permanently once that first deployment is live. Every subsequent inference is free from vendor lock-in, priced at marginal electricity and hardware amortisation, and fully under your control.

The AI cost crisis is not a reason to stop investing in AI. It is a reason to invest differently — in infrastructure you own rather than services you rent, in use cases with hard ROI rather than pilots that look impressive in board decks, and in a roadmap that treats today's API spend as a bridge, not a destination.

Ready to stop renting compute?

Adelphos designs and deploys private AI infrastructure for enterprises — from your first fine-tuned model to a full sovereign stack. We handle the architecture, deployment, and handover so your team owns the outcome.

Talk to us about your stack →
