The Deflation Engine Inside the AI Bubble

[PUBLIC]LOG-006 — 2026.06.15

Capability-adjusted inference is collapsing 10-50x a year while the frontier and the total bill inflate. The bubble is real; the cost narrative everyone repeats is exactly backwards. A research-log on lab economics, depreciation, and the code nobody is left to maintain.

▸ section1.log[CLASSIFIED]

THE_COST_STORY_BACKWARDS

Start with the claim that quietly poisons every dinner-table take on AI: "it only gets more expensive." That is true for exactly one curve and false for the one most people think they mean. Price the thing properly — cost to run a fixed level of capability — and the line is in freefall. At the low-capability tier, inference price has fallen roughly 10x per year (a16z 'LLMflation', Nov 2024). To hit a fixed benchmark, the median collapse is about 50x per year (Epoch AI, 2025). GPT-4-equivalent inference is down on the order of 40-60x since March 2023, and H100 rental fell about 64% in a single year. Those are not rounding errors; they are one of the fastest deflations ever recorded in a production input. The confusion is semantic, not factual. "AI" is not one price; it is a quality-adjusted basket, and the basket's contents keep improving while its sticker shrinks. A task that cost a dollar to run in 2023 now costs a couple of cents at the same capability — and meanwhile the dollar buys you a far better model than existed in 2023. Both the numerator (capability) and the denominator (price) are moving in your favor. The intuition that "my AI bill keeps climbing" is real, but it is a statement about how much you are buying, not what a unit costs. One honest caveat keeps the bulls honest too: list prices may be partly loss-subsidized — set below true serving cost to capture share — so true unit cost almost certainly falls slower than the headline 50x. The deflation is genuine; its exact slope is contested. But the direction is not. This is the single most important reframing in the whole debate, and it is the one the bubble narrative reliably omits.

▸Low-tier inference: ~10x/year price decline (a16z 'LLMflation', Nov 2024).
▸Median ~50x/year to reach a fixed benchmark; GPT-4-equivalent down ~40-60x since Mar 2023 (Epoch AI, 2025).
▸H100 GPU rental fell ~64% in a single year — the hardware layer is deflating alongside the software.
▸Caveat: list prices are partly loss-subsidized, so true unit cost falls slower than the headline — direction certain, slope contested.
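
The "couple of cents" figure follows directly from the 40-60x range quoted above. A minimal sketch of that arithmetic, assuming a constant per-year fold when compounding; the function names are illustrative, not taken from any cited source:

```typescript
// Cost of a task after a cumulative price decline at fixed capability.
function costAfterDecline(initialCost: number, declineFold: number): number {
  if (declineFold <= 0) {
    throw new Error("declineFold must be positive");
  }
  return initialCost / declineFold;
}

// Cumulative decline implied by a constant per-year fold held for `years` years.
// The constant rate is an assumption of this sketch; real price curves are lumpy.
function cumulativeDecline(foldPerYear: number, years: number): number {
  return Math.pow(foldPerYear, years);
}

// The essay's example: a $1 task at GPT-4-equivalent capability, down ~40-60x.
const costAt40x: number = costAfterDecline(1.0, 40); // 0.025 dollars
const costAt60x: number = costAfterDecline(1.0, 60); // roughly 0.0167 dollars

// Two years at the low-tier ~10x/year rate compounds to 100x.
const twoYearFold: number = cumulativeDecline(10, 2);

console.log(`$1 task now costs ${(costAt60x * 100).toFixed(1)} to ${(costAt40x * 100).toFixed(1)} cents`);
console.log(`10x/year held for 2 years = ${twoYearFold}x cumulative`);
```

If list prices are partly loss-subsidized, as the caveat above notes, the true-cost fold is smaller than the list-price fold, so these outputs are a lower bound on what the task actually costs to serve.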

▸ section2.log[CLASSIFIED]

THE_JEVONS_TRAP

So if every unit is getting dramatically cheaper, why is everyone's invoice going up? Because two other curves are inflating fast, and a 19th-century coal economist already explained the result. William Stanley Jevons noticed that making coal-burning more efficient increased total coal consumption, because efficiency made coal worth using everywhere. The same mechanism is running through AI right now. First, the frontier itself keeps getting more expensive. Training cost grows roughly 2.4x per year (a doubling roughly every nine to ten months), heading past $1B per model by 2027 (Epoch AI). Cheaper-to-run does not mean cheaper-to-build; the bleeding edge is moving away from you even as the trailing edge collapses toward zero. A vivid marker: OpenAI's o1 runs around $60 per million output tokens, the same price as GPT-3 at its 2021 launch. Capability climbed enormously; the frontier price tag did not move. Second, and more decisively for your bill: reasoning and agentic models burn vastly more tokens per task. Reasoning output length has been growing on the order of 5x per year. When a model "thinks" before answering, or an agent loops through dozens of tool calls, token consumption explodes, and it does so precisely because each token got cheap enough to spend freely. Cheaper tokens are exactly why you now use ten times as many. This is why "AI gets ever more expensive" and "AI gets dramatically cheaper" are both true and not in contradiction. They measure different things: unit price versus total spend, trailing capability versus the frontier. The deflation is real; the rising bill is real; the Jevons effect is the bridge between them. Any analysis that picks one curve and declares victory is selling you half a fact.

▸Frontier training cost grows ~2.4x/year (doubling every ~9-10 months), heading past $1B/model by 2027 (Epoch AI).
▸o1 costs ~$60/M output tokens — identical to GPT-3's 2021 launch price, despite vastly higher capability.
▸Reasoning/agentic output length growing ~5x/year: cheaper tokens get spent in far larger volumes.
▸Net effect is a textbook Jevons rebound — unit price down, total consumption (and bills) up. Both stories true at once.
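
The rebound can be written as a one-line identity: total spend is unit price times tokens per task times number of tasks. The sketch below plugs in the essay's ~10x/year price decline and ~5x/year growth in tokens per task. The task-volume growth of 4x is a hypothetical input chosen for illustration, not a sourced figure.

```typescript
// Year-over-year multiplier on total spend, where
// spend = unit price * tokens per task * number of tasks.
function spendMultiplier(
  priceDeclineFold: number,    // unit price falls by this factor (10 means 10x cheaper)
  tokensPerTaskGrowth: number, // tokens per task grow by this factor
  taskVolumeGrowth: number     // number of tasks grows by this factor
): number {
  if (priceDeclineFold <= 0) {
    throw new Error("priceDeclineFold must be positive");
  }
  return (tokensPerTaskGrowth * taskVolumeGrowth) / priceDeclineFold;
}

// Task-volume growth at which the bill stays exactly flat.
function breakEvenTaskGrowth(priceDeclineFold: number, tokensPerTaskGrowth: number): number {
  return priceDeclineFold / tokensPerTaskGrowth;
}

// Same workload, longer reasoning traces: the per-task bill still halves.
const sameVolume: number = spendMultiplier(10, 5, 1);

// Hypothetical: cheap tokens pull in 4x as many tasks, and the bill doubles.
const moreTasks: number = spendMultiplier(10, 5, 4);

console.log(`flat task volume: bill x${sameVolume}`);
console.log(`4x task volume (hypothetical): bill x${moreTasks}`);
console.log(`break-even task growth: x${breakEvenTaskGrowth(10, 5)}`);
```

Under these inputs the bill rises only once task volume more than doubles, which is the Jevons condition in miniature: the invoice grows when demand expands faster than the unit price falls.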

▸ ai_bubble_index.ts[LIVE]

AI BUBBLE INDEX

A transparent model, not a live feed. Each indicator is a sourced 'bubble pressure' estimate (0 = healthy, 100 = mania). Type in your own number or change its weight — the needle recomputes live.

62 / 100

ELEVATED

INDICATORS (EDIT A SCORE OR ITS WEIGHT)

▸Valuation extension (CAPE): S&P 500 CAPE ~40, just below the ~44 dotcom peak, vs a ~17.6 long-run average. Weight 1.0×. Source: GMO, 'Valuing AI' (2025/26).
▸Index concentration: Mag-7 ~34% of the S&P 500 (was ~12% in 2015); drove ~54% of 2025 price gains vs ~44% of earnings growth. Weight 1.0×. Source: Apollo / Torsten Sløk (Jan 2026).
▸Capex vs realized AI revenue: ~$600–725B hyperscaler capex in 2026 (+62–77% YoY) vs <$50B direct 'AI-labeled' revenue (though AI-cloud grows 24–48%). Weight 1.0×. Source: CNBC (Feb 2026).
▸Frontier lab unit economics: OpenAI ~33% gross margin (vs 70–80% healthy SaaS); Anthropic projects ~50% → ~77% by 2028. Weight 1.0×. Source: Sacra / TechCrunch (2025/26).
▸Circular / vendor financing: Nvidia up to $100B into OpenAI; OpenAI–Oracle ~$300B; AMD warrant. Real chips, but heavy counterparty concentration. Weight 1.0×. Source: The Register (Nov 2025).
▸Capability-deflation rate: Fixed-quality inference falling ~10–50×/yr; GPT-4-equivalent down ~40–60× since Mar 2023; H100 rental −64% in a year. Weight 1.0×. Source: a16z 'LLMflation' / Epoch AI.
▸Enterprise AI ROI realization: ~95% of GenAI pilots show no measurable P&L impact (MIT); 42% of firms abandoned most initiatives in 2025 (vs 17% in 2024). Weight 1.0×. Source: MIT Project NANDA (Aug 2025).
▸GPU depreciation stretch: Schedules extended 3 → 5–6 yrs vs ~2–3 yr real economic life; ~$176B of profit possibly overstated 2026–28 (contested). Weight 1.0×. Source: Burry / Scion (Nov 2025).
▸AI-firm credit stress: Oracle 5-yr CDS ~198bps (record); ~$108B of AI debt raised in 2025; Bank of England flagged widening spreads. Weight 1.0×. Source: Bank of England FSR (Dec 2025).
▸Earnings quality of leaders: Nvidia ~47× P/E on ~$120B net income (vs Cisco ~200× in 2000); top-5 tech ~$350B combined free cash flow. Weight 1.0×. Source: Fortune (Jan 2026).
▸Code maintainability & skill drag: ~45% of AI-generated code ships insecure (no better with bigger models); juniors −17% comprehension; US 22–25yo dev jobs −20%. Weight 1.0×. Source: Veracode / Anthropic RCT (2025/26).
▸EU / Germany competitiveness drag: EU ~5% of global AI compute vs US ~80%; EU venture funding ~22% of US; industrial power ~2–3× US/China. Weight 1.0×. Source: Draghi competitiveness report (Sep 2024).

This index measures financial bubble pressure, not whether AI works. Readings reflect late-2025 / early-2026 data. Two indicators (capability deflation, earnings quality) deliberately point away from the bubble; averaging them in is the point. Each indicator cites its source.
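
The composite can be read as a weighted average of per-indicator scores. The sketch below assumes that form; the page does not publish its exact formula or the default per-indicator scores, so `compositeIndex`, the `Indicator` shape, and the three sample scores are all hypothetical.

```typescript
interface Indicator {
  name: string;
  score: number;  // 0 = healthy, 100 = mania
  weight: number; // relative weight; the widget defaults every weight to 1.0
}

// Weighted mean of indicator scores, with each score clamped to the 0-100 scale.
function compositeIndex(indicators: Indicator[]): number {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const indicator of indicators) {
    if (indicator.weight < 0) {
      throw new Error(`negative weight for ${indicator.name}`);
    }
    const clamped = Math.min(100, Math.max(0, indicator.score));
    weightedSum += clamped * indicator.weight;
    totalWeight += indicator.weight;
  }
  if (totalWeight === 0) {
    throw new Error("at least one indicator needs a positive weight");
  }
  return weightedSum / totalWeight;
}

// Hypothetical scores for illustration only; these are NOT the page's defaults.
const equalWeights: Indicator[] = [
  { name: "GPU depreciation stretch", score: 80, weight: 1.0 },
  { name: "Capability-deflation rate", score: 20, weight: 1.0 },
  { name: "AI-firm credit stress", score: 70, weight: 1.0 },
];

// Same scores, but depreciation is treated as three times as important.
const depreciationHeavy: Indicator[] = [
  { name: "GPU depreciation stretch", score: 80, weight: 3.0 },
  { name: "Capability-deflation rate", score: 20, weight: 1.0 },
  { name: "AI-firm credit stress", score: 70, weight: 1.0 },
];

console.log(compositeIndex(equalWeights).toFixed(1));
console.log(compositeIndex(depreciationHeavy).toFixed(1));
```

With equal weights the low deflation score pulls the composite down; tripling the depreciation weight pushes it back up. That is the sense in which averaging in the counter-signals is the point.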

▸ section3.log[CLASSIFIED]

THE_REVENUE_IS_REAL

The lazy bear case — "there's no revenue" — is the one part of the thesis that has actually been falsified. Revenue is compounding violently. OpenAI went from roughly $6B in 2024 to a $20B+ annualized run-rate in 2025: +233% year-over-year, confirmed by CFO Sarah Friar. Anthropic reportedly ran from about $9B to a claimed $40B+ run-rate in a matter of months. These are not vapor metrics; they are among the fastest revenue ramps in software history. The economics underneath, however, are genuinely ugly — and this is where honest bears should plant their flag. OpenAI operates on roughly 33% gross margins (healthy SaaS is 70-80%), lost on the order of $11-13.5B in H1 2025, and its own investor documents project about $115B of cumulative cash burn through 2029 (Fortune, Nov 2025). That is not a rounding problem; that is a business that has not yet proven its unit economics close. Which brings us to the most-repeated and most-abused stat: "under $50B of AI revenue against more than $1T of investment." It is apples-to-oranges twice over. It pits a single year of narrowly-defined "AI-labeled" revenue against multi-year cumulative capex, and it ignores the AI-driven cloud growth sitting one ledger over — AWS +24%, Azure +39%, Google Cloud +48%. A lot of "AI revenue" is showing up as cloud revenue and getting excluded by definition. The real question was never whether demand exists; it plainly does and it is accelerating. The real question is timing of ROI: whether the labs reach durable margins before the burn runs out the clock. That is a much harder, much more interesting bet than "fake revenue."

▸OpenAI: ~$6B (2024) to $20B+ run-rate (2025), +233% YoY, per CFO Sarah Friar; Anthropic to a claimed $40B+ run-rate.
▸But ~33% gross margin, ~$11-13.5B H1 2025 loss, and ~$115B projected cumulative burn through 2029 (Fortune, Nov 2025).
▸The '<$50B vs >$1T' framing compares one year of narrow revenue to multi-year cumulative capex.
▸It excludes AI-driven cloud uplift — AWS +24%, Azure +39%, Google Cloud +48% — where much AI revenue actually lands.
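
Two of the figures above are simple enough to check by hand. A small sketch, using only the essay's own round numbers (in $B) and applying the ~33% margin to the $20B run-rate purely for illustration:

```typescript
// Year-over-year growth in percent.
function yoyGrowthPct(previous: number, current: number): number {
  if (previous <= 0) {
    throw new Error("previous must be positive");
  }
  return ((current - previous) / previous) * 100;
}

// Gross profit left after cost of revenue, given a gross margin in percent.
function grossProfit(revenue: number, grossMarginPct: number): number {
  return revenue * (grossMarginPct / 100);
}

// ~$6B (2024) to a ~$20B run-rate (2025).
const growth: number = yoyGrowthPct(6, 20);

// The same $20B at the reported ~33% margin versus a 70-80% SaaS-style margin.
const atReportedMargin: number = grossProfit(20, 33);
const atSaasLow: number = grossProfit(20, 70);
const atSaasHigh: number = grossProfit(20, 80);

console.log(`growth: +${growth.toFixed(0)}%`);
console.log(`gross profit at 33%: $${atReportedMargin.toFixed(1)}B`);
console.log(`gross profit at 70-80%: $${atSaasLow.toFixed(0)}B to $${atSaasHigh.toFixed(0)}B`);
```

The gap between the second and third lines is the unit-economics problem in dollar terms: the same revenue would carry more than twice the gross profit at a typical software margin.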

▸ section4.log[CLASSIFIED]

THE_ACCOUNTING_QUESTION

When the bear thesis lost "there's no revenue," it migrated somewhere far more sophisticated: the depreciation schedule. This is the argument worth taking seriously, and it is the one most readers have never heard articulated cleanly. Here is the mechanism. Hyperscalers buy enormous fleets of GPUs and spread that cost across their useful life on the income statement. Many have extended that assumed life from roughly 3 years to 5-6 years. If a GPU's true economic life — the span over which it is competitive and rentable — is closer to 2-3 years, then stretching depreciation over 5-6 years parks too little expense in each period, which makes reported profit look bigger than the economics justify. Michael Burry's Scion argued this could understate depreciation by about $176B across 2026-28, potentially overstating some firms' operating income by more than 20% (Motley Fool, Nov 2025). The honest part: this is contested, and it cuts both ways. GPUs frequently stay rentable for years after a newer chip ships — an A100 is not worthless the day an H100 lands; it simply moves down the price curve. And the firms themselves disagree under identical technology: in 2025, Amazon shortened its server depreciation while Meta extended its, looking at the same hardware. If useful life were obvious, they would not be diverging. What makes this the most credible bear thesis is that it doesn't require demand to collapse or revenue to be fake. It only requires that reported earnings — the very numbers the bulls lean on (see the next section) — be softer than they appear. The deflation engine and the depreciation question are linked: the faster capability deflates, the faster last year's GPU loses pricing power, and the more aggressive a 5-6 year schedule starts to look.

▸Many hyperscalers extended GPU depreciation from ~3 to ~5-6 years vs a possible ~2-3 year economic life.
▸Burry/Scion: ~$176B of understated depreciation across 2026-28, potentially overstating some operating income by >20% (Nov 2025).
▸Contested and two-sided: older GPUs often stay rentable for years; the thesis depends on contestable useful-life assumptions.
▸Tell-tale divergence: in 2025 Amazon shortened while Meta extended depreciation under identical technology.
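
The mechanism is easiest to see with numbers. The sketch below assumes straight-line depreciation with zero salvage value, which is a simplification, and uses a hypothetical $90B GPU fleet and $90B of reported operating income. None of these inputs come from a company filing.

```typescript
// Straight-line annual depreciation with zero salvage value (assumed).
function annualDepreciation(cost: number, usefulLifeYears: number): number {
  if (usefulLifeYears <= 0) {
    throw new Error("usefulLifeYears must be positive");
  }
  return cost / usefulLifeYears;
}

// Expense missing from each early year when the book life exceeds the economic life.
function understatedAnnualExpense(
  cost: number,
  bookLifeYears: number,
  economicLifeYears: number
): number {
  return annualDepreciation(cost, economicLifeYears) - annualDepreciation(cost, bookLifeYears);
}

// Percent by which reported operating income exceeds the adjusted figure.
function incomeOverstatementPct(reportedIncome: number, understatedExpense: number): number {
  const adjustedIncome = reportedIncome - understatedExpense;
  if (adjustedIncome <= 0) {
    throw new Error("adjusted income must be positive");
  }
  return (understatedExpense / adjustedIncome) * 100;
}

// Hypothetical inputs in $B.
const fleetCost = 90;
const reportedOperatingIncome = 90;

// 6-year book life versus 3-year economic life: 30 - 15 = 15 per year.
const gap: number = understatedAnnualExpense(fleetCost, 6, 3);

// 15 of understated expense against 75 of adjusted income.
const overstatement: number = incomeOverstatementPct(reportedOperatingIncome, gap);

console.log(`understated expense: $${gap}B per year`);
console.log(`operating income overstated by ${overstatement.toFixed(0)}%`);
```

The gap only flatters earnings during the asset's economic life. After that, the longer schedule keeps charging expense for hardware that no longer earns its keep, so the question is one of timing rather than of total cost.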

▸ section5.log[CLASSIFIED]

NOT_2000

Every market panic reaches for the dotcom analogy, and on the single dimension that matters most it does not hold: in 2000 the leaders' prices ran far ahead of their profits, and today the profits are there. Nvidia trades around 47x earnings on roughly $216B of FY2026 revenue (+65% YoY) and about $120B of net income. Cisco, the Nvidia of its day, traded near 200x at the 2000 peak. That is not a cosmetic difference; a 47x multiple backed by $120B of real net income is a different animal from a 200x multiple backed by hope. The attribution math reinforces it. The top-5 tech names generate roughly $350B of combined free cash flow. And about 78% of the tech sector's 2021-2025 return came from earnings growth, with only ~9% from multiple expansion (Roundhill), the near-inverse of 1995-99, when re-rating did the heavy lifting. Worth flagging the source: Roundhill is an AI-ETF issuer, so read with that lens. But the underlying attribution checks out, which is why it belongs in an honest brief rather than a cheerleading one. This is the bulls' best point, and it earns a counterweight. The earnings-quality case rests entirely on reported earnings being accurate, which is exactly what the depreciation question puts in play. And the funding structure has grown fragile in a distinctly un-2000 way: vendor financing is circular (Nvidia into OpenAI, prepurchases into CoreWeave, the OpenAI-Oracle web), the buildout is increasingly debt-funded (~$108B raised in 2025), and credit stress is showing. Oracle's 5-year CDS hit a record ~198bps, with the Bank of England flagging widening spreads as the channel through which an AI repricing spills into broader debt markets. Not 2000. But not nothing, either.

▸Nvidia ~47x P/E on ~$216B revenue (+65% YoY) and ~$120B net income, vs Cisco's ~200x at the 2000 peak.
▸Top-5 tech ~$350B combined FCF; ~78% of 2021-25 tech return from earnings, only ~9% from multiple expansion (Roundhill, an AI-ETF issuer).
▸Earnings-quality case is hostage to the depreciation question — it assumes reported profits are accurate.
▸New-style fragility: circular vendor financing, ~$108B debt raised in 2025, Oracle 5-yr CDS at a record ~198bps (BoE flagged).
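
To make the 47x versus 200x contrast concrete, the sketch below converts each multiple into an implied market value and an earnings yield, using the essay's round ~$120B net income figure. The results are back-of-envelope values implied by those round numbers, not quoted market caps.

```typescript
// Market value implied by a price/earnings multiple (same units as netIncome).
function impliedMarketCap(netIncome: number, peRatio: number): number {
  return netIncome * peRatio;
}

// Earnings yield in percent: annual earnings per 100 units of price paid.
function earningsYieldPct(peRatio: number): number {
  if (peRatio <= 0) {
    throw new Error("peRatio must be positive");
  }
  return 100 / peRatio;
}

const netIncomeBn = 120; // essay's round figure, in $B

const capAt47x: number = impliedMarketCap(netIncomeBn, 47);   // 5,640 ($B)
const capAt200x: number = impliedMarketCap(netIncomeBn, 200); // 24,000 ($B)

console.log(`47x: ~$${(capAt47x / 1000).toFixed(2)}T implied, yield ${earningsYieldPct(47).toFixed(2)}%`);
console.log(`200x: ~$${(capAt200x / 1000).toFixed(0)}T implied, yield ${earningsYieldPct(200).toFixed(2)}%`);
```

The same $120B of earnings would have to support more than four times the valuation at a Cisco-style multiple. Both numbers inherit whatever softness the depreciation question introduces into reported earnings.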

▸ section6.log[CLASSIFIED]

WHO_MAINTAINS_THE_CODE

The quietest risk in this whole story doesn't live on a balance sheet. It lives in the codebases AI is now writing at scale, and in the thinning pipeline of humans who will have to maintain them. The right mental model, borrowed from DORA's 2025 framing, is that AI is an amplifier: it raises delivery throughput while still correlating negatively with stability. It makes good teams faster and shaky teams shakier — it does not make the underlying engineering judgment for you. The evidence is uncomfortably specific. Veracode found that about 45% of AI-generated code ships with security flaws, and — this is the part that should end the "just wait for a bigger model" reflex — it does not improve with model size (Jul 2025). Bigger models write more impressive code that is just as likely to be insecure. METR's randomized trial found experienced open-source developers were 19% slower using early-2025 AI tools while believing afterward they had been 20% faster — a roughly 39-point perception-reality gap that should make you discount every self-reported productivity survey propping up the hype. Then the human layer. Anthropic's own RCT found juniors who leaned on AI scored 17% lower on comprehension (Jan 2026) — the tool quietly substitutes for the learning that turns a junior into someone who can debug it at 2am. And US developer employment for 22-25-year-olds is down about 20%. Part of that is hiring economics, not just AI. But the net picture is a world accumulating machine-written code faster than it is growing the engineers who understand it. The bill for that doesn't arrive next quarter. It arrives in three years, as a maintenance liability nobody depreciated.

▸DORA 2025: AI raises throughput but still correlates negatively with stability — an amplifier, not a fix.
▸~45% of AI-generated code ships insecure, and it does NOT improve with model size (Veracode, Jul 2025).
▸METR RCT: experienced devs 19% slower while feeling 20% faster — a ~39-point perception gap undercutting productivity surveys.
▸Anthropic RCT: juniors -17% comprehension (Jan 2026); US 22-25yo dev employment down ~20% — the future maintainers are thinning.

▸ section7.log[CLASSIFIED]

THE_DARK_FIBER_LESSON

If you want the right historical analogy, skip the dotcom comparison and go down a layer — to the fiber-optic cable buried under it. In the late 1990s, telecoms convinced themselves the internet would need near-infinite bandwidth and laid staggering quantities of fiber. By 2005, roughly 85% of that fiber was still unused — "dark," lit by nothing. The overbuild was so severe it helped bankrupt the companies that did the building. And here is the twist the doomers and the boosters both miss: the capacity was eventually fully used, and that glut of cheap fiber is part of what made the broadband internet, streaming, and the cloud economically possible. The builders went broke. The technology was transformative. Both things are true, and they are not in tension — because they answer different questions. That is the lesson to hold onto. A bubble popping is a statement about capital and timing: who financed the buildout, at what multiples, with how much leverage, and whether they survive the gap between spending and payoff. Whether the technology is transformative is a separate statement entirely. Fiber proved you can lose every dollar and still be directionally right about the future. Map it onto AI and the symmetry is clean. The deflation engine (Section 1) is the case that the underlying capability is real and getting radically cheaper — the fiber will get lit. The depreciation question, the circular financing, and the ~$115B burn are the case that some specific builders may not survive to see it. You can believe both. In fact, the dark-fiber precedent suggests you probably should. "Will AI matter?" and "will these valuations hold?" are different bets, and conflating them is how people end up wrong twice.

▸1990s telecoms overbuilt fiber so severely that ~85% sat unused ("dark") as late as 2005.
▸The overbuild bankrupted many builders — a genuine financial bust by any definition.
▸Yet that cheap, idle capacity later enabled broadband, streaming, and the cloud — the tech proved transformative.
▸Lesson: 'the bubble pops' and 'the technology matters' are separate questions; the deflation engine and the burn can both be true.

▸ section8.log[CLASSIFIED]

HOW_TO_READ_THE_INDEX

The Bubble Index attached to this piece defaults to roughly 62 out of 100 — and the most important thing to understand is what that number is and is not. It is a weighted read of financial pressure: valuation extension (S&P 500 CAPE near 40 versus a long-run average around 17.6), index concentration, the capex-versus-revenue gap, credit stress, depreciation risk, and so on. It is not a measure of whether AI works. A reading of 62 says "meaningful financial-bubble pressure, not a no-fundamentals mania" — which is precisely the honest middle this whole analysis lands on. Note the two indicators that point the other way, because they are the discipline that keeps 62 from drifting toward 90. Capability-deflation scores low on the bubble scale (the deflation engine is the single healthiest signal in the dataset). And earnings quality scores low too — Nvidia's real $120B of net income and the top-5's ~$350B of free cash flow are genuine ballast. Both are caveated: deflation may be partly loss-subsidized, and earnings quality rests on depreciation assumptions the bears dispute. The counter-signals come with their own counter-signals. That is the point. Which is why the Index is interactive. You can type in your own score per indicator and re-weight each one — if you think the depreciation question is the whole game, crank it; if you think enterprise ROI is a measurement artifact, dial it down. The composite will move accordingly. Treat the default 62 as a starting hypothesis, not a verdict. The number is an argument, not an oracle. Its entire job is to force the question the headlines dodge: which cost curve are you actually looking at, and which fragility do you actually believe? Answer those two honestly and you will be better calibrated than almost everyone shouting about the bubble — in either direction.

▸Default reading ~62/100: meaningful financial-bubble pressure, not a no-fundamentals mania.
▸It measures financial pressure (valuation, concentration, credit, depreciation) — not whether AI is useful.
▸Two counter-signals pull it down: capability-deflation and leaders' earnings quality — each with its own caveat.
▸Interactive by design: set your own score and weight per indicator. The number is an argument, not an oracle.
