Cloud FinOps Wasn’t Built for AI Costs. The Tokenomics Foundation Is.


J.R. Storment stood in front of 2,500 FinOps practitioners in San Diego on June 9 and said something that would have been heresy two years ago: the discipline he helped create isn’t enough anymore. “Tokens are the atomic unit of AI,” he told the FinOps X 2026 keynote audience, “and the frameworks we built for cloud don’t govern them.”

Then he announced the fix. The Linux Foundation is launching the Tokenomics Foundation, a new standards body dedicated to open benchmarks, specifications, and best practices for AI cost management. Accenture, Booking.com, Google Cloud, IBM, JPMorgan Chase, Microsoft, Oracle, Salesforce, SAP, and ServiceNow have all signed on as founding supporters.

This is the most significant structural change to the FinOps ecosystem since the FinOps Foundation itself joined the Linux Foundation in 2020. And it arrived not a moment too soon.

The Great Token Panic

The urgency behind the Tokenomics Foundation had a name at FinOps X: the Great Token Panic.

TechCrunch reported that enterprises across industries are blowing through AI budgets at rates nobody forecast. Uber exhausted its entire 2026 AI coding budget by April. One unnamed company accumulated a $500 million Claude bill after failing to set employee usage limits. Priceline watched one engineer spend $40,000 on tokens in a single month and saw its Cursor contract renewal costs jump four- to fivefold.

Per-token prices fell steadily from 2023 through late 2025. Then they plateaued. As nOps noted in its FinOps X recap, token prices hit a floor in November 2025 and haven’t budged since. Organizations that built their 2026 budgets assuming continued deflation got blindsided.

The consumption side is worse. Jellyfish’s data shows per-developer token consumption rose 18.6x in nine months. Goldman Sachs projects global token usage will grow 24x by 2030, reaching 120 quadrillion tokens monthly. The industry went from “go fast and tokenmax” to “we need guardrails” in under a year.
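To put the consumption figure in budget terms, the sketch below converts the 18.6x nine-month increase into an implied monthly growth rate and projects spend forward under flat per-token prices. It assumes smooth compounding, which real usage will not follow exactly, and the $10,000 starting spend is a made-up number for illustration.

```python
# Illustrative arithmetic only: assumes smooth monthly compounding.
GROWTH_MULTIPLE = 18.6  # per-developer token growth cited above
MONTHS = 9


def implied_monthly_growth(multiple: float, months: int) -> float:
    """Return the constant monthly growth rate that yields `multiple` after `months`."""
    return multiple ** (1 / months) - 1


def project_spend(start_spend: float, monthly_growth: float, months: int) -> list:
    """Project monthly spend with flat per-token prices, so spend tracks token volume."""
    return [start_spend * (1 + monthly_growth) ** m for m in range(months + 1)]


rate = implied_monthly_growth(GROWTH_MULTIPLE, MONTHS)

# Hypothetical starting point: $10,000 per month in token spend.
projection = project_spend(10_000.0, rate, MONTHS)

print(f"Implied monthly growth: {rate:.1%}")
print(f"Month 9 spend: ${projection[-1]:,.0f}")
```

Under these assumptions the implied growth is roughly 38% per month. With prices no longer falling, nothing offsets that curve, so a budget sized on month-zero usage is exhausted within the first quarter.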

Why Cloud FinOps Tools Can’t Solve This

Pooja Kumar from Prudential Financial put it bluntly at the keynote: “Traditional FinOps is dead.”

That’s deliberate provocation, but the substance is real. Cloud FinOps tracks resource usage against capacity: compute hours, storage gigabytes, network transfers. AI costs operate on a different axis entirely.

Gartner’s Marco Meinardi explained at FinOps X that AI cost drivers are fundamentally external. “Customers and how they use our AI application, how they prompt them, is going to influence our costs,” he said. A single customer changing prompt patterns can swing a company’s inference bill by thousands of dollars in a day. No cloud cost tool is instrumented for that.

Accenture’s Grant Byrum described the forecasting problem this way: “In AI, costs are tied to how the work is being done and not physical resources.” Historical consumption patterns, the backbone of cloud cost forecasting, are nearly useless for AI workloads where a prompt change or model swap rewrites the cost structure overnight.

Nishant Gupta, Salesforce’s Chief Availability Officer, framed the visibility gap in starker terms: “Tokens [are] an abstract quantity. It’s very hard to relate tokens to a business outcome.” Most enterprises only discover budget overruns when the monthly invoice arrives, up to 30 days too late.

The Nine Cost Layers Most Teams Can’t See

The most actionable framework to come out of FinOps X is the nine layers of AI cost. Most FinOps teams have visibility into layer one and almost nothing else.

The full stack:

  1. Token consumption. What most dashboards track. Input and output tokens per API call.
  2. Retrieval and data. Vector database queries, embedding generation, RAG pipeline storage and compute. These costs scale with knowledge base size, not token volume.
  3. Orchestration. Agent frameworks, routing logic, retry handling. Every agent chain multiplies the base inference cost by the number of steps in the chain.
  4. Inference infrastructure. GPU instance costs, whether on a hyperscaler or neocloud. Includes idle capacity, autoscaling overhead, and commitment premiums.
  5. KV cache. The memory used to maintain conversation context. Long conversations and large context windows drive this cost independent of output tokens.
  6. Evaluation and monitoring. Model performance testing, output quality scoring, A/B testing infrastructure. Often overlooked because it’s treated as engineering overhead rather than AI cost.
  7. Governance. Access controls, policy enforcement, audit logging, compliance tooling. Regulated industries carry significantly higher governance costs.
  8. Human labor. Prompt engineering, fine-tuning, data curation, model evaluation. The labor cost of making AI work is often larger than the infrastructure cost.
  9. Failure and waste. Rework caused by hallucinations, abandoned experiments, duplicate inference calls, capacity sitting idle between spikes.
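The nine layers above can be rolled up into a simple model to see how much of total spend a dashboard actually covers. The sketch below is one possible shape, not a standard: the class and function names are hypothetical, every dollar figure is invented for illustration, and the retry handling in the layer-three helper is a simplifying assumption (each step is retried at most once, at a fixed rate).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerCost:
    layer: int          # 1..9, matching the list above
    name: str
    monthly_usd: float  # invented figures, for illustration only
    tracked: bool       # True if existing dashboards already capture it


def tracked_share(costs: list) -> float:
    """Fraction of total monthly spend that is visible in current tooling."""
    total = sum(c.monthly_usd for c in costs)
    if total == 0:
        return 0.0
    return sum(c.monthly_usd for c in costs if c.tracked) / total


def chain_inference_cost(cost_per_call: float, steps: int, retry_rate: float = 0.0) -> float:
    """Layer 3: expected inference cost of one agent chain.

    Assumes every step makes one model call and is retried at most once
    with probability `retry_rate`.
    """
    return cost_per_call * steps * (1 + retry_rate)


costs = [
    LayerCost(1, "Token consumption", 30_000, True),
    LayerCost(2, "Retrieval and data", 6_000, False),
    LayerCost(3, "Orchestration", 4_000, False),
    LayerCost(4, "Inference infrastructure", 20_000, True),
    LayerCost(5, "KV cache", 3_000, False),
    LayerCost(6, "Evaluation and monitoring", 5_000, False),
    LayerCost(7, "Governance", 4_000, False),
    LayerCost(8, "Human labor", 25_000, False),
    LayerCost(9, "Failure and waste", 3_000, False),
]

print(f"Tracked share of spend: {tracked_share(costs):.0%}")
print(f"5-step chain, 10% retries: ${chain_inference_cost(0.02, 5, 0.10):.3f}")
```

Flipping the `tracked` flag as instrumentation improves gives a running measure of how much of the stack is still invisible.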

In a fractional COO engagement last year, the client’s tracked AI costs (token consumption and GPU instances, layers one and four) accounted for about 62% of actual total AI spending. The other seven layers made up the remaining 38%, which means true spending ran roughly 60% above the tracked figure (1 / 0.62 ≈ 1.61). That points the same way as earlier analysis on this site showing AI tokens cost roughly 40% more than their sticker price once the surrounding infrastructure is included.

What the Tokenomics Foundation Will Actually Build

The foundation’s technical committee will focus on three deliverables.

Open specifications for token cost measurement. The goal is extending the FOCUS specification, already the standard billing format for cloud providers, to cover AI token spending. This means a common schema for reporting token costs across OpenAI, Anthropic, Google, and self-hosted models.
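As a rough picture of what a common schema buys, the sketch below maps two differently shaped usage exports into one record type so they can be compared directly. Everything here is hypothetical: the field names are not taken from the FOCUS specification or from any provider’s billing export, and "vendor_a" and "vendor_b" are invented stand-ins.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class TokenCostRecord:
    """Hypothetical normalized row; not the FOCUS schema."""
    provider: str
    model: str
    usage_date: date
    input_tokens: int
    output_tokens: int
    billed_cost: Decimal
    currency: str = "USD"


def from_vendor_a(row: dict) -> TokenCostRecord:
    # Invented export shape: flat fields, cost in integer cents.
    return TokenCostRecord(
        provider="vendor_a",
        model=row["model"],
        usage_date=date.fromisoformat(row["day"]),
        input_tokens=row["prompt_tokens"],
        output_tokens=row["completion_tokens"],
        billed_cost=Decimal(row["cost_cents"]) / 100,
    )


def from_vendor_b(row: dict) -> TokenCostRecord:
    # Invented export shape: nested usage object, cost as a decimal string.
    return TokenCostRecord(
        provider="vendor_b",
        model=row["model_id"],
        usage_date=date.fromisoformat(row["date"]),
        input_tokens=row["usage"]["in"],
        output_tokens=row["usage"]["out"],
        billed_cost=Decimal(row["usd"]),
    )


def cost_per_million_tokens(rec: TokenCostRecord) -> Decimal:
    """Blended cost per million tokens (input plus output)."""
    total = rec.input_tokens + rec.output_tokens
    return rec.billed_cost / total * 1_000_000 if total else Decimal(0)


records = [
    from_vendor_a({"model": "a-large", "day": "2026-06-01",
                   "prompt_tokens": 400_000, "completion_tokens": 100_000,
                   "cost_cents": 1250}),
    from_vendor_b({"model_id": "b-pro", "date": "2026-06-01",
                   "usage": {"in": 250_000, "out": 50_000}, "usd": "9.00"}),
]

for rec in records:
    print(rec.provider, rec.model, cost_per_million_tokens(rec))
```

Costs are held as `Decimal` rather than `float` so that summing many small charges does not accumulate rounding error.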

Benchmarks for token economics. Vendor-neutral baselines for cost per inference, cost per agent action, and cost per business outcome across model types and deployment patterns. Without these, enterprises have no way to evaluate whether their AI spending is efficient or just expensive.
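The three baselines named above are all unit costs, so the underlying arithmetic is a division over whatever counts a team can actually measure. The sketch below shows that calculation with invented totals; the names are hypothetical, and what counts as a "business outcome" (a resolved ticket, a merged pull request) is an assumption each team has to define for itself.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PeriodTotals:
    total_cost_usd: float
    inference_calls: int
    agent_actions: int
    business_outcomes: int  # team-defined, e.g. resolved support tickets


def unit_cost(total_usd: float, count: int) -> Optional[float]:
    """Cost per unit, or None when nothing was counted in the period."""
    return total_usd / count if count > 0 else None


def unit_economics(t: PeriodTotals) -> Dict[str, Optional[float]]:
    return {
        "cost_per_inference": unit_cost(t.total_cost_usd, t.inference_calls),
        "cost_per_agent_action": unit_cost(t.total_cost_usd, t.agent_actions),
        "cost_per_outcome": unit_cost(t.total_cost_usd, t.business_outcomes),
    }


# Invented monthly totals for illustration.
june = PeriodTotals(
    total_cost_usd=12_000.0,
    inference_calls=600_000,
    agent_actions=40_000,
    business_outcomes=8_000,
)

metrics = unit_economics(june)
for name, value in metrics.items():
    print(name, "n/a" if value is None else f"${value:.2f}")
```

A vendor-neutral benchmark would supply the comparison values for these ratios; until one exists, tracking them month over month is the available substitute.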

Certification programs. A new credential track for AI FinOps, distinct from the existing FinOps Foundation certification. The skills required to manage token spend are different enough from those needed for cloud spend that the Tokenomics Foundation believes a separate qualification is warranted.

The founding supporter list is notably broad. Having Microsoft, Google Cloud, Oracle, and IBM on the same standards body, alongside enterprises like JPMorgan Chase and Booking.com, gives the foundation credibility that a vendor-led initiative would lack.

What This Means for FinOps Practitioners

Three immediate implications.

Your scope just expanded. If you manage cloud costs today, AI token costs are landing on your desk whether you asked for them or not. The State of FinOps 2026 report already showed AI as the top forward-looking priority for FinOps teams. The Tokenomics Foundation formalizes that expansion.

Instrumentation becomes the job. Gartner’s Meinardi was direct: organizations must “build metadata into applications using instrumentation and telemetry” to connect AI expenses with business results. This is a fundamentally different skill from tagging cloud resources. It requires partnership with engineering teams building AI features, not just operations teams running infrastructure.
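One way to picture "metadata built into applications" is a usage log in which every model call is recorded with the feature and customer that triggered it, so cost can later be grouped by either dimension. The sketch below is a minimal in-memory version under stated assumptions: the class names are hypothetical, the per-million-token prices are placeholders rather than any vendor’s rates, and a real system would emit these events to a telemetry pipeline instead of a list.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Placeholder prices per million tokens; substitute real contract rates.
PRICE_PER_M_INPUT = 2.00
PRICE_PER_M_OUTPUT = 8.00


@dataclass(frozen=True)
class UsageEvent:
    feature: str       # product feature that made the call
    customer_id: str   # who triggered it
    input_tokens: int
    output_tokens: int

    @property
    def cost_usd(self) -> float:
        return (self.input_tokens * PRICE_PER_M_INPUT
                + self.output_tokens * PRICE_PER_M_OUTPUT) / 1_000_000


@dataclass
class UsageLog:
    events: list = field(default_factory=list)

    def record(self, feature: str, customer_id: str,
               input_tokens: int, output_tokens: int) -> UsageEvent:
        """Call this wherever the application receives a model response."""
        event = UsageEvent(feature, customer_id, input_tokens, output_tokens)
        self.events.append(event)
        return event

    def cost_by(self, dimension: str) -> dict:
        """Total cost grouped by 'feature' or 'customer_id'."""
        totals = defaultdict(float)
        for event in self.events:
            totals[getattr(event, dimension)] += event.cost_usd
        return dict(totals)


log = UsageLog()
log.record("support_chat", "acme", 2_000, 500)
log.record("support_chat", "globex", 1_000, 1_000)
log.record("doc_summary", "acme", 10_000, 1_000)

print(log.cost_by("feature"))
print(log.cost_by("customer_id"))
```

The point of the design is that the attribution happens at call time, inside the application, which is why this work lands with the engineering teams building AI features rather than with infrastructure operations.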

Forecasting models need rebuilding. Accenture’s Byrum advocates for “use case forecasting,” which projects costs based on planned user counts, interaction volumes, and deployment releases rather than historical consumption. This is closer to product management than financial analysis. FinOps practitioners who can bridge that gap will be the ones who keep their organizations’ AI budgets intact.
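A use case forecast of the kind described above can be sketched as a bottom-up calculation: planned users, times interactions per user, times tokens per interaction, times price. The version below is an illustration under stated assumptions, not Byrum’s or Accenture’s method; the names, the token counts, the rollout plan, and the per-million-token prices are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    interactions_per_user: float        # per month
    input_tokens_per_interaction: int
    output_tokens_per_interaction: int


def forecast_monthly_cost(use_case: UseCase, users: int,
                          price_in_per_m: float, price_out_per_m: float) -> float:
    """Bottom-up monthly cost for a planned user count."""
    interactions = users * use_case.interactions_per_user
    input_cost = interactions * use_case.input_tokens_per_interaction * price_in_per_m
    output_cost = interactions * use_case.output_tokens_per_interaction * price_out_per_m
    return (input_cost + output_cost) / 1_000_000


assistant = UseCase(
    name="support assistant",
    interactions_per_user=40,
    input_tokens_per_interaction=1_500,
    output_tokens_per_interaction=400,
)

# Invented rollout plan: users live after each planned release.
rollout = {"July": 1_000, "August": 2_500, "September": 5_000}

forecast = {
    month: forecast_monthly_cost(assistant, users,
                                 price_in_per_m=2.00, price_out_per_m=8.00)
    for month, users in rollout.items()
}

for month, cost in forecast.items():
    print(f"{month}: ${cost:,.0f}")
```

Because every input is a planning assumption rather than a historical trend, a prompt change or model swap becomes an edit to one parameter instead of a break in the forecast.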

Storment himself signaled where the field is headed: FinOps X will evolve into the broader “Tokenomicon” conference in 2027. The message is clear. Token economics isn’t a subset of cloud FinOps. It’s becoming the main event.


Ty Sutherland is the Chief Editor at Kost Kompass. With 25 years of experience in enterprise strategy and financial management, Ty Sutherland is the driving force behind kostkompass.com. Specializing in helping Finance and Technology Managers optimize costs in servers, cloud, and SaaS, Ty combines technical acumen with financial discipline to deliver actionable insights for cost-effective solutions.
