AI’s free lunch ends as token costs bite

Published on the 18/06/2026 | Written by Heather Wright

Linux Foundation targets AI cost chaos…

“The AI free lunch is starting to look not so free anymore,” says Arun Chandrasekaran.

That warning from the Gartner distinguished vice president analyst encapsulates a turning point for enterprise AI, and comes as the Linux Foundation moves to tackle rising AI costs through better token cost management.

The Tokenomics Foundation, launched by the Linux Foundation this month, will focus on establishing open industry standards, benchmarks and best practices for the economics of AI infrastructure. Its launch comes as enterprises grapple with rising costs and uncertain returns from generative AI and agentic AI.

At the centre of that shift is a fundamental change in how AI is priced – and how organisations experience its costs.

Unlike traditional enterprise software, which is typically priced per user or per licence, AI is increasingly consumed on a usage basis. Costs are tied to how much compute is used, measured through tokens – the basic unit of data that LLMs process and generate. The more complex a task, the more tokens it requires.
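To make the difference concrete, here is a minimal sketch of usage-based pricing in Python. The prices and token counts are invented for illustration and do not reflect any vendor's actual rates; the names `TokenPrice` and `request_cost` are hypothetical, and the separate rates for input and output tokens are an assumption about how a price card might be structured.

```python
from dataclasses import dataclass


@dataclass
class TokenPrice:
    """Hypothetical price card, in US dollars per million tokens."""
    input_per_million: float
    output_per_million: float


def request_cost(price: TokenPrice, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: tokens sent in and tokens generated are billed separately."""
    return (input_tokens * price.input_per_million
            + output_tokens * price.output_per_million) / 1_000_000


def monthly_usage_cost(price: TokenPrice, requests: int,
                       input_tokens: int, output_tokens: int) -> float:
    """Usage-based bill: grows with every request made."""
    return requests * request_cost(price, input_tokens, output_tokens)


def monthly_seat_cost(seats: int, price_per_seat: float) -> float:
    """Seat-based bill: fixed, regardless of how much each user does."""
    return seats * price_per_seat


# Illustrative numbers only (assumptions, not real prices).
price = TokenPrice(input_per_million=3.0, output_per_million=15.0)
simple_task = request_cost(price, input_tokens=500, output_tokens=200)
complex_task = request_cost(price, input_tokens=20_000, output_tokens=4_000)
print(f"simple task:  ${simple_task:.4f}")
print(f"complex task: ${complex_task:.4f}")
```

Under these assumed rates the short request costs about half a cent and the long one about 12 cents. The seat-based bill stays flat however many requests are made, while the usage-based bill scales with both request volume and task complexity.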

And while per-token costs fell heavily during 2023-2025, the Linux Foundation says they have now levelled off and new model token prices are rising, making AI the largest and fastest-growing line item on enterprise technology budgets.

Dr Amanda Williamson, Deloitte New Zealand AI Institute Director, told iStart that AI is opening a whole new era of calculating costs and introducing a level of unpredictability unfamiliar to many enterprise buyers.

“What we’ve been observing around New Zealand is a lot of leaders have been rolling out AI tools, and AI is a tricky little beast because you roll it out and suddenly you get surprises with consumption costs around it,” she says.

“As usage continues, sometimes we have no idea how much it’s going to cost over the next six months or even the next six days because it’s based on usage as opposed to seats,” Williamson says.

This shift – commonly referred to as tokenomics – means organisations are no longer buying fixed capacity. Instead, they’re paying for every interaction, query and workflow processed by AI systems.

For some, that’s hurting. Uber’s CTO Praveen Neppalli Naga recently revealed that the company had exhausted its entire 2026 budget on AI coding tools in just four months, driven by ‘token-maxxing’, where engineers were encouraged to maximise usage. Another, unnamed, enterprise business reportedly accrued a US$500 million bill for Anthropic’s Claude AI in a single month after giving employees unrestricted access and uncapped usage of API tokens for agentic AI workflows.

The economics of AI are also being shaped by supply-side constraints.

Chandrasekaran notes that generative AI was a ‘consumerisation phenomenon’. Tools used in personal lives have moved into the enterprise. The companies making those tools have an incentive to continue to serve the consumer market, because it’s a big brand factor and a pull factor for them into the enterprise.

Vendors prioritised growth and adoption, offering relatively low-cost access in order to build user bases – and the all-important data advantages: the more users, the more data to train models on, and the better the model becomes.

“We were getting access to AI often at a subsidised rate, and we were often able to just purchase it with a per seat approach,” Williamson says.

That, however, was the old world. Williamson notes that things started to shift a few weeks ago.

“There are a few factors that are making it really hard for them to continue to subsidise AI,” Chandrasekaran says.

Alongside the token-maxxing trend, he points to the rise of AI agents. Unlike AI chatbots, which are more asynchronous workflows, AI agents have more autonomy and are able to make sequential or parallel requests to AI models running in the back end, so those models are hit with far more requests. Goldman Sachs has estimated agentic AI could see token use increase by over 24 times by 2030.
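A rough Python sketch shows why chained agent requests consume so many more tokens than a single chatbot exchange. It assumes, purely for illustration, that the agent resends its full accumulated history on every step; the function names and token counts are hypothetical, and real agent frameworks may trim or cache context.

```python
def chat_tokens(prompt_tokens: int, reply_tokens: int) -> int:
    """One chatbot exchange: a single prompt in, a single reply out."""
    return prompt_tokens + reply_tokens


def agent_tokens(prompt_tokens: int, reply_tokens: int,
                 tool_result_tokens: int, steps: int) -> int:
    """Sequential agent run, assuming the whole history is resent each step."""
    total = 0
    context = prompt_tokens
    for _ in range(steps):
        # Billed on this step: the full context going in, plus the reply coming out.
        total += context + reply_tokens
        # The history grows by the reply and by whatever the tool returned.
        context += reply_tokens + tool_result_tokens
    return total


# Illustrative numbers only (assumptions).
single = chat_tokens(prompt_tokens=1_000, reply_tokens=200)
agent = agent_tokens(prompt_tokens=1_000, reply_tokens=200,
                     tool_result_tokens=300, steps=10)
print(single, agent, round(agent / single, 2))
```

With these made-up figures a ten-step agent run uses 34,500 tokens against 1,200 for the single exchange, because each step pays again for everything that came before it.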

Compute has also moved from surplus to scarcity, with construction of many of the data centres planned to support the AI boom now delayed – a shift with direct implications for pricing.

Organisations that previously relied on predictable, low-cost access to AI are now facing the prospect of rising and variable costs tied to usage patterns, as vendors such as GitHub, with Copilot, start transitioning to token-based – or consumption – pricing, and Anthropic and OpenAI move to pay-as-you-go models, charging business users for compute resources.

GitHub’s move prompted plenty of Reddit discussion and angst, with one user noting their bill would be going from US$25/month to US$750/month.

“Suddenly enterprises are worried because they don’t know what their downside is with the token-based pricing model because they don’t have tools, they don’t have instrumentation to really measure that token usage within the organisation,” Chandrasekaran says. “So there is a lot of worry, and rightfully so, around what the changing consumption pattern, the change in pricing model really means for long-term TCO of AI within the enterprise.”

Adds Williamson: “AI’s value is hitting the cost line… not the revenue line.”

While AI-driven productivity gains are widely acknowledged, many organisations are still struggling to quantify those gains in financial terms – particularly when it comes to revenue growth.

“Everyone’s using AI and feeling more productive, but very few can yet measure AI lifting revenue,” she adds.

As a result, AI is often being assessed through a cost lens, rather than as a driver of top-line growth.

At the same time, token-based pricing means those costs are becoming more visible – and, in many cases, harder to control.

For CFOs and technology leaders, this is creating a new area of focus: managing token consumption.

Token usage can increase rapidly as organisations adopt more advanced AI capabilities, particularly those involving multiple chained interactions or automated workflows.

Even relatively simple use cases can drive higher-than-expected costs if usage scales across teams or business units, making cost visibility and measurement critical.
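As a sketch of what basic instrumentation might look like, the Python below keeps a per-team ledger of token usage and flags when spending passes a monthly budget. The `TokenLedger` class, the flat price per million tokens and the team names are all hypothetical; a real deployment would more likely draw on usage data reported by the vendor.

```python
from collections import defaultdict


class TokenLedger:
    """Hypothetical per-team token ledger with a simple budget check."""

    def __init__(self, usd_per_million_tokens: float, monthly_budget_usd: float):
        self.usd_per_million_tokens = usd_per_million_tokens
        self.monthly_budget_usd = monthly_budget_usd
        self._tokens = defaultdict(int)  # team name -> tokens used this month

    def record(self, team: str, tokens: int) -> None:
        """Log tokens consumed by one request against the team that made it."""
        self._tokens[team] += tokens

    def cost(self, team: str) -> float:
        """Month-to-date spend for one team (0.0 for a team with no usage)."""
        return self._tokens.get(team, 0) * self.usd_per_million_tokens / 1_000_000

    def total_cost(self) -> float:
        """Month-to-date spend across all teams."""
        return sum(self._tokens.values()) * self.usd_per_million_tokens / 1_000_000

    def over_budget(self) -> bool:
        return self.total_cost() > self.monthly_budget_usd


# Illustrative numbers only (assumptions).
ledger = TokenLedger(usd_per_million_tokens=10.0, monthly_budget_usd=50.0)
ledger.record("support", 2_000_000)
ledger.record("engineering", 4_000_000)
print(ledger.cost("support"), ledger.cost("engineering"), ledger.over_budget())
```

Neither team looks alarming on its own in this example, but together they exceed the assumed budget, which is the kind of cross-team scaling effect that is hard to see without measurement.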

Engineering for efficiency

As token costs rise, organisations are being pushed to adopt more disciplined approaches to AI deployment.

One key lever is efficiency in model selection and system design.

“You don’t always need to use the best, most token-hungry AI models,” Williamson notes, highlighting the importance of matching model capability to use case.

“This is a great moment to start thinking about what models we’re using and when.”
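One way to picture matching models to use cases is a simple router that sends routine tasks to a cheaper model. The sketch below is illustrative only: the model names, prices and task categories are invented, and deciding which tasks are truly "simple" is the hard part in practice.

```python
# Hypothetical price card: US dollars per million tokens.
MODEL_PRICE = {"small": 0.5, "large": 10.0}

# Assumed set of task types a smaller model can handle adequately.
SIMPLE_TASKS = {"classify", "extract", "summarise"}


def pick_model(task_kind: str) -> str:
    """Send routine tasks to the small model, everything else to the large one."""
    return "small" if task_kind in SIMPLE_TASKS else "large"


def workload_cost(tasks, route: bool = True) -> float:
    """Total cost of (task_kind, tokens) pairs, with or without routing."""
    total = 0.0
    for kind, tokens in tasks:
        model = pick_model(kind) if route else "large"
        total += tokens * MODEL_PRICE[model] / 1_000_000
    return total


# Illustrative workload: eight routine jobs and two complex ones, 1M tokens each.
tasks = [("classify", 1_000_000)] * 8 + [("plan", 1_000_000)] * 2
print(workload_cost(tasks, route=False), workload_cost(tasks, route=True))
```

Under these assumptions the same workload costs US$100 on the large model alone and US$24 with routing, since most of the volume never needed the expensive model.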

Until now, most organisations’ approach has been to switch on the model provided in whatever tech stack they have access to, but Williamson notes that the use of open-weight models – effectively open-source models you can get without paying normal fees – changes the cost structure, though they do require having hardware.
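The trade-off between paying per token and paying for hardware can be framed as a break-even calculation. The Python sketch below is a simplification under stated assumptions: a fixed monthly hardware cost, a flat API price and a flat running cost for self-hosting, all with invented figures. It ignores staffing, utilisation and quality differences between models.

```python
from typing import Optional


def break_even_million_tokens(hardware_usd_per_month: float,
                              api_usd_per_million: float,
                              self_host_usd_per_million: float) -> Optional[float]:
    """Monthly volume (in millions of tokens) above which self-hosting is cheaper.

    Returns None when self-hosting never pays off, i.e. when its running cost
    per token is at least the API price.
    """
    saving_per_million = api_usd_per_million - self_host_usd_per_million
    if saving_per_million <= 0:
        return None
    return hardware_usd_per_month / saving_per_million


# Illustrative numbers only (assumptions).
volume = break_even_million_tokens(hardware_usd_per_month=6_000.0,
                                   api_usd_per_million=10.0,
                                   self_host_usd_per_million=2.0)
print(volume)
```

With these made-up inputs the hardware pays for itself only above 750 million tokens a month, which is why the decision depends heavily on the use case and its volume.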

“There are certain use cases where it might actually make sense to switch out the model and switch out the approach and get the CFOs and engineers together to really come up with a smart plan for how to do AI efficiently,” she adds.

Optimising workflows – reducing unnecessary queries and improving system design – can have a significant impact on token usage.

Ultimately, tokenomics represents a shift in how organisations think about AI – from a relatively predictable tech investment to a dynamic, consumption-driven service, requiring a new level of financial discipline.

“Get clear and understand those token costs, how the world of token costs is shifting and alternative methods organisations have up their sleeve to be able to remedy the ills of the token cost burden,” Williamson says.

“Learn about token costs of AI and how to do it efficiently. There’s always going to be a cost-benefit analysis to decide to go forward or not with AI, and if the costs cannot be managed, then it’s going to be a non-starter.”

For many organisations, that means building closer alignment between finance and technology teams, improving measurement frameworks, and developing a clearer understanding of how AI usage translates into business value.

Standards emerge as costs rise

The Linux Foundation’s move to establish the Tokenomics Foundation reflects the growing importance of these issues.

As enterprises navigate shifting pricing models, rising infrastructure costs and evolving usage patterns, the need for standardisation and best practice is becoming more acute.

By focusing on benchmarks and economic frameworks, the initiative aims to bring greater transparency and consistency to how AI costs are measured and managed.

“Measuring and benchmarking token efficiency across different models and vendors is critical to how organisations make business decisions, but until now, there was no neutral home to develop the standards needed to measure token economics transparently across the entire supply chain,” says Jim Zemlin, Linux Foundation CEO. “The Tokenomics Foundation provides that neutral home, ensuring these standards remain open and community-driven.”

For enterprises, that could help reduce uncertainty – and provide a clearer path to scaling AI sustainably.