Uber exhausted its entire 2026 artificial intelligence budget by April, just four months into the calendar year. This happened after Anthropic's Claude Code spread across roughly 5,000 engineers faster than the company's finance models had anticipated.
Chief Technology Officer Praveen Neppalli Naga confirmed the overrun to The Information. He said the company was back to the drawing board on its assumptions. Uber's total research and development spend reached $3.4 billion in 2025, up 9 percent year over year. This makes the budget collapse less about scale and more about a pricing model that enterprise finance teams haven't learned how to manage.
The disclosure landed alongside a structural shift from Anthropic itself. On May 13, the company announced that paid Claude subscribers would soon face a separate monthly credit meter for agent tools and third-party harnesses, with that usage billed at full application programming interface rates starting June 15.
The two events describe a single problem. Token-based consumption pricing doesn't behave like the software line items chief financial officers know how to model, and the gap between what engineers consume and what finance teams expect isn't hypothetical.
Uber rolled out Claude Code to its engineering organization in December 2025. Adoption climbed from 32 percent of engineers in February to 84 percent classified as agentic coding users by March. By spring, 95 percent of Uber engineers used artificial intelligence tools monthly, and roughly 70 percent of committed code originated from those tools. About 11 percent of live backend updates were written by agents with no human in the loop, according to Uber's own disclosures.
The numbers behind the spend are what make the story instructive rather than anecdotal. Monthly cost per engineer ranged from $150 to $250 on average, with power users running between $500 and $2,000. Naga himself reported spending $1,200 in a two-hour session during a personal demo. The tool didn't fail, and engineers didn't misuse it. They used it for exactly the workloads it was designed to handle: parallel agent execution, large-scale codebase refactoring, automated test generation, and backend code production.
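As a rough illustration of how those per-engineer figures add up, the sketch below multiplies the reported average range by the roughly 5,000 engineers cited above. The function name is hypothetical, and the assumption that every engineer sits in the average band is a simplification, not a figure from Uber. Power users in the $500 to $2,000 band would push the real total higher.

```python
def monthly_spend_range(engineers, low_per_engineer, high_per_engineer):
    """Return (low, high) monthly spend if every engineer falls in the average band.

    Simplifying assumption: no power users, no idle seats.
    """
    return engineers * low_per_engineer, engineers * high_per_engineer


# Figures from the article: roughly 5,000 engineers, $150 to $250 per engineer per month.
low, high = monthly_spend_range(5_000, 150, 250)

print(f"${low:,.0f} to ${high:,.0f} per month")
print(f"${low * 12:,.0f} to ${high * 12:,.0f} per year")
```

Under that simplification the range works out to $750,000 to $1,250,000 per month, or $9 million to $15 million annualized, before any heavy users are counted.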
From a productivity standpoint, the rollout was a success. From a finance standpoint, it was a runaway. Uber compounded the dynamic by ranking engineers on internal leaderboards based on Claude Code usage. That created a cultural incentive to consume more tokens, which translated directly into faster budget burn. The teams driving adoption weren't the same teams managing the spend, and that organizational gap turned out to be the load-bearing flaw.
Claude Code doesn't price on a per-seat basis. It meters tokens consumed across model calls, which means an engineer running autocomplete suggestions consumes a fraction of what an engineer orchestrating parallel agents across a monorepo will consume. This pricing model gives the vendor unlimited upside on heavy users and gives finance teams almost no forward visibility.
Microsoft has taken the opposite approach with Microsoft 365 Copilot Enterprise, which sells at $30 per user per month with an annual commitment. The price caps the vendor's upside and gives finance teams a flat line item they can multiply by headcount. Anthropic's consumption model offers neither the cap nor the flat line: spend tracks token usage, which finance teams struggle to predict in advance.
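The difference between the two pricing models can be sketched in a few lines. The $30 seat price comes from the article. Everything else here is a hypothetical assumption chosen only to show the shape of the curves: the `Engineer` class, the token volumes, and the flat rate of $10 per million tokens. Real API rates vary by model and by input versus output tokens.

```python
from dataclasses import dataclass


@dataclass
class Engineer:
    name: str
    monthly_tokens_millions: float  # hypothetical usage, in millions of tokens


def seat_cost(engineers, price_per_seat=30.0):
    """Flat per-seat pricing: cost depends only on headcount."""
    return len(engineers) * price_per_seat


def metered_cost(engineers, price_per_million_tokens):
    """Consumption pricing: cost scales with each engineer's token usage."""
    return sum(e.monthly_tokens_millions * price_per_million_tokens for e in engineers)


# Hypothetical team: one autocomplete user, one engineer orchestrating agents.
team = [
    Engineer("autocomplete_user", monthly_tokens_millions=5.0),
    Engineer("agent_orchestrator", monthly_tokens_millions=150.0),
]

HYPOTHETICAL_RATE = 10.0  # dollars per million tokens, an assumed figure

print(f"Per-seat total: ${seat_cost(team):,.2f}")
print(f"Metered total:  ${metered_cost(team, HYPOTHETICAL_RATE):,.2f}")
```

With these assumed inputs the seat model bills the same amount for both engineers, while the metered model bills the agent user thirty times what it bills the autocomplete user. Headcount alone says nothing about the metered total.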
Both models are defensible, and neither is right for every workload. But treating them as interchangeable in a planning cycle is what produced Uber's outcome.
GitHub is moving Copilot to a credit-based system on June 1, and analysts cited by InfoWorld expect most vendors to introduce separate consumption pools for agents and tool use over the next 12 to 24 months.
The industry's standard response to consumption-cost stories is that artificial intelligence pays for itself in productivity gains. Uber's case complicates that argument. The marginal productivity gain from a senior engineer running agentic workflows has to clear a much higher token-cost hurdle than the gain from an engineer running autocomplete.
Five-to-twenty-fold increases in per-developer consumption are now documented in agentic mode, and no public benchmark shows a matching multiplier on output value. Productivity savings also don't show up in the same line item as artificial intelligence cost, which means finance teams can't net them out inside a quarterly review.
There are also operational limits that make the simple cost-versus-output framing incomplete. Only 43 percent of organizations have formal artificial intelligence governance policies, according to survey data cited in coverage of the Uber overrun, and only 21 percent have mature agentic governance. Most enterprises don't yet apply to artificial intelligence tooling the spending controls that DevOps teams routinely apply to cloud compute.
Those controls include per-engineer caps, real-time monitoring of token consumption, and budgetary alerts before overrun rather than after. Uber deployed Claude Code organization-wide without them, and the result was visible within a quarter.
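A minimal sketch of what such controls might look like in code follows. It is an illustration of the three ideas named above (a per-engineer cap, a running spend total, and an alert that fires before the budget is exhausted), not a description of any vendor's or Uber's tooling. The `SpendGuard` class, its parameters, and all dollar figures are hypothetical.

```python
from collections import defaultdict


class SpendGuard:
    """Tracks spend per engineer and for the team against monthly limits."""

    def __init__(self, per_engineer_cap, team_budget, alert_fraction=0.8):
        self.per_engineer_cap = per_engineer_cap
        self.team_budget = team_budget
        self.alert_fraction = alert_fraction  # alert once this share of budget is spent
        self.spend = defaultdict(float)       # engineer -> dollars spent this month
        self.alerted = False

    @property
    def team_spend(self):
        return sum(self.spend.values())

    def request(self, engineer, cost):
        """Approve and record a charge, or refuse it if it would exceed the engineer's cap."""
        if self.spend[engineer] + cost > self.per_engineer_cap:
            return False  # refused before the money is spent
        self.spend[engineer] += cost
        threshold = self.alert_fraction * self.team_budget
        if not self.alerted and self.team_spend >= threshold:
            self.alerted = True  # fire the early warning only once
            print(f"ALERT: team spend ${self.team_spend:,.2f} has reached "
                  f"{self.alert_fraction:.0%} of the ${self.team_budget:,.2f} budget")
        return True


# Hypothetical limits: $500 per engineer, $1,000 for the team.
guard = SpendGuard(per_engineer_cap=500, team_budget=1_000)
first = guard.request("alice", 400)   # approved
second = guard.request("alice", 200)  # refused: would put alice at $600
third = guard.request("bob", 450)     # approved, and crosses the 80 percent alert line
```

The design choice worth noting is that both checks run at request time. The cap refuses a charge before it lands, and the alert fires while some budget remains, which is the "before overrun rather than after" property described above.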
The Uber experience produces a short list of practical implications for finance leaders watching their own engineering organizations adopt agentic coding tools. The first is that pilot economics don't predict scale economics for consumption-priced tools, because pilots run on a few engineers using autocomplete while production runs on whole teams delegating multistep workflows to agents.
The second is that incentive structures matter as much as pricing. Leaderboards and adoption targets drive token consumption, and any rollout that rewards usage without capping it should be modeled as an unbounded liability until proven otherwise.
The third is the structural one. Anthropic's June 15 credit-pool change signals that subsidized programmatic usage on subscription plans is ending across the industry. Enterprises that built their forecasts on flat-rate Claude Code economics will see their effective unit costs rise, and the same logic will apply to other vendors as they follow Anthropic's lead.
Procurement teams that want predictability will need to negotiate committed-spend agreements at fixed rates rather than ride consumption pricing, and the leverage they bring to those conversations will depend on whether their engineering organizations have any usage caps in place at all.
Uber isn't slowing its artificial intelligence push. Naga plans to test OpenAI's Codex alongside Claude Code, and the long-term vision he describes is one where agent engineers handle coding, testing, and deployment with humans acting as orchestrators.
That direction is consistent across major engineering organizations now adopting these tools. The open question for boards is not whether to deploy them but whether finance functions have any visibility into what they will cost when the engineers stop holding back.
Key Facts:
- Uber spent its entire 2026 AI budget in four months
- Roughly 5,000 engineers used Anthropic's Claude Code
- Monthly cost per engineer ranged from $150 to $250 on average
- Power users spent between $500 and $2,000
- 95 percent of Uber engineers used AI tools monthly by spring
- About 11 percent of live backend updates were written by agents with no human in the loop