Brief

The Future of Opex in the Agent Economy

AI’s real impact isn’t efficiency; it’s a fundamental rewrite of how decisions are made and work gets done.

By Jue Wang, Anne Hoecker, Ann Bosche, Tamara Lewis, and Peter Bowen


Executive Summary

A widening experience gap is emerging in which a small group using AI extensively is pulling far ahead of the rest who are still experimenting superficially.
Organizational friction is now the primary constraint as legacy processes can’t keep pace with AI’s rapid evolution.
AI’s real impact isn’t just efficiency; it’s easing this friction by reducing the need for coordination and shifting decisions closer to the source.
A change in operating expenses from headcount to tokens is inevitable, but the path will be nonlinear, with cost overlap and organizational disruption.

The friction

The technology is, once again, outrunning every expectation. In recent months, conversations among tech-forward leaders we work with have shifted from optimism to vertigo: “Coding is a solved problem.” “I’ve met my match.” “What’s left for the human to do?”

But this isn’t a split between the chief technology officer (or chief AI officer) and everyone else. A small percentage of people at every level have had the visceral “this is it” moment. They’ve spent real hours with frontier models. They’ve watched an agent (or eight agents) plan, write, test, and deploy a feature set in two hours that would have taken a team three sprints. They’ve seen a finance leader prep for a board-level meeting in an afternoon instead of after weeks of cross-functional coordination. They feel the profound change. The rest (that is, most) are still at “I tried Copilot a few months ago; it didn’t work well.” This isn’t a seniority gap or a training gap; it’s an experiential chasm, and it widens every week.

It’s not just efficiency. Efficiency is the least interesting thing AI can do. The most interesting things are shipping products faster, serving customers you couldn’t reach, and capturing share in markets where speed is the constraint. When a senior engineer goes from idea to working prototype in a day or shows a customer that the change request discussed that morning is already implemented, that’s a capability play, not a cost play. When a sales lead can research an account, prep competitive positioning, and analyze the full book of business for churn risk in an afternoon instead of coordinating across three teams (business line, sales, customer success) and waiting for the quarterly review, that’s a new modality of go-to-market operations. When a corporate development director can model five acquisition scenarios in a day instead of assigning it to a team for three weeks, that’s a fundamentally different decision-making cycle.

The real unlock is a shift left: Decisions are made closer to the source, and the dilution from every handoff is eliminated. Most of what slows organizations down isn’t the work itself; it’s the coordination overhead that exists because humans need it. AI makes that coordination irrelevant. As the need for coordination declines, the requirements for team unit and structure, operating model, decision-making processes, and talent also change.

And here’s what makes this chasm consequential: The people who’ve had the “this is it” moment are increasingly frustrated by everything around them. Every sprint planning meeting, every 14-person deal review, every “let me escalate that to a specialist,” every “can you review my drafts,” every “I need a five-person team for this supply chain project” feels like friction from a previous era. The tension between individual revelation and institutional inertia is the defining undercurrent of enterprise AI in 2026.

Here’s the thing the skeptics keep getting wrong: You don’t need to invest months in becoming a great prompt engineer. AI meets you where you are. Claude locks into your way of thinking in a single conversation. It takes less than a week to get productive in Claude Code CLI or Cursor or Glean. The onramp isn’t the bottleneck. The bottleneck is that most people haven’t stepped onto it.

However, even for the willing, the enterprise deployment reality is brutal. Every new data source needs a new approval, and you’re not even sure who’s in charge. The CRM/ERP connector requires specifying every object and field. Nothing works with that network whitelist/DNS configuration. The pilot was stellar, but where’s the $2 million in unbudgeted funding? The vendor docs are two releases behind. Guess what: We have no product management team to run, maintain, and inspect real agents. None of your governance, IT, or procurement processes were designed for this speed.

The technology moves at the speed of model releases and weekly harnessing improvements. The organization moves at the speed of IT/procurement cycles. Something has to give.

The provocation

This tension has a financial destination. Many technology leaders we work with have started to picture a future operating expenses (opex) mix of 70% to 80% headcount and 20% to 30% token cost. Some version of that image is forming in the minds of every CXO who’s paying attention.

Figure: The cost of agents, tokens, and data could replace 20% to 30% of today’s headcount operating expenses

The issue isn’t whether that’s directionally right; it’s that there is no glide path to get there. This isn’t a smooth migration; it’s a wildly nonlinear path, full of cost overlaps, organizational whiplash, and quarters where the math doesn’t work in an obvious way. Even at 1% to 2% of opex, the token budget problem is real. If you have 10,000 developers, that’s a number that makes a general manager choke midquarter. It’s unbudgeted. It doesn’t fit an existing line item. (We unpack the economics in detail in part II.) But the structural implications don’t wait for the math to settle.

Early data suggests the top 5% or so of users consume more tokens than the other 95% combined. The cost is sometimes concentrated in exactly the people you can’t afford to throttle.
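One way to check whether that concentration holds in your own organization is to compute it from usage logs. The sketch below is a minimal illustration, not a reference implementation: the function name `top_share`, the per-user token totals, and the 20-person sample are all hypothetical, and it assumes you can export total tokens per user from whatever tooling you run.

```python
def top_share(tokens_by_user, top_percent=5):
    """Return the fraction of all tokens consumed by the heaviest users.

    tokens_by_user: dict mapping a user ID to total tokens consumed.
    top_percent: size of the "top" group as a whole-number percentage.
    """
    if not tokens_by_user:
        raise ValueError("no usage data supplied")
    usage = sorted(tokens_by_user.values(), reverse=True)
    # Integer arithmetic avoids float rounding; always count at least one user.
    top_n = max(1, len(usage) * top_percent // 100)
    total = sum(usage)
    if total == 0:
        return 0.0
    return sum(usage[:top_n]) / total


# Hypothetical sample: 19 light users and one heavy user.
usage = {f"user{i}": 1_000 for i in range(19)}
usage["power_user"] = 30_000

share = top_share(usage)
print(f"Top 5% share of tokens: {share:.0%}")
```

If the share comes back above 50%, the top group is consuming more than everyone else combined, and any throttling policy will land first on those users.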

Thought starters for technology leaders

If the opex mix is shifting from headcount to tokens, it forces questions most leadership teams aren’t asking yet (see Figure 2). These aren’t implementation questions. They’re structural.

Before anything else, create the moment

The fastest way to close the experiential chasm is to stop talking about AI and start using it. Pick three to five real business problems with real deliverables, not toy demos: a competitive analysis, a customer segmentation, a product spec, a sales playbook, a pricing model. Give each to one senior person with a state-of-the-art AI tool and two days. Maybe two out of five produce a genuine “this is it” moment. That’s enough. One could be a fluke. Three to five gives you a pattern. What comes back will be the most persuasive business case you’ll ever build, because it won’t be a projection. It’ll be proof.

Next Monday: Identify the problems. Identify the people. Clear their calendars. That's it. Everything else in this piece will land differently once your leadership team has had their own “this is it” moment.

What work actually remains for the human?

Start with what agents can’t do: Formulate the problem when the problem itself is ambiguous. Verify that the output is contextually sound, not just syntactically correct. Take accountability and put a name on the result. Decide what’s worth building. Sell a board, navigate a customer org, build a coalition. Everything else is moving to agents.

Next Monday: Pick two large functions, not just engineering. Map every role to problem formulation, verification, accountability, direction setting, or execution. This isn't a quick exercise; it takes real work to be honest about how much headcount sits in execution. But that number is your exposure, and you need it.
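The mapping exercise can be tallied in a few lines once the roles are classified. The sketch below is only an illustration of the arithmetic: the five categories come from the text above, but the function names, role titles, and headcounts are hypothetical, and the hard part (classifying each role honestly) still has to be done by people.

```python
from collections import defaultdict

# The five categories named in the exercise above.
CATEGORIES = (
    "problem formulation",
    "verification",
    "accountability",
    "direction setting",
    "execution",
)


def headcount_by_category(roles):
    """Sum headcount per category.

    roles: list of (title, category, headcount) tuples.
    """
    totals = defaultdict(int)
    for title, category, headcount in roles:
        if category not in CATEGORIES:
            raise ValueError(f"{title}: unknown category {category!r}")
        totals[category] += headcount
    return dict(totals)


def execution_exposure(roles):
    """Return the fraction of total headcount that sits in execution."""
    totals = headcount_by_category(roles)
    total = sum(totals.values())
    return totals.get("execution", 0) / total if total else 0.0


# Hypothetical function of 40 people.
roles = [
    ("Planning director", "direction setting", 2),
    ("Controller", "accountability", 3),
    ("Senior analyst", "verification", 5),
    ("Business partner", "problem formulation", 4),
    ("Reporting analyst", "execution", 26),
]

exposure = execution_exposure(roles)
print(f"Headcount in execution: {exposure:.0%}")
```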

Does the apprenticeship model survive? (Yes, but accelerated)

This is the question that keeps chief human resources officers and heads of learning and development (L&D) up at night, and most are framing it wrong. The fear is: “If agents do the execution, juniors never build the muscle.” But that assumes learning requires doing. It doesn’t; it requires engagement and repetition.

Medical residents don’t learn by performing surgery unsupervised. They learn by watching, assisting, and then verifying, with a senior surgeon ready to intervene. The agent economy creates the same structure at scale: juniors learn by reviewing, stress-testing, and catching the errors in AI-generated output. The reps per hour go up, not down. A junior analyst who reviews 30 agent-drafted models in a week builds faster instinct than one who drafts three themselves. The inversion isn’t a collapse; it’s an acceleration if you design for it.

Next Monday: Ask your L&D and functional leads: What does learning by verifying look like in practice? Design three pilot rotations in which junior talent is trained on reviewing and stress-testing agent output, not producing from scratch. Measure ramp time against your traditional apprenticeship. We think you’ll be surprised. Juniors will become senior-level much faster than we’re accustomed to seeing.

How do you run a dual operating model without tearing the org apart?

Your best people are already saying: “I don’t need a team; they’re slowing me down.” But you can’t flip 10,000 people into 3-person AI-native pods overnight. The transition means running both models in parallel: existing structure for steady-state, AI-native pods for new initiatives. The pods will ship 5 to 10 times faster. That creates internal political tension most orgs aren’t ready for.

Next Monday: Identify two to three greenfield projects across different functions. Staff each with an AI-native pod: one senior domain expert, two product minds/verifiers. Give them 30 to 60 days and a real business outcome. Protect them from legacy process. Measure velocity against your current structure. Let the data make the case. What you'll find is that the conversation shifts from "how do we do things faster" to "how do we solve the big problems for our customers that were always too expensive to tackle."

What does your talent pipeline look like when teams are 3 people, not 15?

You’re no longer hiring people to do work; you’re hiring people to direct and verify work. That’s a fundamentally different profile: more senior, more opinionated, more vertically integrated. The hiring bar goes up. The junior funnel narrows. And the implications extend well beyond any single company.

Next Monday: Take one team that's 15 today and will be 5 in two years. Write the job descriptions for those 5 people. What you'll find is that every role requires more judgment, more domain instinct, and more customer proximity than what you're hiring for today. That's your new talent bar.

Is your software stack built for human buyers or agent buyers?

For each of your top software-as-a-service (SaaS) contracts, ask one question: Does a human actually need to look at the user interface, or could an agent with API access do the job faster with no dashboard at all? The SaaS model was built on selling seats to humans. Agents don't need seats. They need APIs and usage-based pricing. If headcount shrinks and work migrates to agents, every seat-based contract becomes a legacy tax. The vendors who adapt by exposing robust APIs and shifting to consumption pricing will capture the next wave. This isn’t theoretical. We’re already seeing enterprises renegotiate CRM and IT service management contracts on precisely this basis.

Next Monday: Audit your top 10 SaaS contracts by spending. For each, does the vendor offer a robust API? Is pricing seat-based or usage-based? Could an agent replace the human workflow that justifies the license? Flag every contract where the answer is “yes, within 18 months.” That’s your renegotiation list.
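The audit above reduces to a sort and a filter once the three questions are answered for each contract. The following is a minimal sketch under stated assumptions: the `Contract` fields, the `renegotiation_list` helper, and the vendor names and spending figures are all hypothetical placeholders, not data from any real contract.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    vendor: str
    annual_spend: float               # in your reporting currency
    has_robust_api: bool              # question 1
    pricing: str                      # question 2: "seat" or "usage"
    agent_replaceable_18m: bool       # question 3: "yes, within 18 months"


def renegotiation_list(contracts, top_n=10):
    """Take the top_n contracts by spend and flag those an agent could replace."""
    top = sorted(contracts, key=lambda c: c.annual_spend, reverse=True)[:top_n]
    return [c for c in top if c.agent_replaceable_18m]


# Hypothetical portfolio.
contracts = [
    Contract("Vendor A (CRM)", 1_200_000, True, "seat", True),
    Contract("Vendor B (ITSM)", 800_000, True, "seat", True),
    Contract("Vendor C (design tools)", 300_000, False, "seat", False),
    Contract("Vendor D (data platform)", 950_000, True, "usage", False),
]

for c in renegotiation_list(contracts):
    print(f"{c.vendor}: {c.pricing}-based pricing, API available: {c.has_robust_api}")
```

Keeping the API and pricing answers on each flagged contract gives the renegotiation conversation its starting point: seat-based contracts with a robust API are the most direct candidates for a shift to consumption pricing.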

Coming in part II: Token economics. We unpack the economics of AI tokens and explain why 1% to 2% of headcount cost is already choking budgets, why effective cost per task is stubbornly flat despite falling model prices, and the practical moves to navigate the wildly nonlinear transition from headcount to tokens.

Authors

Jue Wang, Partner, Silicon Valley
Anne Hoecker, Partner, Silicon Valley
Ann Bosche, Partner, San Francisco
Tamara Lewis, Partner, Seattle
Peter Bowen, Partner, Austin

First published in May 2026
