Is the billable hour dead because of AI?

For most US service agencies, the billable hour is dying as a sales mechanism, not as an internal cost-tracking tool. WPP, S4 Capital's Monks, FIG, and 72andSunny each moved toward output, subscription, or modular pricing in 2026, and a Deloitte survey cited by Consulting Success found 67% of consulting buyers now prefer fixed-fee over time-and-materials, up from 41% three years earlier.

What is value-based pricing for a marketing agency?

Value-based pricing is a model where the agency fee is anchored to the business outcome produced (qualified pipeline, revenue lifted, hires placed, conversion delivered) rather than hours worked. For a 30-person agency, the three practical implementations are a retainer of outcomes (monthly fee plus KPI accelerator), an output menu (fixed-price productized deliverables), and an AI subscription (annual bundle of access, capability, and output).

How do agencies charge for AI?

Agencies use four observed patterns for token and inference costs, per Digiday reporting: pass-through as a line item (Merge, Big Spaceship), absorb into the master fee (RPA, Anomaly), bulk-negotiate with model providers and pass through without markup (Pencil, Brandtech), or run a bulk-buying agreement with itemized client billing (Lerma). Strategy and human judgment are priced separately under one of the three commercial models above.

How do agencies transition from hourly billing to outcome-based pricing?

A 30-seat agency can run the migration in 12 weeks: weeks 1 to 3 audit utilization and segment the client portfolio; weeks 4 to 6 pick the right model per segment and draft contract scaffolds; weeks 7 to 9 shadow-price upcoming SOWs and pilot a live contract with two willing clients; weeks 10 to 12 migrate the top tier and rewrite the sales script and compensation plan. The IPA Pricing Playbook five-axis test (risk appetite, flexibility, scope stability, commercial fluency, outcome measurability) determines which model fits each account.

How much should agencies charge when AI cuts delivery time by 60%?

When AI compresses billable delivery hours by 60%, hourly-billed revenue collapses by roughly the same amount. A 30-FTE US agency generating $7.35M at a $175 blended rate drops toward $2.94M on the same staffing if nothing else changes. The new pricing floor must clear roughly $4.4M of replacement revenue through outcome fees, fixed-price outputs, or subscription bundles.

What is a retainer of outcomes?

A retainer of outcomes is a monthly agency fee with a floor (covering team and tooling) plus a performance accelerator tied to a measurable KPI such as qualified leads, MQL-to-SQL rate, revenue, or retention. It replaces the hourly retainer by uncoupling fee from time and coupling it to result, while still giving the agency predictable cash flow.

Value-Based Pricing for AI Agencies: Repricing a 30-Person Shop in 2026

By Jorge Castro, Founder, Kreante. Published 18 May 2026.

TL;DR

A 30-seat US agency billing $175/hour at 1,400 billable hours per FTE generates approximately $7.35M per year. If AI compresses delivery time by 60%, hourly-billed revenue falls to roughly $2.94M on the same staffing. That $4.4M gap is what value-based pricing has to close.
The 2026 agency pricing playbook is three contract shapes, not one: a retainer of outcomes (KPI-tiered fee with floor plus accelerator), an output menu (fixed-price productized deliverables, FIG-style), and an AI subscription (S4 Capital Monks-style annual bundle with token pass-through). Each fits a different client segment.
Token costs are not the new billable hour. Four real pass-through patterns are emerging at named agencies per Digiday [source: https://digiday.com/marketing/agencies-grapple-with-economics-of-a-new-marketing-currency-the-ai-token/]: pass-through as a line item (Merge, Big Spaceship), fully absorbed (RPA, Anomaly), bulk-negotiated with no markup (Pencil, Brandtech), or bulk-bought with itemized billing (Lerma).
A 12-week migration is sufficient for a 30-seat shop: weeks 1 to 3 utilization audit and portfolio segmentation, 4 to 6 model selection and contract drafting, 7 to 9 shadow-priced pilots, 10 to 12 Tier 1 migration plus sales script and comp plan rewrite.
The IPA Pricing Playbook’s five-axis test (risk appetite, flexibility, scope stability, commercial fluency, outcome measurability) is the cleanest framework to decide which contract belongs on which account [source: https://ipa.co.uk/news/pricing-playbook].

Key Takeaways

The billable hour is dying as a sales mechanism, not as a cost-tracking tool. Agencies still track hours internally; they no longer sell them externally.
Three commercial structures replace the hour: retainer of outcomes, output menu, AI subscription.
Token economics are a separate decision from the commercial model. Pick a pass-through, absorb, bulk-negotiate, or bulk-buy pattern explicitly.
Senior judgment is now the priced asset. Cobbold’s IPA framing: “As machines absorb execution, human judgement becomes the scarcest asset” [source: https://creative.salon/articles/features/ipa-jason-cobbold-pricing-playbook].
The migration takes 12 weeks for a 30-FTE shop, not 12 months.

The $4.4M revenue cliff hiding in a 30-seat P&L

The revenue cliff is the gap between hourly-billed capacity and the new AI-compressed delivery reality. Here is the math most agency owners avoid running.

Thirty full-time billable people, 1,400 billable hours per year each (a defensible utilization assumption for senior-leaning US agencies, lower for production-heavy shops), at a $175 blended rate equals $7.35M of top-line capacity. That is the number on which the office lease, the owner draw, and the client services team are built.

Now compress the delivery side of those hours by 60% because Anthropic’s Claude wrote the first draft, Midjourney generated the campaign visuals, and an automated workflow handled the deliverable build. The naive math: 1,400 hours per FTE collapses to 560, and the same 30 people now invoice $2.94M. The agency must replace $4.4M of revenue without adding headcount, or the margin profile collapses.
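
The arithmetic above can be reproduced in a few lines. This is a minimal sketch using only the figures already stated (30 FTEs, 1,400 billable hours, $175 blended rate, 60% compression); the function names are illustrative, and the model deliberately ignores everything except hours times rate.

```python
def hourly_revenue(ftes: int, billable_hours: float, rate: float) -> float:
    """Annual hourly-billed revenue: headcount x billable hours x blended rate."""
    return ftes * billable_hours * rate


def revenue_cliff(ftes: int, billable_hours: float, rate: float, compression: float):
    """Return (before, after, gap) when a share of billable hours disappears.

    `compression` is the fraction of billable delivery hours removed by AI (0 to 1).
    """
    if not 0 <= compression <= 1:
        raise ValueError("compression must be between 0 and 1")
    before = hourly_revenue(ftes, billable_hours, rate)
    after = hourly_revenue(ftes, billable_hours * (1 - compression), rate)
    return before, after, before - after


# Figures from the text: 30 FTEs, 1,400 hours, $175/hour, 60% compression.
before, after, gap = revenue_cliff(ftes=30, billable_hours=1400, rate=175, compression=0.60)
print(f"before ${before:,.0f} | after ${after:,.0f} | gap ${gap:,.0f}")
```

Swapping in your own headcount, utilization, and blended rate gives the size of the gap your new contracts have to cover.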

This is not a thought experiment. MediaPost reports that “in the ‘golden age’ of advertising, agency profit margins hovered around 30%; today, the worldwide average is a mere 10%” [source: https://www.mediapost.com/publications/article/413193/billable-hours-are-dead-ai-killed-them-heres-ho.html]. Practitioners in the MediaPost comment thread dispute the 30% golden-age figure, putting it closer to 12% to 16% historically. The direction of travel, not the absolute starting point, is what matters. The same MediaPost article notes that “the average creative is churning out nearly five times the output for the same or less compensation than they did a decade ago” [source: https://www.mediapost.com/publications/article/413193/billable-hours-are-dead-ai-killed-them-heres-ho.html]. Output rose 5x. Margin fell two-thirds. The billable hour did that.

If you are a 30-person founder reading this, the question is not whether your model is breaking. The question is whether you reprice before or after the next renewal cycle.

Why the billable hour stopped working

The billable hour stopped working in 2026 because three independent forces converged on it at the same time.

First, the unit of value moved. Greg Castro, VP of Global Partnerships at Mobvista, told Benzinga: “Billable hours have always punished agencies that work fast and produce value” [source: https://www.benzinga.com/news/topics/26/05/52608863/the-death-of-the-billable-hour-how-ai-is-killing-traditional-ad-agency-pricing]. When AI multiplies speed without multiplying cost, that punishment becomes a structural margin leak.

Second, buyers stopped accepting time-and-materials as a default. A Deloitte study cited by Consulting Success found 67% of consulting buyers now prefer fixed-fee arrangements, up from 41% three years prior [source: https://www.consultingsuccess.com/how-ai-exposed-the-fatal-flaw-in-billable-hour-consulting]. The same source reports that firms shifting to value-based pricing see an average 43% fee increase in year one across a sample of more than 1,000 firms [source: https://www.consultingsuccess.com/how-ai-exposed-the-fatal-flaw-in-billable-hour-consulting]. The buy side is not waiting for agencies to catch up.

Third, the holding companies moved publicly. WPP, parent of Ogilvy and VML, announced in 2026 it was shifting away from hours-based billing toward output- and return-based pricing. S4 Capital’s Monks went further: Wesley ter Haar told Digiday the goal is “about 25% of our revenue running in this [subscription] model” by year-end 2026, with one client signed since late 2025 and two more expected to close before the end of Q1 [source: https://digiday.com/media-buying/the-billable-hour-does-not-allow-for-any-meaningful-innovation-s4-capital-builds-subscription-model-for-the-ai-age/]. His framing of why is what every agency owner should pin to the wall: “The billable hour does not allow for any meaningful innovation, which clients understand” [source: https://digiday.com/media-buying/the-billable-hour-does-not-allow-for-any-meaningful-innovation-s4-capital-builds-subscription-model-for-the-ai-age/].

Jason Cobbold, who chairs the IPA Commercial Leadership Group and co-authored the IPA Pricing Playbook published in March 2026, said the same thing more diplomatically: “Agencies know the ground is shifting under them, but too often they fall back on familiar models because they feel safe” [source: https://ipa.co.uk/news/pricing-playbook]. His sharper line, in Creative Salon: “As machines absorb execution, human judgement becomes the scarcest asset” [source: https://creative.salon/articles/features/ipa-jason-cobbold-pricing-playbook].

That is the repricing thesis in one sentence. Stop selling execution by the hour. Start pricing judgment by the outcome.

Three pricing models that replace the billable hour in 2026

For a 30-person services agency, three commercial structures are doing real work in 2026. Not five categories from a vendor blog. Three operational contract shapes that can be signed today.

Retainer of outcomes

A retainer of outcomes is a monthly fee structured in two layers: a floor that covers the team, tooling, and minimum service level, plus an accelerator tied to a single measurable KPI (qualified pipeline, MQL-to-SQL conversion, revenue lifted, retention, hires made). The accelerator is paid quarterly or per milestone, capped to protect the client’s downside and the agency’s forecasting.

This is the structure that preserves recurring revenue while uncoupling fee from time. It works for clients with stable, measurable, attributable outcomes (paid media, lifecycle marketing, SDR-as-a-service, recruitment marketing). It does not work where attribution is contested or where the agency does not control the levers.
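
As a rough illustration of the floor-plus-accelerator shape, the sketch below computes one quarter's fee. The percentage-of-KPI formula and every dollar figure are assumptions made for the example, not terms taken from any agency named here.

```python
def quarterly_retainer_fee(
    monthly_floor: float,
    kpi_value: float,
    accelerator_rate: float,
    cap: float,
) -> float:
    """Quarterly fee = three months of floor + capped performance accelerator.

    kpi_value is the agreed KPI expressed in dollars for the quarter
    (for example attributable revenue); accelerator_rate is the share paid on it.
    """
    if min(monthly_floor, kpi_value, accelerator_rate, cap) < 0:
        raise ValueError("all inputs must be non-negative")
    performance_fee = min(accelerator_rate * kpi_value, cap)  # cap protects the client
    return 3 * monthly_floor + performance_fee


# Hypothetical account: $20k/month floor, 5% of attributable revenue, capped at $15k.
print(quarterly_retainer_fee(20_000, kpi_value=200_000, accelerator_rate=0.05, cap=15_000))
print(quarterly_retainer_fee(20_000, kpi_value=400_000, accelerator_rate=0.05, cap=15_000))
```

In the second call the uncapped accelerator would exceed $15,000, so the cap binds and the client's downside stays fixed.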

Output menu

An output menu is a published catalog of fixed-price productized deliverables: a landing page is $X, a 30-day paid social sprint is $Y, a brand sprint is $Z. According to MediaPost, FIG “rebuilt its economics by strictly separating price from staffing. Time is now used only as an internal check” [source: https://www.mediapost.com/publications/article/413193/billable-hours-are-dead-ai-killed-them-heres-ho.html]. The same source reports 72andSunny moved “away from FTE-based pricing toward a modular product menu with fixed fees” [source: https://www.mediapost.com/publications/article/413193/billable-hours-are-dead-ai-killed-them-heres-ho.html].

The output menu is the right starter model for project-heavy agencies (design studios, dev shops, content agencies). It is the easiest to sell because the buyer recognizes the shape. It scales because every deliverable has a known cost basis. It benefits most from AI productivity gains, because margin expansion flows directly to the agency without reopening the commercial conversation.

AI subscription

An AI subscription is an annual bundle in which one predictable fee covers a defined scope of access (the team), capability (the AI workflows the agency has built), and output (whatever the models and team can produce within an agreed envelope). Token costs are handled either with an explicit pass-through clause or a generous fixed allowance plus overage thresholds.

This is the S4 Capital Monks shape. MediaPost describes it as combining “talent, technology, and continuous improvement into one predictable annual fee” [source: https://www.mediapost.com/publications/article/413193/billable-hours-are-dead-ai-killed-them-heres-ho.html]. Per Wesley ter Haar in Digiday, “something that once delivered 50 of these a month might reach 70 as models get better and pipelines get smarter” without triggering a contract rewrite [source: https://digiday.com/media-buying/the-billable-hour-does-not-allow-for-any-meaningful-innovation-s4-capital-builds-subscription-model-for-the-ai-age/]. The client gets compounding output. The agency keeps the compounding margin as the cost curve falls under it.

Comparison: which pricing model fits which client

This table maps the three contract shapes to the client profile, revenue characteristics, AI-margin behavior, and primary risk so a 30-FTE agency can select per account.

Model	Best client profile	Named examples	Revenue shape	Where AI margin goes	Main risk
Retainer of outcomes	Stable account, attributable KPI, mature buyer	Performance-led shops, recruitment marketing teams	Predictable floor + variable accelerator	Shared via accelerator tiers	KPI dispute, attribution drift
Output menu	Project-heavy, transactional buyer, design/dev/content	FIG, 72andSunny	Lumpy but high-margin per unit	Captured fully by agency	Scope creep, no recurring lock-in
AI subscription	Sophisticated buyer with continuous output need	S4 Capital’s Monks	Smooth annual revenue	Captured by agency until renegotiation	Token cost spikes, scope balloon

Use the IPA Pricing Playbook five-axis test to pick: appetite for risk, levels of flexibility, stability of scope, commercial fluency, and confidence and measurability of outcome [source: https://ipa.co.uk/news/pricing-playbook]. A high-trust client with a clear KPI and a long horizon points to retainer of outcomes. A transactional, scope-volatile relationship points to the output menu. A high-volume, capability-led account points to subscription.
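
One way to make the selection repeatable is to score each account from 1 to 5 on the five axes and apply a simple rule. The sketch below is an assumption throughout: the 1-to-5 scale, the thresholds, and the rule order are invented for illustration and are not the IPA Pricing Playbook's own scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisScores:
    """Scores from 1 (low) to 5 (high) on the five axes named in the text."""
    risk_appetite: int
    flexibility: int
    scope_stability: int
    commercial_fluency: int
    outcome_measurability: int

    def __post_init__(self) -> None:
        for name, value in vars(self).items():
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")


def suggest_model(s: AxisScores) -> str:
    """Map axis scores to a contract shape using hypothetical thresholds."""
    # Clear KPI, stable scope, and some appetite for shared risk.
    if s.outcome_measurability >= 4 and s.scope_stability >= 4 and s.risk_appetite >= 3:
        return "retainer of outcomes"
    # Commercially fluent buyer who wants flexible, continuous output.
    if s.commercial_fluency >= 4 and s.flexibility >= 4:
        return "AI subscription"
    # Transactional or scope-volatile relationships default to fixed-price outputs.
    return "output menu"


print(suggest_model(AxisScores(4, 3, 5, 3, 5)))
print(suggest_model(AxisScores(2, 2, 1, 2, 2)))
```

Treat the result as a starting point for the account conversation, not a verdict; the thresholds should be tuned against accounts you already understand well.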

Token economics for sub-$10M agencies

Token economics is the new variable-cost line item in an agency P&L driven by inference costs from large language models. According to Digiday, a single recent Coca-Cola AI ad campaign required 70,000 prompts and millions of tokens [source: https://digiday.com/marketing/agencies-grapple-with-economics-of-a-new-marketing-currency-the-ai-token/]. For a holdco that is a rounding error. For a 30-seat shop, that is the difference between a profitable engagement and a loss.

Four real billing patterns are emerging at named agencies, per Digiday. Which one an agency adopts is a contract decision in its own right [source: https://digiday.com/marketing/agencies-grapple-with-economics-of-a-new-marketing-currency-the-ai-token/]:

Pattern	Named agencies	How it works	Best when
Pass-through as a line item	Merge, Big Spaceship	Tokens itemized on the invoice like production costs	Client is sophisticated and inference is material (>10% of fee)
Absorb into the fee	RPA, Anomaly	Inference baked into master fee, agency eats variance	Inference is low share of fee and predictable
Bulk-negotiate, no markup	Pencil, Brandtech	Volume discounts negotiated directly with Anthropic and OpenAI, passed through at cost	Agency has scale and wants pricing trust
Bulk-buy with itemized billing	Lerma	Agency buys tokens in bulk, bills client line-by-line	Mixed client base with different consumption profiles

For a sub-$10M agency, the practical rule of thumb is: pass through above 10% of fee, absorb below it, and renegotiate provider contracts the moment any single client crosses six figures in projected annual inference. Embedding token cost silently inside the retainer is the path of least friction with the buyer and the highest variance for the agency. Choose it on purpose, not by default.
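
The rule of thumb translates directly into a per-client check. This is a minimal sketch: the 10% threshold comes from the text, "six figures" is read as $100,000 or more, and the function name and sample figures are hypothetical.

```python
def token_billing_action(
    annual_fee: float,
    projected_annual_inference: float,
    passthrough_share: float = 0.10,
    renegotiate_at: float = 100_000,
) -> tuple[str, bool]:
    """Return (billing pattern, whether to renegotiate provider contracts).

    Pass through when inference exceeds `passthrough_share` of the fee,
    absorb otherwise; flag renegotiation once projected annual inference
    for this client reaches `renegotiate_at`.
    """
    if annual_fee <= 0:
        raise ValueError("annual_fee must be positive")
    share = projected_annual_inference / annual_fee
    pattern = "pass-through" if share > passthrough_share else "absorb"
    return pattern, projected_annual_inference >= renegotiate_at


# Hypothetical clients on a $600k annual fee.
print(token_billing_action(600_000, 30_000))   # inference is 5% of fee
print(token_billing_action(600_000, 120_000))  # inference is 20% of fee
```

Running this across the client list each quarter makes the choice explicit rather than something that drifts by default.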

A note on the experience curve. Anthropic’s March 2026 Economic Index report found that “people in this higher-tenure group have a 10% higher success rate” on Claude conversations [source: https://www.anthropic.com/research/economic-index-march-2026-report]. Once Anthropic’s researchers controlled for task type and geography the effect moderated to roughly 4 percentage points, but the direction held [source: https://www.anthropic.com/research/economic-index-march-2026-report]. The pricing implication: who runs the prompt is now a real billable input. A senior operator with 200 hours of Claude experience produces materially better output than a junior with the same access. Price the operator, not the seat.

The same Anthropic report shows the average task value on Claude.ai dropped from $49.30 to $47.90 as adoption spread into lower-wage work [source: https://www.anthropic.com/research/economic-index-march-2026-report]. AI is not just compressing delivery for high-end tasks. It is widening the floor of what any business can self-serve. That is structural pressure on the bottom end of any agency service line. Price the top end accordingly.

The 12-week migration plan

The 12-week migration plan is the operational sequence a 30-FTE US agency runs to transition from hourly billing to value-based pricing without losing accounts or margin. Three weeks per phase, four phases.

Weeks 1 to 3: utilization audit and portfolio segmentation. Pull the last twelve months of timesheets out of Harvest or Toggl. Calculate true billable utilization per FTE and per account. Tag every active client into one of three buckets: Tier 1 (top 20% of revenue, strategic, willing to renegotiate), Tier 2 (middle 50%, stable, contract-bound), Tier 3 (bottom 30%, transactional, often unprofitable on a fully loaded basis).
Weeks 4 to 6: model selection and contract drafting. Score each Tier 1 account on the IPA Pricing Playbook five-axis test [source: https://ipa.co.uk/news/pricing-playbook]. Draft one retainer-of-outcomes scaffold, one output menu, and one AI subscription, with token clauses for each. Validate pricing floors against the new utilization baseline calculated in phase 1.
Weeks 7 to 9: shadow-pricing pilot. Take the next five SOWs the team scopes for AI-augmented delivery and shadow-price them under all three new models in parallel with the current hourly quote. Do not show the client. The purpose is to calibrate the agency’s internal pricing logic against real demand. Pilot the live contract on two willing Tier 1 accounts, ideally with quarterly KPI review baked in.
Weeks 10 to 12: Tier 1 migration, sales script, comp plan rewrite. Move the rest of the Tier 1 retainers onto the new contract at renewal. Rewrite the sales pitch to lead with outcome and capability, not hours. Rewrite account manager compensation: utilization is no longer a useful incentive when AI compresses delivery. Tie variable comp to outcome attainment and margin instead.

Tier 3 either gets pushed to the output menu at a clean price, or it gets fired. Margin discipline is the whole point of doing this.
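
The phase-one audit and tiering can be sketched as below. Two assumptions are made loudly: tiers are assigned by ranking accounts on revenue and cutting at 20% and 70% of the account count (the text does not say whether the percentages refer to accounts or to revenue), and 2,080 available hours per year is a placeholder. The qualitative criteria (strategic, willing to renegotiate, contract-bound) still need a human pass.

```python
def billable_utilization(billable_hours: float, available_hours: float) -> float:
    """Share of available hours that were actually billed."""
    if available_hours <= 0:
        raise ValueError("available_hours must be positive")
    return billable_hours / available_hours


def tier_accounts(revenue_by_account: dict[str, float]) -> dict[str, int]:
    """Rank accounts by revenue; top 20% -> Tier 1, next 50% -> Tier 2, rest -> Tier 3."""
    ranked = sorted(revenue_by_account, key=revenue_by_account.get, reverse=True)
    n = len(ranked)
    tier1_end = max(1, round(n * 0.20)) if n else 0
    tier2_end = tier1_end + round(n * 0.50)
    tiers = {}
    for position, name in enumerate(ranked):
        if position < tier1_end:
            tiers[name] = 1
        elif position < tier2_end:
            tiers[name] = 2
        else:
            tiers[name] = 3
    return tiers


# Hypothetical portfolio of ten accounts, revenue in dollars.
portfolio = {name: rev for name, rev in zip("ABCDEFGHIJ", range(1_000_000, 0, -100_000))}
print(tier_accounts(portfolio))
print(f"{billable_utilization(1400, 2080):.1%}")
```

In practice the inputs would come from the timesheet export, with unbillable cleanup time counted against the available hours.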

Contract scaffolds you can paste

These are three short clause shapes, not legal advice. Adapt with counsel.

Retainer of outcomes. “Client engages Agency for a monthly Service Fee of $X covering [scope]. In addition, Agency will earn a Performance Fee calculated as Y% of [KPI uplift / qualified pipeline / revenue attributable per agreed model] per quarter, capped at $Z. KPI measurement methodology is defined in Schedule A and reviewed jointly each quarter. Either party may propose a methodology revision at renewal.”

Output menu. “Agency offers the following Productized Deliverables at the fixed prices listed in Schedule A. Each deliverable includes the scope, acceptance criteria, and revision rounds defined therein. Out-of-scope work is quoted separately at Agency’s then-current rate card. Agency reserves the right to use AI-assisted production in delivery; quality and accountability remain Agency’s.”

AI subscription. “Client subscribes to Agency’s [Tier] AI-Native Service for an annual fee of $X, billed [monthly / quarterly]. The Subscription includes [defined scope of access, capability, and output]. Inference and third-party model costs incurred above an annual allowance of $Y are passed through at cost without markup, billed monthly with usage detail. Output capacity is expected to grow over the contract term as model capability improves at no additional fee to Client.”

The last sentence of the subscription clause is what makes the Monks model durable. The client gets the upside of better models; the agency keeps the cost-curve dividend.
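
The allowance-plus-pass-through mechanic in the subscription clause can be sketched as a running balance. This assumes the allowance is drawn down month by month and only the excess is billed, at cost; the monthly figures are hypothetical.

```python
def monthly_passthrough(monthly_costs: list[float], annual_allowance: float) -> list[float]:
    """Return the amount passed through to the client each month.

    Each month's inference cost first consumes what is left of the annual
    allowance; only the remainder is billed, at cost and without markup.
    """
    billed = []
    remaining = annual_allowance
    for cost in monthly_costs:
        covered = min(cost, remaining)
        remaining -= covered
        billed.append(cost - covered)
    return billed


# Hypothetical: $10k annual allowance, rising monthly inference costs.
print(monthly_passthrough([4_000, 5_000, 6_000], annual_allowance=10_000))
```

The same running balance doubles as the usage detail the clause promises on each monthly invoice.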

What to watch for

Three failure modes appear repeatedly when agencies migrate.

Failure mode 1: shadow billing. The sales team quotes a retainer of outcomes but the delivery team manages the account as if it were still hourly. Six months in, the agency is over-servicing without the price to cover it. Fix: kill internal hour quotas as a performance metric. Replace with margin per account and KPI attainment.

Failure mode 2: token surprise. A client doubles their content cadence in Q3, inference bills triple, and the pass-through clause was written too soft to enforce. Fix: monthly inference reporting baked into the contract, with a renegotiation trigger at a defined threshold.

Failure mode 3: judgment dilution. Cobbold’s line about human judgement becoming the scarcest asset cuts both ways [source: https://creative.salon/articles/features/ipa-jason-cobbold-pricing-playbook]. If senior staff spend their time supervising AI output instead of selling strategy, the agency erodes the only premium it has left to price. Protect senior time aggressively. Move execution onto AI plus mid-level operators. Sell the strategic layer at a price that reflects what it actually costs to produce.

Glossary

Billable hour: the legacy unit by which agencies sold execution time at a fixed hourly rate.
Retainer of outcomes: a monthly fee with a floor (team and tooling) plus a performance accelerator tied to a measurable KPI.
Output menu: a published catalog of fixed-price productized deliverables (landing page, sprint, brand package).
AI subscription: an annual bundle covering access, capability, and output, with explicit handling of token costs.
Token pass-through: a contractual mechanism where inference costs are itemized on the client invoice without markup.
IPA five-axis test: the IPA Pricing Playbook’s framework for selecting a pricing model based on risk appetite, flexibility, scope stability, commercial fluency, and outcome measurability [source: https://ipa.co.uk/news/pricing-playbook].

Where to start this week

If you read this far, do one thing before Friday: pull the last six months of timesheets and calculate billable utilization per FTE and per account. Not what the Harvest or Toggl dashboard says. The real number, including unbillable cleanup time. That single audit typically surfaces the two or three Tier 3 accounts quietly burning margin and the one or two Tier 1 accounts ready for an outcome conversation today.

If the math points to the $4.4M cliff and you want a second set of eyes on the migration plan before writing the first new contract, book a Kreante AI-Native pricing audit and we will walk it with you.

And if you have already migrated even one account onto a non-hourly model, tell me what broke. The MediaPost commenters disputing the 30% margin figure are right about one thing: the agency P&L has been quietly contested for years. AI just made the argument unavoidable.
