🧙🏼 More autonomy = more ROI?

Howdy wizards,

Here’s what’s brewing in AI.

I did some analysis and this is what I found:

❝

AI implementations with higher average levels of autonomy have stronger reported business outcomes

I’ll show you how I arrived at that conclusion, and what it means for businesses.

Do AI implementations with higher levels of autonomy mean higher ROI for businesses?

This question popped into my head the other day. So I decided to create the data to answer it for Context Windows.

(In case you’re new here: Context Windows is a database I’ve made of thousands of public AI implementation case studies.)

Looking through stories of companies implementing AI, I noticed how much they differed in the amount of human involvement they needed.

You have use cases where companies give AI their business context so their team or customers can chat with it. That treats AI like a convenient tool. Helpful, but not exactly the job-displacing revolution you hear VCs shouting about from the mountaintops.

Then there are implementations that orchestrate multiple workflows, with AI actively making decisions and using tools, in a loop, and without much human intervention. That’s more like an agent, the AI poster-child we’re all so afraid of.

I sat down with Claude and created a framework to classify the Agentic Level of an AI implementation.

The agentic level scale I created. It’s now a filter in Context Windows! The bars in the background give you a feel for where most AI implementations are.

Then I analysed all the case studies through this new lens.

I made sure to base the judgement on the actual mechanisms reported in the case studies, rather than whatever fancy terminology the original publisher used to describe it.

Most companies have moved beyond the level of L1 Tool: one-shot AI prompts that give you a draft, then humans handle the rest.

The majority land on L2 Consultant: AI as a copilot with read-access to your context.

L3 and L4 have an almost equal number of case studies. L3 covers multi-step workflows where AI can take actions but needs human approval. L4 covers autonomous execution on clearly scoped tasks, where humans are looped in mostly on exceptions and edge cases.

L5 represents the highest level. Fully autonomous agents coordinating across domains. None of the case studies I’ve seen so far qualify.

With that data in hand, I can now tell you which of these levels of autonomy deliver the strongest results for businesses.

But first, I want to make sure you're not sleeping on one of the best AI use cases for your workday.

I'm talking about meeting notes. Not just because they save you time, but because when done right, they help you remember and act on what you decided during the meeting.

Download Granola for yourself and see what I’m talking about:

IN PARTNERSHIP WITH GRANOLA

“Wait, what did we decide?”

You know that feeling when you leave a meeting and immediately forget half of what you agreed to?

That's not a memory problem. When you're back-to-back all day, there's simply no time to process. No time to write the follow-up.

Granola helps you become the person who actually does what they said they'd do.

You take notes during the meeting. Just quick bullets, nothing formal. Granola transcribes in the background and turns those notes into clear summaries with actual next steps.

No more "wait, what did we decide?" moments. Just clarity. And follow-through.

Download Granola and try it on your next meeting →

Free month with the code WHATPLUGIN

❦

Back to the juicy topic at hand: level of autonomy and business outcomes.

This chart tells the story better than I do:

There you have it.

AI implementations with higher average levels of autonomy have stronger reported business outcomes.

Note: To understand this chart, you need to understand the Proven Impact score (the y-axis). It's a composite score that weighs how diverse and how concrete the reported outcomes in a case study are. ROI, revenue impact and cost savings weigh more. Operational and time saving results weigh less.
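To make the idea of a composite score more tangible, here is a minimal Python sketch. It is not the actual Proven Impact formula. The category names, the weights and the `CONCRETE_BONUS` multiplier are all invented for illustration. The only things taken from the note above are that financial outcomes weigh more than operational and time-saving ones, and that diversity and concreteness both count.

```python
# Hypothetical weights. The note only says that ROI, revenue impact and cost
# savings weigh more than operational and time-saving results; the numbers
# below are invented for illustration.
CATEGORY_WEIGHTS = {
    "roi": 3.0,
    "revenue": 3.0,
    "cost_savings": 3.0,
    "operational": 1.0,
    "time_savings": 1.0,
}

# Hypothetical multiplier for an outcome that is backed by a concrete number.
CONCRETE_BONUS = 2.0


def proven_impact_sketch(outcomes):
    """Score one case study.

    outcomes: list of (category, is_concrete) tuples.
    Each category counts once, so reporting several different kinds of
    outcome (diversity) raises the score, while repeating the same kind
    does not.
    """
    best_per_category = {}
    for category, is_concrete in outcomes:
        score = CATEGORY_WEIGHTS[category] * (CONCRETE_BONUS if is_concrete else 1.0)
        best_per_category[category] = max(best_per_category.get(category, 0.0), score)
    return sum(best_per_category.values())


# A vague time-saving claim versus a concrete ROI figure plus concrete time savings
print(proven_impact_sketch([("time_savings", False)]))
print(proven_impact_sketch([("roi", True), ("time_savings", True)]))
```

Under these made-up weights, a case study with a concrete ROI figure scores well above one that only mentions time savings, which is the behaviour the note describes.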

For the sake of humanity I would have liked to say that the implementations that heavily depend on humans yield just as much financial gain for a business.

However, the trend in the data is hard to argue with. The share rises by 5 to 6 percentage points with each level:

Agentic level	Concrete financial results*
L1 Tool (n=207)	6%
L2 Consultant (n=1,394)	11%
L3 Collaborator (n=353)	17%
L4 Expert (n=397)	22%

*share of implementations that report concrete cost savings, revenue gains or ROI
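If you want to sanity-check the trend yourself, here is a small Python sketch that uses only the numbers in the table. Two assumptions to flag: it treats the four levels as evenly spaced (1 to 4), and the case-study counts it derives are estimates, because the published shares are rounded.

```python
# (n, share) per level, copied from the table above
levels = {
    "L1 Tool": (207, 0.06),
    "L2 Consultant": (1394, 0.11),
    "L3 Collaborator": (353, 0.17),
    "L4 Expert": (397, 0.22),
}

shares = [share for _, share in levels.values()]

# Step between neighbouring levels, in percentage points
steps = [round((b - a) * 100) for a, b in zip(shares, shares[1:])]

# Estimated number of case studies reporting concrete financial results.
# Estimates only: the shares in the table are rounded.
approx_counts = {name: round(n * share) for name, (n, share) in levels.items()}

# Least-squares slope of share against level index.
# Assumption: levels are evenly spaced at 1, 2, 3, 4.
xs = [1, 2, 3, 4]
mean_x = sum(xs) / len(xs)
mean_y = sum(shares) / len(shares)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, shares)) / sum(
    (x - mean_x) ** 2 for x in xs
)

print(steps)                  # [5, 6, 5]
print(approx_counts)          # roughly 12, 153, 60 and 87 case studies
print(round(slope * 100, 1))  # 5.4 percentage points per level
```

The steps are nearly identical, which is what a straight line looks like. Keep in mind that L1 has far fewer case studies than L2, so its share is the least stable of the four.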

So what are these “L4 Expert” level case studies really about?

Let’s look at the two in the top right corner of the chart to better understand what autonomy looks like.

Lead qualification is Sales’ favourite use case. It lets businesses handle leads they would otherwise miss because they arrive outside business hours or their sales team has limited capacity. An agent picks up a variety of signals to analyse and routes the high-value leads directly to human reps. I've written more details on why this use case is golden here.
Agentic Customer Service is the annoying chatbot your useless telecom provider has on their website BUT with an important difference. It has API access to the business’s backend, a range of tools so it can actually help you with your problem, and the authority to make real decisions. Companies report resolution rates of 75%, on average. It's also quick and easy to implement; companies don’t build it themselves, they buy it from a SaaS vendor that built it and packaged it nicely (avg deployment time is less than 2 months).

If you’re deciding where to invest or figuring out what to push for internally in your company when it comes to AI, I'd look into these, regardless of your industry.

The autonomous end of the spectrum is where you’re likely to see the easiest, most measurable gains from AI. In the successful implementations, AI often directly executes something that already had a price tag.

Then there are the use cases that sit on the low to mid-autonomy side (left side of the chart).

This is where AI amplifies a lot of human work these days. AI’s contribution is often one step removed from the financial outcome, which is why these use cases often report less “hard” results.

Here are some low-autonomy implementations: connecting AI to your domain's knowledge so people can get answers more easily, building a workflow for content drafting so you can make content quicker, integrating AI with your data so people can self-serve more.

These automate part of what a human was doing before, which saves them time, and that’s great.

What’s less than great is that, instead of measuring the value added by what you do with that surplus time, most companies make “time saved” their main measure of success.

That thinking stops short of what moves the needle for the business.

An underrated challenge in AI is, ironically, figuring out the high-level, differentiated work humans can do with the time you save.

That's the real ROI.

PS I also think use cases with mere time savings as their performance indicators are increasingly going to face pressure in a world where token economics is a burning topic.

❦

Hope that was insightful.

If you enjoyed the analysis, Context Windows Pro gives you all the data behind it, and lets you see exactly which use cases are worth investing in for your specific industry.

You are a delight.

Dario

How was today's brew?

Leave me some feedback. Votes w/o context don't help me much. Thankee 🤝

See which AI use cases are paying off

Most companies pick AI use cases by brainstorming internally. 90% of those initiatives fail.

I’ve created Context Windows just so you can pick the winners.

🟦 Find high-performing use cases from 2,000+ companies at contextwindows.ai, or book a demo with me

Disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes work with sponsors and may earn a commission if you buy something through a link in here. If you choose to click, subscribe, or buy through any of them, THANK YOU – it will make it possible for me to continue to do this.
