Happy Friday, wizards.

I’ve added data on buy vs build to Context Windows. You can now see in which cases AI implementations are bought or configured by a vendor, and in which cases they’re a custom-built solution. (Shoutout to my friend David for the tip)

Here are the essential brewings in AI.

Build vs Buy

I’ve looked at 2,944 case studies from January 1, 2024 through today, and the trend is consistent:

Year | Build | Buy
2024 | 74.2% | 25.8%
2025 | 62.8% | 37.2%
2026 | 59.0% | 41.0%

Companies are buying more of their GenAI implementations and building less.

My hunch is two things: many have discovered that vibe-coded solutions are hard to run in production, and more AI startups for specific tasks and use cases are eventually finding product-market fit.

The data reveals more interesting things. GenAI implementations that are bought generally have more documented impact.

Source: Context Windows

However, the impact of build vs buy varies a lot by use case. As you can see in the case of Data Extraction (getting structured data from invoices, contracts, claims, forms, etc.), built solutions can have the higher impact too.

The percentage of use cases being built vs bought, and their impact, varies a lot by which industry and business function it’s being applied to. Context Windows Pro has filters for all of this and more, to make it easier to decide which route to go.

❦

Have you noticed? The people coming to your site from AI traffic are warm leads. If you don’t believe me, check your Analytics right now and see the average time spent by people coming from ChatGPT vs Google Search.
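If you want to do that comparison yourself, here’s a minimal sketch. It assumes you have a session export as (referrer hostname, seconds on site) pairs; the referrer hostnames and the toy data are my own assumptions, so adjust them to whatever your analytics tool actually reports.

```python
from statistics import mean

# Assumed referrer hostnames; check what your analytics actually records.
AI_REFERRERS = {"chat.openai.com", "chatgpt.com", "perplexity.ai", "gemini.google.com"}
SEARCH_REFERRERS = {"www.google.com", "www.bing.com", "duckduckgo.com"}

def classify(referrer: str) -> str:
    """Bucket a session's referrer hostname into ai / search / other."""
    if referrer in AI_REFERRERS:
        return "ai"
    if referrer in SEARCH_REFERRERS:
        return "search"
    return "other"

def avg_time_on_site(sessions: list[tuple[str, float]]) -> dict[str, float]:
    """sessions: (referrer_hostname, seconds_on_site) pairs from your export."""
    buckets: dict[str, list[float]] = {"ai": [], "search": [], "other": []}
    for referrer, seconds in sessions:
        buckets[classify(referrer)].append(seconds)
    # Average per bucket, skipping buckets with no sessions.
    return {k: round(mean(v), 1) for k, v in buckets.items() if v}

# Toy data for illustration only; replace with a real analytics export.
sessions = [
    ("chatgpt.com", 310.0),
    ("perplexity.ai", 260.0),
    ("www.google.com", 75.0),
    ("www.bing.com", 40.0),
]
print(avg_time_on_site(sessions))  # e.g. {'ai': 285.0, 'search': 57.5}
```

If AI-referred sessions consistently show a higher average, that’s your warm-lead signal.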

So, before you head off into the weekend, I recommend setting up tracking for your brand so that you know what people are asking AI about your business. Trendos is this week’s sponsor and lets you track 100 prompts free of charge:

IN PARTNERSHIP WITH TRENDOS

Trendos lets you track brand mentions across ChatGPT, Gemini, Perplexity, and more.

If you’re paying $100+ just to see how LLMs mention your brand, you’re overpaying. Trendos gives you 100 custom prompts on their free plan.

❦

Opus 4.7 and the problem with its Adaptive Thinking

This morning, as I was using Claude Code, I experienced the worst performance I’ve seen in a long time. It felt as if it had regressed a full year or more in capability.

Among other things, while I was making some changes to my website, it spent 30 minutes on a Z-index issue (IYKYK) that should’ve been straightforward, and it ended up not fixing the problem because it was working on the wrong page.

I was really scratching my head about this until I noticed Opus 4.7 launched yesterday. It has a new β€œadaptive” reasoning which, as far as I understand, adjusts how hard it reasons depending on the task.

The solution to fix this in Claude Code is simple, but hidden. Just type /effort and set the reasoning level to max.

Claude Code lets you set the reasoning level, but defaults to high, which is pretty useless compared to max

Apparently it’s not just affecting coding; it also hits any other tasks you use Claude and Claude Cowork for. And unfortunately, in Claude and Claude Cowork there’s no way to set the reasoning level, so you’re basically playing roulette on which reasoning level (and result) you’re going to get.

Claude, I’m feeling lucky! Let’s see what reasoning level I will get this time.

Why is Anthropic doing it this way?

Anthropic won’t admit it (they say it’s what users want), but they’ve historically had an annoying habit of creating what I’d classify as a semi-dark UX pattern: switching the defaults on which model and reasoning level is being used.

And of course, it’s always a combination that’s cheaper to run than their best configuration. I believe it’s entirely a cost calculation for them; many people won’t notice and won’t care what reasoning level they get.

Annoying? Yes. Understandable? Also yes: most people are running Claude at a fixed monthly subscription price that wasn’t designed for the token-guzzling long-haul tasks Claude is doing now, and Anthropic is selling those tokens at a loss.

After users began complaining, Anthropic is already trying to remedy the situation by defaulting to higher reasoning levels more often.

So is Opus 4.7 any good?

When you switch to max, yes. I haven’t had enough time with it yet to say whether it’s better than Opus 4.6, but judging from the benchmarks, I’m pretty sure it is.

❦

💭 Did you enjoy this read? Something you want more or less of? There’s a poll at the bottom, and it needs your vote and a comment.

You are a delight.

Dario

Disclosure: To cover the cost of my email software and the time I spend writing this newsletter, I sometimes work with sponsors and may earn a commission if you buy something through a link in here. If you choose to click, subscribe, or buy through any of them, THANK YOU – it will make it possible for me to continue to do this.

Keep Reading