Claude Opus 4.8 Review: A Slight Upgrade, and Why Cost Wins
Opus 4.8 is a real but small step up. The bigger story is cost control, and Microsoft just showed why with its Claude Code rollback.
On 28 May 2026, Anthropic released Claude Opus 4.8. I've been using it since launch, and my short verdict is this: it's a real upgrade, and a small one. The official announcement leads with stronger coding, faster and denser analysis, and better alignment. All plausible, some of it I can feel. But the part that actually matters to how I work didn't change with the model. It's cost. And while I was testing 4.8, Microsoft handed everyone a very expensive case study for why.
What 4.8 actually changes
Start with the concrete stuff. According to Anthropic's announcement, 4.8 builds on 4.6 and cleans up the things that annoyed people about 4.7.
It fixes 4.7's rough edges. The comment-verbosity problem, where the model padded code with explanations nobody asked for, is gone. So is the flaky tool-calling. If you skipped 4.7 because of those, that alone is a reason to move.
It catches its own mistakes. Anthropic says 4.8 is roughly four times less likely than its predecessor to leave a flaw in its own code without flagging it. In agentic work, where the model writes, runs, and checks its own output across many steps, that self-correction is worth more than a raw capability bump.
High effort is now the default. With 4.7 you had to reach for higher effort levels deliberately. 4.8 ships on high by default, which Anthropic frames as the best balance of quality and experience. On coding tasks, they say this default spends about the same tokens as 4.7's default, just with better results.
Dynamic Workflows arrived in research preview. This one is interesting: Claude can plan a large task and then spin up hundreds of parallel subagents in a single Claude Code session. I haven't pushed it hard yet, but the direction is clear, more autonomy over bigger chunks of work.
And the number that didn't move: pricing. Still $5 per million input tokens and $25 per million output, $10 and $50 for fast mode. Same as 4.7.
My take: better, but not a leap
I run Opus 4.8 on high effort, always. For my work anything below high is basically useless, so the new default suits me fine.
What I notice over 4.7 is subtle. It asks better questions back. When a task is underspecified, 4.8 is a bit more likely to stop and check than to guess and run. And it holds complex, multi-part tasks together slightly better, the kind where context drifts and earlier models start losing the thread. These are small wins, but they're the right small wins.
Writing with it is better too, with one big caveat: only if you manage context well and give it enough reference to you, your tone of voice, and your brand. The model doesn't know how you sound. Feed it a thin prompt and you get competent, generic text. Feed it your voice guide and real examples and 4.8 stays on voice longer than 4.7 did. The model improved; the homework didn't go away. If you've never set that up, my post on developing a brand voice is where I'd start.
So: a slight upgrade. Nothing groundbreaking. For my use case that's an honest summary, and given there's no price increase, an easy recommendation.
The token question, again
Here's a small tension I can't fully resolve. Anthropic says 4.8's default spends roughly the same tokens as 4.7. In my own use it feels like it eats more.
I think the explanation is mundane. High effort is the default now, outputs are denser, and I'm comparing my everyday usage rather than running a controlled test. Whether the per-task number went up or just my perception did, the practical takeaway is the same as it was for 4.7: token cost is the thing to watch, not the price per token.
I keep Opus affordable on the 22-euro Pro plan with two habits. First, I don't use Opus for everything. My favourite skill, Superpowers, routes simple checking and verification steps to Sonnet and saves Opus for the complex planning and reasoning. Right model for the right step, automatically. Second, I compress. With the Caveman skill, which strips conversational padding from how the model talks back, I can run Opus for a surprisingly long stretch before hitting limits, even on Pro.
None of this is new advice. I made the same argument for Opus 4.7 and the broader case in my post on context engineering. 4.8 doesn't change the logic. It reinforces it.
The cost reckoning is here
In the 4.7 piece I called it the token trap: you build workflows on a stronger, hungrier model, get used to the quality, and slowly lock yourself into a dependency that's painful to exit. That was a prediction. It's now happening, and not to hobbyists.
Microsoft's Experiences & Devices division is stopping internal use of Claude Code by 30 June 2026 and moving developers to GitHub Copilot CLI. The reason was token cost: the team blew through its annual AI budget months early. By several reports the developers liked Claude Code. The bill is what killed it. Uber tells a similar story, having burned through its planned 2026 AI coding budget in about four months.
One thing to be precise about, because the headlines blurred it: nobody "cancelled Claude." Microsoft simultaneously joined a $5 billion investment in Anthropic and offers Claude across Azure. What got pulled was the Claude Code product, internally, because token-based billing at the scale of thousands of developers got too expensive. The distinction matters. It wasn't a verdict on the model. It was a verdict on the cost structure.
That's the whole point. When even the engineers who love the tool can't keep it because of token spend, the model number stops being the interesting variable. Cost control becomes the skill.
What this means in practice
A few clear takeaways from using 4.8 and watching the cost story unfold:
- Upgrade to 4.8. There's no price increase and the 4.7 annoyances are fixed. Low-risk, mildly positive.
- Don't expect a leap. If 4.7 worked for you, 4.8 is a quiet quality bump, not a reason to rebuild workflows.
- Route by difficulty. Not every step needs Opus. Send simple checks to Sonnet and reserve Opus for planning and hard reasoning. For where that line sits, see my Sonnet vs Opus comparison.
- Treat cost as a first-class design constraint, not an afterthought. Microsoft and Uber didn't fail to notice the bill. They noticed too late to redesign around it.
Anthropic shipped a better model in Opus 4.8. That's genuinely good, and at the same price it's a clean win. But the lesson of this release cycle isn't about the model at all. It's that context management and cost control are now the part that decides whether any of this is sustainable. The model keeps getting better. Make sure you can still afford it.
FAQ
- What's new in Claude Opus 4.8?
- It fixes the comment-verbosity and tool-calling problems from Opus 4.7, is about four times less likely to leave flaws in its own code unremarked, and now defaults to high effort. It also adds Dynamic Workflows in research preview, which lets Claude Code plan a task and run hundreds of parallel subagents in one session. Pricing is unchanged: $5 per million input tokens, $25 per million output.
- Is Opus 4.8 worth upgrading to?
- For most people, yes, because there's no price increase and the 4.7 annoyances are gone. But don't expect a leap. In my daily use it's a slight upgrade: marginally better at asking clarifying questions and at holding long, complex tasks together. If 4.7 already worked for you, 4.8 is a quiet quality bump, not a reason to rebuild anything.
- Does Opus 4.8 use more tokens than 4.7?
- Anthropic says the new default high-effort level spends roughly the same number of tokens as 4.7's default on coding tasks, with better results. In my own use it feels like it consumes more, probably because high effort is now the default and outputs are denser. Either way, token cost stays the thing to watch. The way to keep it down is routing simple steps to a cheaper model like Sonnet and managing context tightly.
- Why is Microsoft dropping Claude Code?
- Microsoft's Experiences & Devices division is stopping internal use of Claude Code by 30 June 2026 after token spend blew through the team's annual AI budget, moving developers to GitHub Copilot CLI. It's not a rejection of Claude itself: Microsoft joined a $5 billion investment in Anthropic and offers Claude on Azure. The problem was specifically the token cost of Claude Code at scale.
