Claude Opus 4.7 Released! The Arrival of Effort xhigh and a Glimpse of the Power of Mythos
Hello. On April 16, 2026, Anthropic publicly released its new Claude model, "Claude Opus 4.7".
It's only been a little over two months since Opus 4.6 appeared, and the next one is already out. Anthropic's speed of evolution continues to amaze me.
When I started a new session in Claude Code in my environment, the defaults had switched to Opus 4.7 and Effort xhigh. Note that the default model for Claude Code varies by plan. Opus 4.7 is the default for Max / Team Premium, Sonnet 4.6 for Pro / Team Standard / Enterprise and direct Anthropic API usage, and Sonnet 4.5 via Bedrock / Vertex / Foundry (see Claude Code Model configuration for details).
In this article, based on the official announcement, I will summarize the features of Opus 4.7 and the changes from Opus 4.6.
Major Evolution Points of Opus 4.7
1. Further Enhancement of Coding Capabilities
As Anthropic themselves acknowledge, Opus 4.7 has reached a level of reliability where it can be "trusted with the most difficult coding tasks without the close supervision required previously."
Looking at the specific benchmark results shows this evolution.
- CursorBench - 70% for Opus 4.7 compared to 58% for Opus 4.6
- Rakuten-SWE-Bench - Solved more than 3 times as many production tasks compared to Opus 4.6
- CodeRabbit - Recall (reduction in missed detections) improved by over 10%
- Databricks' OfficeQA Pro (document reasoning and source information tasks) - Errors reduced by 21%
- Terminal-Bench 2.0 - Cleared 3 TBench tasks that previous Claude models could not solve
The "3 times" on Rakuten-SWE-Bench is particularly impactful. A difference of this magnitude in a benchmark close to actual operations means that the daily development experience should change accordingly.
2. Significant Improvement in Vision Performance
The evolution in image processing cannot be overlooked either.
Opus 4.7 can now process images up to 2,576 pixels (approx. 3.7 megapixels) on the longest edge. This is more than three times the resolution of previous models. In use cases like passing screenshots or diagrams for it to read, this difference in resolution is likely to be quite effective.
And the results of the XBOW visual accuracy benchmark are amazing.
- Opus 4.6 - 54.5%
- Opus 4.7 - 98.5%
The accuracy has nearly doubled. It is safe to say that it has reached a practical level for tasks such as extracting elements from UI screenshots or reading information from hand-drawn diagrams.
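As a concrete illustration, here is a minimal sketch of passing a UI screenshot to the Messages API with the `anthropic` Python SDK. The image content-block format is the SDK's standard one; the model ID `claude-opus-4-7` follows the naming in this announcement, and the file name is just a placeholder.

```python
import base64

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Encode a local screenshot as base64 (up to 2,576px on the longest edge,
# per the announcement).
with open("screenshot.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7",  # model ID as given in the announcement
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {"type": "text", "text": "List the UI elements in this screenshot."},
            ],
        }
    ],
)
print(message.content[0].text)
```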
3. Introduction of the New Effort Level "xhigh"
A new level has been added to the effort controls introduced in Opus 4.6.
- low
- medium
- high
- xhigh (newly added)
- max
xhigh (extra high) is positioned between high and max. It allows for subtle adjustments like, "I don't want the latency of max, but I want it to think deeper than high."
In Claude Code, the default effort when Opus 4.7 is selected is xhigh. When I opened the /model command on my end, it was already set that way.
If you want to change it manually, you can adjust the effort item with the left and right keys while Opus is selected via /model, just like before. It naturally allows you to lower it to medium or high when the task is simple and speed is a priority, or raise it to max when you want it to think carefully.
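The announcement does not spell out how effort is exposed on the API side, so the following is only a sketch: the `effort` field name and its accepted values are my assumption, which is why it is passed via `extra_body` rather than as a typed SDK parameter. Check the official API reference before relying on this.

```python
import anthropic

client = anthropic.Anthropic()

# Assumption: the API accepts an "effort" field whose values mirror the levels
# shown in /model (low / medium / high / xhigh / max). Passed via extra_body
# because it is not a typed parameter in the SDK as far as I know.
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{"role": "user", "content": "Plan a refactor of our auth module."}],
    extra_body={"effort": "xhigh"},  # hypothetical field; try "medium" for quick tasks
)
print(message.content[0].text)
```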
4. Enhanced Instruction Following
What was striking in the official announcement was this passage:
> Opus 4.7 is substantially better at following instructions. Interestingly, this means that prompts written for earlier models can sometimes now produce unexpected results: where previous models interpreted instructions loosely or skipped parts entirely, Opus 4.7 takes the instructions literally.
This is both good news and a point of caution. If you reuse existing prompts as they are, you may now get "exactly what was written" instead of the old "read between the lines" behavior, so some prompt readjustment may be necessary.
For those who use Claude Code regularly, I recommend checking once to see if your prompts and custom commands work as intended.
By the way, Claude Code has an official skill called `claude-api`. It is aimed at those developing and operating their own apps with `anthropic` or `@anthropic-ai/sdk`, and it compiles knowledge such as how to use the Claude API and how to migrate between models. This skill has been updated to support Opus 4.7, and simply telling Claude Code to "migrate to Opus 4.7" will reportedly rewrite the model name, prompts, and effort settings in one pass (@ClaudeDevs announcement). If you are integrating the Claude API into your app, this should make migration easy.
The Existence of Mythos Preview
Another interesting point in this announcement was the mention of Claude Mythos Preview.
In the official announcement, Opus 4.7 is introduced to the effect that "while it does not have the broad capabilities of the most powerful model, Claude Mythos Preview, it shows results that surpass Opus 4.6."
In other words, Anthropic has an even more powerful model called Mythos Preview than Opus 4.7, and Opus 4.7 is positioned in comparison to it.
Looking at the published benchmark charts, Mythos, positioned on the far right, scores in a class of its own; the surprising takeaway is that there is still something above Opus 4.7. Even though Opus 4.7 already has more than enough performance, the fact that an even stronger model is waiting in the wings is an exciting prospect for Claude's future evolution.
Let's also clarify its positioning regarding safety. Before that, I will add a little explanation about the word "alignment," which appears frequently here.
Alignment in AI is a concept that refers to bringing a model's behavior into line with human intentions, values, and goals. It is easiest to understand as an evaluation of "whether that capability can be used safely and honestly," on a different axis from capability itself. Anthropic places importance on this perspective, and concretely describes a model as "well-aligned" when it reliably exhibits the following behaviors:
- Not deceiving or tricking the user
- Not bending facts to prioritize answers that the user might like
- Not easily complying with requests that lead to malicious use
Looking at the official evaluation with that premise, Anthropic positions Opus 4.7 as "largely well-aligned and trustworthy, though not fully ideal," describing it as a model that maintains the same level of safety as Opus 4.6 while improving in some evaluations. On the other hand, it is worth noting that the model considered best-aligned in the company's evaluation is Mythos Preview.
Pricing
Pricing remains unchanged from Opus 4.6.
- Input tokens - $5 per 1M tokens
- Output tokens - $25 per 1M tokens
As a user, I am genuinely grateful that the price remains the same despite such a performance improvement.
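To put the rates in perspective, here is a quick back-of-the-envelope calculation; the token counts are made-up figures purely for illustration.

```python
# Opus 4.7 pricing from the announcement: $5 / 1M input, $25 / 1M output tokens.
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00

# Hypothetical session: 200K input tokens, 50K output tokens.
input_tokens = 200_000
output_tokens = 50_000

cost = (
    input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    + output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
)
print(f"${cost:.2f}")  # -> $2.25 ($1.00 for input + $1.25 for output)
```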
Furthermore, since Opus 4.7 uses more thinking tokens than before, the rate limits for all subscribers have reportedly been raised to compensate for this (@bcherny's tweet). It is a nice point that an environment is in place where you can use it extensively without worrying about execution frequency.
In the API, it can be used with the model ID `claude-opus-4-7`.
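For reference, a minimal text-only call with the `anthropic` Python SDK looks like this; only the model ID comes from the announcement, the rest is the standard Messages API shape.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-opus-4-7",  # model ID from the announcement
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize what changed in Opus 4.7."}],
)
print(message.content[0].text)
```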
Opus 4.6 vs Opus 4.7 Comparison Table
I have summarized the major changes in a table.
| Item | Opus 4.6 | Opus 4.7 |
|---|---|---|
| Effort Level | low / medium / high / max (4 levels) | low / medium / high / xhigh / max (5 levels) |
| Claude Code Default Effort | high (medium for Pro / Max) | xhigh |
| Max Image Resolution | - | Longest edge 2,576px (approx. 3.7 megapixels) |
| XBOW (Visual Accuracy) | 54.5% | 98.5% |
| CursorBench | 58% | 70% |
| Rakuten-SWE-Bench | - | More than 3x compared to Opus 4.6 |
| Pricing (Input/Output) | $5/$25 per 1M tokens | $5/$25 per 1M tokens (unchanged) |
| Instruction Interpretation | Interpreted somewhat loosely | Executed more literally |
Trying it out in Claude Code
In environments where Opus 4.7 is available, you can check it from the /model command in the latest version of Claude Code (v2.1.111 or later according to the official Docs). The default effort in Claude Code when Opus 4.7 is selected is xhigh. Note that visibility and default behavior also depend on the plan, provider, and administrator settings.
If you cannot see it on your end yet, I recommend updating Claude Code and then restarting it.
```
claude update
```
After updating, if you open a new session, you will be able to select Opus 4.7 in available environments.
By the way, right after the release, there were reports of cases where Opus 4.7 would reject normal code editing with a warning saying "this might be malware," but this was not because the model was too cautious, but due to an old safety prompt left in an older build (@ClaudeDevs announcement). Running claude update and restarting the app will resolve this, so if you encounter the same symptom, please try updating.
If you are still using Claude Code via Homebrew or npm, this is a good opportunity to switch to the native installation. Automatic updates will work, update behavior becomes more stable, and keeping up with new features is smoother. I have summarized the detailed steps and the benefits of switching in How Switching Claude Code from Homebrew to Native Installation Made It More Comfortable, so please refer to it.
My Impressions After Using It
I have only just started using it, but my first impression is that making xhigh the default strikes an excellent balance.
xhigh feels like it thinks things through to a necessary and sufficient depth, and it fits naturally into daily work. Anthropic's decision to set the default here strikes me as tuning that prioritizes practicality. To be honest, I have not used max yet, so I plan to explore where that level is actually needed as I try it out.
Above all, I am personally glad that it executes instructions more literally. Until now, loosely written instructions were sometimes interpreted arbitrarily broadly, so the fact that the more precisely you write a prompt, the more faithfully it is reflected in the results is, for engineers, actually an easier characteristic to work with.
Conclusion
To summarize the evolution points of Claude Opus 4.7, it looks like this:
- Further enhancement of coding capabilities (CursorBench 70%, Rakuten-SWE-Bench 3x)
- Significant improvement in vision performance (XBOW 98.5%, max 2,576px)
- Addition of the new Effort level xhigh
- Executes instructions more literally
- Mention of the existence of a higher-tier model called Mythos Preview (a research preview currently in limited release)
- Pricing remains unchanged
It is surprising that so many improvements have accumulated when only two months have passed since Opus 4.6. Mythos Preview is currently positioned as a research preview in limited release, and Anthropic explains that they aim to make Mythos-class models widely available in the future. I am looking forward to future developments.
If you haven't touched it yet, you can try it from Claude Code, claude.ai, or the API. In Claude Code, if you are on the Max / Team Premium plan, Opus 4.7 and xhigh should be selected by default after updating. Even on other plans, you can select Opus 4.7 from the /model command if it is an available environment, so please experience the difference for yourself.