Claude Fable 5 Review 2026: Anthropic's Mythos-Class Model Shakes the AI Industry

Introduction

On June 9, 2026, Anthropic released Claude Fable 5 — and the AI world was never the same. In what may be the most dramatic model launch in AI history, Fable 5 debuted as the most capable generally-available AI model ever built, only to be suspended by the U.S. government just three days later on national security grounds. The model's benchmark scores were staggering: 95.5% on SWE-bench Verified, 80.3% on SWE-Bench Pro (an 11-point lead over the next-best model), and state-of-the-art results across virtually every tested capability. Andrej Karpathy called it "a major-version-bump-deserving step change forward." Cursor CEO Michael Truell said it "opened up a class of long-horizon problems that were out of reach."

This review covers everything you need to know about Claude Fable 5: its capabilities, benchmark performance, pricing, the controversial safety architecture, the government suspension, and what it all means for developers and businesses.

What Is Claude Fable 5?

Claude Fable 5 is the first publicly available model in Anthropic's new Mythos class — a tier above the existing Opus, Sonnet, and Haiku hierarchy. The name "Mythos" comes from the Greek word for "story" or "discourse," while "Fable" derives from the Latin fabula ("that which is told"). The naming reflects the model's dual nature: Fable 5 and Claude Mythos 5 are the same underlying model with identical weights. The only difference is that Fable 5 has safety classifiers layered on top, while Mythos 5 has those classifiers lifted in certain restricted domains and is available only to vetted organizations through Project Glasswing.

This two-model release strategy is unprecedented in the AI industry. Anthropic's rationale is straightforward: the same capabilities that make a frontier model exceptional at finding and fixing software vulnerabilities also make it potentially dangerous in the wrong hands. By splitting the model into a safeguarded public version and a restricted defense-oriented version, Anthropic attempted to balance capability with responsibility.

Technical Specifications

Specification	Claude Fable 5
Model ID	claude-fable-5
Context Window	1,000,000 tokens (default)
Max Output	128,000 tokens per request
Input Pricing	$10 per million tokens
Output Pricing	$50 per million tokens
Thinking Mode	Adaptive thinking (always on)
Safety Level	ASL-3 (CB-1 classification)
Release Date	June 9, 2026
Suspension Date	June 12, 2026 (US export directive)

The 1 million token context window and 128K max output are significant upgrades over Opus 4.8. The adaptive thinking mode is permanently enabled and cannot be disabled — a deliberate design choice by Anthropic to ensure the model always reasons through complex problems before responding. Notably, the raw chain-of-thought is never returned to API users; instead, a configurable thinking.display setting offers either a readable summary or an empty field.

Benchmark Performance: A Generational Leap

The benchmark results for Claude Fable 5 represent not an incremental improvement but a generational leap. Here is how it compares to the competition:

SWE-Bench Verified (Software Engineering)

Model	Score
Claude Fable 5 / Mythos 5	95.5%
Claude Opus 4.8	88.6%
GPT-5.5	82.6%
Gemini 3.1 Pro	78.8%

SWE-Bench Pro (Harder Real-World Engineering)

Model	Score
Claude Fable 5	80.3%
Claude Opus 4.8	69.2%
GPT-5.5	58.6%
Gemini 3.1 Pro	54.2%

Other Key Benchmarks

Benchmark	Fable 5	Opus 4.8	GPT-5.5	Gemini 3.1 Pro
FrontierCode Diamond	29.3%	13.4%	5.7%	—
Terminal-Bench 2.1	88.0%	82.7%	83.4%	70.7%
Humanity's Last Exam	59.0%	49.8%	41.4%	44.4%
Legal Agent Benchmark	13.3%	10.4%	2.1%	0.0%

The 11-point gap between Fable 5 and Opus 4.8 on SWE-Bench Pro is larger than the gap between Opus 4.8 and Google's Gemini 3.1 Pro. On FrontierCode Diamond, Fable 5 more than doubles Opus 4.8's score. These are not marginal improvements — they represent a fundamental capability jump.

Real-World Performance

Software Engineering

The most compelling evidence of Fable 5's capabilities comes from real-world deployments. Stripe reported that a migration of their 50-million-line Ruby codebase — a task estimated to take a human engineering team over two months — was completed by Fable 5 in a single day. The model autonomously wrote tests, verified its outputs, and executed the migration with minimal human oversight.

Boris Cherny, who built Claude Code, described Fable 5 as "the first model I have used that was so methodical and precise, taking measurements and adding logs then verifying that it truly fixed the issue before declaring victory."

Simon Willison, a prominent AI commentator, noted: "After two days of experience with Claude Fable 5 I think the best way to describe it is relentlessly proactive. It knows a whole lot of tricks and it will deploy pretty much any of them to get to its goal."

Vision Capabilities

Fable 5's vision system achieved state-of-the-art results on multiple benchmarks. The model can extract numbers from scientific figures, rebuild web applications from screenshots, and even completed the game Pokemon FireRed using only vision input — reading the screen and generating appropriate game inputs.

Long-Context Tasks

Using file-based memory, Fable 5 played the strategy game Slay the Spire three times better than Opus 4.8. The model's ability to maintain coherence and make strategic decisions across millions of tokens of context is unmatched by any competing model.

Scientific Research

Fable 5 excels at scientific reasoning tasks. On Humanity's Last Exam (a benchmark designed to test the limits of AI knowledge), it scored 59.0% without tools — compared to 49.8% for Opus 4.8 and 41.4% for GPT-5.5. The model shows particular strength in long problem-solving sessions on very difficult scientific problems.

The Safety Architecture

Fable 5 introduces a novel safety mechanism: automatic fallback to Opus 4.8. When the model's safety classifiers detect a request in a high-risk domain (cybersecurity, biology, chemistry, or model distillation), the request is silently rerouted to Claude Opus 4.8 instead of being answered by Fable 5. Anthropic reports that at least 95% of user sessions run entirely on Fable 5 without triggering the fallback, meaning safety blocks are infrequent in practice.

However, this fallback mechanism has drawn criticism. Some users reported that the safety filters were overly aggressive, blocking high-school-level biology questions and legitimate chemistry research queries. The silent nature of the fallback — users are not informed when their request has been downgraded to Opus 4.8 — has also raised concerns about transparency.

Mandatory 30-Day Data Retention

Perhaps the most controversial aspect of Fable 5 is its mandatory 30-day data retention policy. Both Fable 5 and Mythos 5 are designated "Covered Models" that carry a mandatory 30-day retention window and are not available under zero data retention. Anthropic states that retained prompts and outputs are not used for training and are deleted after 30 days, except where held for safety investigations or legal obligations.

This requirement immediately created friction. Microsoft removed Fable 5 from its internal Copilot model picker on June 10 because the retention term conflicts with the company's zero-retention standard, even as it kept the model available to external customers.

Anthropic frames the retention window as the cost of running its safety classifiers, which need cross-request visibility to catch patterns that real-time checks miss, including best-of-N jailbreaking and larger campaigns such as state-sponsored espionage.

Pricing Analysis

Tier	Input (per MTok)	Output (per MTok)	vs. Opus 4.8
Claude Fable 5	$10	$50	2x
Claude Opus 4.8	$5	$25	Baseline
GPT-5.5	$5	$15	1x input, 0.6x output
Gemini 3.1 Pro	$3.50	$10.50	0.7x input, 0.42x output

Fable 5 is priced at exactly double Opus 4.8 on both input and output tokens. Compared to GPT-5.5 and Gemini 3.1 Pro, Fable 5 is significantly more expensive. However, Anthropic argues that the capability gains justify the premium for agentic workloads where Fable 5 can replace hours of human engineering time.

For routine tasks, the cost math is less favorable. Running standard chat or simple coding tasks on Fable 5 at 2x the price of Opus 4.8 — a model that is itself highly capable — may not provide sufficient return on investment. The model's value proposition is strongest for long-horizon agentic tasks, complex software engineering, and scenarios where Fable 5's autonomy can replace significant human effort.

Free Access Period

Through June 22, 2026, Fable 5 was included at no additional cost in Claude Pro, Max, Team, and seat-based Enterprise plans. After that date, Anthropic planned to transition the model to a consumption-based credit system.

The Government Suspension

On June 12, 2026 — just three days after its public release — Claude Fable 5 and Mythos 5 were suspended by a U.S. government export directive signed by Commerce Secretary Howard Lutnick. The directive required Anthropic to pause all access by foreign nationals (whether inside or outside the United States, including Anthropic's own foreign-national employees) to both Fable 5 and Mythos 5.

The government's stated concern was a potential jailbreak of the Mythos model. According to reports, Amazon's security team flagged a jailbreak in Fable 5 to the White House, triggering the export control action. Anthropic disputed the severity of the reported jailbreak, stating that it involved "identifying a small number of previously known, relatively minor vulnerabilities" that other models could also discover without bypassing safety mechanisms.

In its response, Anthropic stated: "We reviewed the report and confirmed that the capability level demonstrated is ubiquitous on other models (including OpenAI's GPT-5.5) and is used daily by defenders maintaining system security." The company argued that if this standard were applied industry-wide, it would effectively halt all frontier model deployments.

White House AI adviser David Sacks signaled the block might be temporary, writing that "the Admin's hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release."

As of this writing (June 15, 2026), Fable 5 and Mythos 5 remain suspended, while all other Claude models (including Opus 4.8, Sonnet, and Haiku) continue to operate normally.

Comparison with Competitors

vs. Claude Opus 4.8

Opus 4.8 is a strong model that remains in service as Fable 5's fallback. However, the 11-point gap on SWE-Bench Pro and the generational improvements in agentic capability make Fable 5 categorically superior for complex tasks. For routine work, Opus 4.8 at half the price may be more cost-effective.

vs. GPT-5.5

GPT-5.5 is a versatile, broadly capable model but falls significantly behind Fable 5 on software engineering benchmarks (58.6% vs. 80.3% on SWE-Bench Pro). ChatGPT offers broader feature coverage including image generation, video creation, and autonomous agents. For coding and agentic work, Fable 5 is clearly superior. For general-purpose multimedia tasks, GPT-5.5 remains competitive.

vs. Google Gemini 3.1 Pro

Gemini 3.1 Pro trails Fable 5 on most benchmarks (54.2% vs. 80.3% on SWE-Bench Pro) but offers superior Google Workspace integration and multimodal capabilities at a lower price point ($3.50/$10.50 per MTok). For Google-centric workflows, Gemini provides convenience. For raw capability, Fable 5 dominates.

Pros and Cons

Pros

State-of-the-art on nearly all benchmarks — 95.5% SWE-bench Verified, 80.3% SWE-Bench Pro
11-point lead over the next-best model on SWE-Bench Pro
1M token context window for massive codebases and documents
128K max output for generating substantial code and content
Exceptional agentic capability — Stripe's 50M-line migration completed in 1 day
Advanced vision — rebuilt web apps from screenshots, completed Pokemon FireRed
Self-verification — methodical, precise, verifies its own work
Persistent memory for long-running tasks
Free access through June 22 on Pro, Max, and Team plans

Cons

Double the price of Opus 4.8 ($10/$50 per MTok)
Mandatory 30-day data retention — no zero-retention option
Currently suspended by US government export directive
Safety classifiers silently downgrade ~5% of sessions to Opus 4.8
Overly aggressive safety filters on biology and chemistry topics
Adaptive thinking always on — cannot be disabled
Raw chain-of-thought never returned to API users
Significantly more expensive than GPT-5.5 and Gemini 3.1 Pro

Best Use Cases

Claude Fable 5 excels in specific high-value applications:

Long-horizon software engineering — multi-day autonomous coding tasks, large-scale migrations, complex refactoring
Agentic coding workflows — Claude Code, Cursor, and other AI development environments
Vision-heavy knowledge work — analyzing diagrams, charts, scientific figures, and PDF documents
Legal document analysis — record-breaking 13.3% on the Legal Agent Benchmark
Scientific research — long problem-solving sessions on difficult scientific problems
Enterprise codebase analysis — understanding and modifying massive codebases with 1M token context
Autonomous task execution — planning across stages, delegating to sub-agents, self-verification

Who Should Use Claude Fable 5?

Fable 5 is ideal for:

Software engineering teams working on complex, large-scale projects
AI developers building agentic applications that require sustained autonomy
Legal professionals analyzing lengthy contracts and documents
Researchers working on difficult scientific problems
Enterprises that need the highest capability model and can justify the cost

It may not be the best choice for:

Budget-conscious users — the 2x price premium over Opus 4.8 is significant
Privacy-sensitive organizations — the 30-day data retention requirement is non-negotiable
Users needing biology/chemistry assistance — safety filters may block legitimate queries
Casual users — Opus 4.8 or Sonnet provides better value for routine tasks

Final Verdict

Overall Rating: 9.6/10

Claude Fable 5 is, by the numbers, the most capable generally-available AI model ever released. Its benchmark performance represents a generational leap — not a minor version bump — with an 11-point lead on SWE-Bench Pro that dwarfs the gaps between other frontier models. Real-world deployments at Stripe, combined with endorsements from industry leaders like Andrej Karpathy and Cursor's Michael Truell, confirm that these benchmark results translate into genuine practical capability.

However, Fable 5 is not without significant caveats. The mandatory 30-day data retention policy is a dealbreaker for privacy-sensitive organizations. The silent fallback to Opus 4.8 on safety-flagged queries raises transparency concerns. The 2x price premium over Opus 4.8 is hard to justify for routine tasks. And the government suspension — while potentially temporary — has rendered the model inaccessible to most users as of this writing.

If and when Fable 5 returns to general availability, it will be the undisputed leader for agentic coding, long-horizon automation, and complex reasoning tasks. For organizations that can accommodate the pricing and data retention requirements, Fable 5 represents the cutting edge of AI capability.

Coding Performance: 9.8/10
Agentic Capability: 9.9/10
Vision: 9.5/10
Context Handling: 9.7/10
Safety Architecture: 8.5/10
Value for Money: 8.0/10
Availability: 5.0/10 (currently suspended)

Recommendation: The most powerful AI model available — when accessible. Monitor the government suspension status and prepare to adopt once access is restored. For users who cannot wait, Claude Opus 4.8 remains an excellent alternative at half the price.

---

Sources: Anthropic official announcement (June 9, 2026), InfoQ, TechCrunch, The Decoder, TokenMix Research Lab, Emergent.sh, TechJackSolutions, KeepingUpWith.ai, Claude API documentation.