OpenAI GPT-5.6 Slowed Down by U.S. Restrictions

Key Takeaways

1. OpenAI has unveiled GPT-5.6 Sol, surpassing Anthropic's Claude Mythos 5 in coding.

2. The availability of GPT-5.6 Sol is hindered by restrictions imposed by the U.S. government.

3. OpenAI has voiced its dissatisfaction with these limitations, which it considers unsustainable.

💡 Why it matters: These restrictions could limit innovation and OpenAI's competitiveness against its international rivals.

OpenAI GPT-5.6 Sol Held Back by U.S. Restrictions

OpenAI's new GPT-5.6 generation includes the flagship model Sol as well as two cheaper tiers, Terra and Luna.

Sol matches or surpasses Anthropic's Claude Mythos 5 in several benchmarks, with a clear advantage in agentic coding and better token efficiency in cybersecurity. Currently, access is restricted to a few selected partners, which OpenAI considers detrimental to developers and businesses.

OpenAI has expressed its frustration: "We do not believe that this type of government access process should become the long-term norm. It deprives users, developers, businesses, cybersecurity defenders, and global partners of the best tools they need."

GPT-5.6 also introduces a new layered naming scheme that closely resembles that of Claude. The number (x.6) indicates the generation, while Sol, Terra, and Luna are permanent performance tiers that can evolve independently. Sol is the flagship model. Terra matches GPT-5.5 at half the price. Luna is the budget option. Additionally, there is a "max" mode for deeper reasoning and an "ultra" mode that distributes complex tasks among parallel sub-agents.

Sol Outpaces Claude Mythos in Agentic Coding

OpenAI's benchmark figures place Sol ahead of Anthropic's Claude Mythos 5 in agentic coding. On Terminal-Bench 2.1, Sol scores 88.8% and Sol Ultra reaches 91.9%, while Claude Mythos 5 stands at 88.0% and Fable 5 at 84.3%.

GPT-5.6 Sol Ultra leads the coding benchmark Terminal-Bench 2.1 with 91.9%. Claude Mythos 5 scores 88.0%, and Google's Gemini 3.1 Pro Preview trails at 70.7%.

Sol also shows advancements in biology. On GeneBench v1, a benchmark for genomics and quantitative biology, it outperforms GPT-5.5 (30% versus 22% at best) while using fewer tokens.

On ExploitBench, which tests AI agents' ability to find and exploit real security vulnerabilities in Google's V8 JavaScript engine up to full code execution, Sol matches the performance of Mythos Preview while using about one-third of the output tokens, according to OpenAI.

On ExploitGym, a benchmark developed by researchers from UC Berkeley in collaboration with OpenAI and other labs, all three GPT-5.6 models improve as the reasoning effort increases, indicating scalability potential with more computing power. Claude's figures for this benchmark are not yet available.

OpenAI describes Sol as its most effective cybersecurity model to date, but presents it as a defender rather than an attacker. According to the company, the model is better at spotting and fixing vulnerabilities than at conducting full end-to-end attacks autonomously. Mythos, by contrast, has managed such an attack in another benchmark.

In tests with Chromium and Firefox, Sol found bugs and exploitation primitives but never produced a complete autonomous exploit. OpenAI states that GPT-5.6 Sol is still below the "Cyber Critical" threshold in its preparedness framework.

Pricing, Availability, and Cerebras Launch in July

For one million tokens, OpenAI charges $5 for input and $30 for output from Sol, $2.50 and $15 for Terra, and $1 and $6 for Luna. The company has also revised its prompt caching system with explicit cache breakpoints and a guaranteed minimum lifespan of 30 minutes. Writes to the cache cost 1.25 times the normal input price. Reads from the cache still benefit from a 90% discount.
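The prices above can be turned into a rough per-request estimate. The sketch below is only an illustration: it assumes a 90% discount means cached reads are billed at 10% of the input price, and that cache writes are billed at 1.25 times the input price in place of the normal input charge for those tokens. The helper name `request_cost` and the token counts in the usage example are hypothetical, and actual billing rules may differ.

```python
# Prices in USD per one million tokens, as quoted in the article.
PRICES = {
    "sol": {"input": 5.00, "output": 30.00},
    "terra": {"input": 2.50, "output": 15.00},
    "luna": {"input": 1.00, "output": 6.00},
}

CACHE_WRITE_MULTIPLIER = 1.25  # cache writes cost 1.25x the input price
CACHE_READ_MULTIPLIER = 0.10   # assumption: 90% discount = 10% of the input price


def request_cost(tier, input_tokens, output_tokens,
                 cache_write_tokens=0, cache_read_tokens=0):
    """Estimate the cost in USD of one request (hypothetical helper)."""
    price = PRICES[tier]
    per_token_in = price["input"] / 1_000_000
    per_token_out = price["output"] / 1_000_000
    return (
        input_tokens * per_token_in
        + cache_write_tokens * per_token_in * CACHE_WRITE_MULTIPLIER
        + cache_read_tokens * per_token_in * CACHE_READ_MULTIPLIER
        + output_tokens * per_token_out
    )


# First call writes a 100,000-token prefix to the cache; later calls read it.
first = request_cost("sol", 2_000, 1_000, cache_write_tokens=100_000)
later = request_cost("sol", 2_000, 1_000, cache_read_tokens=100_000)
print(f"first call: ${first:.3f}, later calls: ${later:.3f}")
```

Under these assumptions, the first call costs about $0.665 and each later call about $0.09, so the cache write premium is recovered as soon as the prefix is reused once.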

Given that Sol uses fewer tokens to match or surpass its competitors in several benchmarks, the effective cost per task could turn out to be lower than that of previous generations. This would counter the trend of rising prices for AI models with each new release, which has drawn frequent criticism lately and is a competitive weakness against cheaper Chinese models.
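A small worked example shows how a higher per-token price can still yield a lower cost per task. Only Sol's prices come from the article; the token counts and the rival model's prices are hypothetical values chosen for illustration, not measurements.

```python
def task_cost(input_price, output_price, input_tokens, output_tokens):
    """Cost in USD of one task; prices are in USD per one million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Sol's prices are from the article; the token counts are hypothetical.
sol = task_cost(5.00, 30.00, input_tokens=50_000, output_tokens=20_000)

# Hypothetical rival: cheaper per token, but needing three times the output tokens.
rival = task_cost(3.00, 15.00, input_tokens=50_000, output_tokens=60_000)

print(f"Sol:   ${sol:.2f} per task")
print(f"Rival: ${rival:.2f} per task")
```

With these illustrative numbers, Sol comes out at $0.85 per task against $1.05 for the rival, even though its output price is twice as high.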

In July, Sol is expected to be operational on Cerebras at speeds of up to 750 tokens per second.
