GPT-5.5 Boosted by 23 Points with a Simple Markdown File

Le brief IA que les pros lisent chaque soir
Les 7 actus IA du jour, décryptées en 5 min. Gratuit.
Inclus dès l'inscription : notre sélection des meilleurs guides & comparatifs IA.
Choisis ton rythme
Gratuit · Pas de spam · Désabonnement en 1 clic
A Markdown File Revolutionizes GPT-5.5
In a surprising experiment, a Markdown file of just 1,400 tokens significantly improved the performance of GPT-5.5. Without altering the model weights, a simple integration of this file into the context window was enough to raise the model's average across six benchmarks from 58.8 to 82.3. This increase of 23.5 points was achieved with a text file accessible via any standard editor.
The Concept Behind SkillOpt
The article then explores the concept of SkillOpt, which is based on the idea of treating a skills document in Markdown as a state that can be trained while keeping the target model unchanged. A more powerful optimization model is used during training to propose limited modifications, such as adding, deleting, or replacing content. These modifications are only accepted if they improve a predefined validation score, drawing inspiration from the principles of stability in gradient descent within the text space.
Impressive Results Across Various Benchmarks
The results of the study are based on 52 different combinations of models, benchmarks, and harnesses. SkillOpt proved to be the best or tied for the best in all these combinations. In particular, GPT-5.5 saw its direct chat performance improve from 58.8 to 82.3 (+23.5 points), with particularly notable gains on procedural and format-verified tasks, such as SpreadsheetBench.
Trained Skills in Action
The author describes the "trained" skills that result from this process. These include rules for checking structure, writing explicitly evaluated values, tracking state in embodied navigation, and anchoring responses to the correct row in a table. Interestingly, these improvements can stem from a few accepted modifications and a relatively small artifact size.
Reproducing the Process with SkillOpt
For those looking to reproduce this workflow, the article provides a practical setup. This includes installing SkillOpt, configuring the backends, running the training loop, and deploying by prefixing the learned Markdown to the model's context.
SkillOpt-Sleep: An Innovative Extension
The article also mentions SkillOpt-Sleep, a plugin-style extension that learns from a user's past transcriptions. This extension utilizes an offline consolidation loop validated by review and adoption, thus offering a new dimension to the model's learning.
Limitations and Perspectives
Finally, two limitations are addressed: the reliance on automatic scoring judges and the fact that optimization focuses on one document at a time. The article concludes by emphasizing that for procedural and verifiable agent tasks, training the document rather than the model proves to be a more reliable and cost-effective optimization method than traditional fine-tuning.
Brief IA — L'actualité IA en français
L'essentiel de l'actualité de l'intelligence artificielle, décrypté et expliqué chaque jour.