Talkie-1930: The Revolutionary Retro Language Model

A Language Model Inspired by the Past
The base model, talkie-1930-13b-base, is a 53.1 GB download with 13 billion parameters. It was trained on a corpus of 260 billion tokens consisting exclusively of historical English-language texts published before 1931. The approach explores what a language model can do when it learns only from data that is out of copyright.
A second model, talkie-1930-13b-it (26.6 GB), is a fine-tuned version of the base model. The fine-tuning uses a new dataset of instruction-response pairs extracted from reference works published before 1931. This model is designed to power a chat interface.
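The two file sizes quoted above can be sanity-checked with a little arithmetic. The sketch below assumes a round 13 billion parameters and decimal gigabytes; neither the exact parameter count nor the storage precision is stated in the text, so treat the result as a rough estimate.

```python
# Back-of-the-envelope check: average bytes stored per parameter.
# Assumptions (not from the article): exactly 13e9 parameters, 1 GB = 1e9 bytes.

def bytes_per_parameter(size_gb: float, n_params: float) -> float:
    """Return the average number of bytes stored per parameter."""
    return size_gb * 1e9 / n_params

N_PARAMS = 13e9

base = bytes_per_parameter(53.1, N_PARAMS)  # talkie-1930-13b-base
chat = bytes_per_parameter(26.6, N_PARAMS)  # talkie-1930-13b-it

print(f"base: {base:.2f} bytes/param")
print(f"chat: {chat:.2f} bytes/param")
```

The results come out close to 4 and 2 bytes per parameter. That would be consistent with the base weights being stored as 32-bit values and the chat weights as 16-bit values, though this is an inference from the sizes alone.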
License and Data Access
Both models are released under the Apache 2.0 license, which permits free use, modification, and redistribution. The base model's training data should be entirely in the public domain: in the United States, works published before January 1, 1931 are now out of copyright, so this data can be used without copyright restrictions. Hopefully the creators of talkie will also publish the training data itself, which would be a valuable resource for researchers and developers.
Research Goals and Challenges
The report accompanying these models sets out several research goals. One is to measure how well a model can anticipate future events: the researchers assessed how surprised a 13-billion-parameter model trained on pre-1931 texts is when shown descriptions of historical events.
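One common way to quantify a language model's "surprise" is surprisal: the negative log-probability the model assigns to each token of a text, averaged over the text. The sketch below illustrates that idea only. The article does not say which metric the report uses, and the probabilities here are made-up values rather than model outputs.

```python
import math

def mean_surprisal_bits(token_probs):
    """Average surprisal, in bits per token, of a text under a model.

    token_probs: the probability the model assigned to each actual token.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    return sum(-math.log2(p) for p in token_probs) / len(token_probs)

# Hypothetical probabilities, for illustration only.
expected_event = [0.5, 0.25, 0.5, 0.25]
surprising_event = [0.125, 0.0625, 0.125, 0.0625]

print(mean_surprisal_bits(expected_event))    # 1.5
print(mean_surprisal_bits(surprising_event))  # 3.5
```

A higher value means the model found the description less predictable.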
Another question is whether such models can come up with ideas that go beyond their training data. Demis Hassabis has famously asked whether a model trained on data up to 1911 could independently discover general relativity, as Einstein did in 1915.
Teaching and Programming
Can such models learn to program? The report tests whether models trained on pre-1931 texts can write new, correct Python programs after being shown a few examples. Figure 3 of the report shows an early result from this kind of test.
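A few-shot programming test of this kind could be set up roughly as follows. This is a minimal sketch, not the report's harness: the prompt layout, the helper names `build_few_shot_prompt` and `passes_tests`, and the example tasks are all assumptions, and the "completion" is a hard-coded stand-in rather than real model output.

```python
def build_few_shot_prompt(examples, task):
    """Concatenate (description, solution) pairs, then the new task."""
    parts = []
    for description, solution in examples:
        parts.append(f"# Task: {description}\n{solution}\n")
    parts.append(f"# Task: {task}\n")
    return "\n".join(parts)

def passes_tests(program_source, function_name, cases):
    """Run a candidate program and check it against (args, expected) cases."""
    namespace = {}
    try:
        # exec is only acceptable here for trusted or sandboxed code.
        exec(program_source, namespace)
        func = namespace[function_name]
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        return False

examples = [
    ("add two numbers", "def add(a, b):\n    return a + b"),
    ("double a number", "def double(x):\n    return 2 * x"),
]
prompt = build_few_shot_prompt(examples, "square a number")

# Stand-in for a model completion; no real model is called here.
candidate = "def square(x):\n    return x * x"
print(passes_tests(candidate, "square", [((3,), 9), ((-2,), 4)]))
```

Scoring by running test cases, rather than comparing text, means any correct program counts, however the model chooses to write it.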
Vegan Models and Data Ethics
"Vegan" models, meaning models trained entirely on licensed or public domain data, are a topic of ongoing interest. The talkie base model appears to qualify. The chat model does not quite, because its fine-tuning relied on non-vegan models.
Data Generation and Optimization
To refine the model, instruction-response pairs were generated from structured historical texts, such as etiquette manuals, cookbooks, and encyclopedias. The base model was then fine-tuned on this data using a simple chat format.
To improve instruction following, synthetic prompts were created covering tasks such as summarizing documents and answering requests for information. Online direct preference optimization (DPO) was then applied, with Claude Sonnet 4.6 serving as the judge of the model's outputs.
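For readers unfamiliar with direct preference optimization, the standard DPO loss for one preference pair can be written in a few lines. This is a generic sketch of the published DPO objective, not talkie's training code; the function name and the value `beta=0.1` are illustrative assumptions. In an online setup, the "chosen" and "rejected" responses would be sampled from the current model during training and ranked by the judge.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full response
    under the policy being trained or under the frozen reference model.
    beta=0.1 is an illustrative value, not taken from the article.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), written in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# The loss is lower when the policy favors the chosen response
# more strongly than the reference model does.
print(dpo_loss(-5.0, -15.0, -10.0, -10.0) < dpo_loss(-15.0, -5.0, -10.0, -10.0))
```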
Fine-Tuning and Technical Challenges
A further round of supervised fine-tuning was then run on synthetic multi-turn chats between Claude Opus 4.6 and talkie, selected by rejection sampling. The goal was to smooth out remaining rough edges in the model's conversational abilities.
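Rejection sampling, in its simplest form, means generating several candidates and keeping only those that pass a quality check. The sketch below shows one common best-of-n variant with a threshold. The function names, the threshold, and the toy scoring rule are all hypothetical, and the article does not describe how talkie's pipeline scored or filtered its chats.

```python
import random

def rejection_sample(generate, score, n_candidates=8, threshold=0.7):
    """Draw candidates and keep the best one if it clears the threshold.

    generate: callable returning one candidate (hypothetical model call).
    score: callable mapping a candidate to a quality score in [0, 1]
           (hypothetical judge call).
    Returns the best candidate, or None if every candidate is rejected.
    """
    candidates = [generate() for _ in range(n_candidates)]
    best = max(candidates, key=score)
    return best if score(best) >= threshold else None

# Toy stand-ins so the sketch runs without any model.
rng = random.Random(0)
replies = ["Pray, be seated.", "ok", "How do you do? I trust you are well."]

def toy_generate():
    return rng.choice(replies)

def toy_score(text):
    # Longer replies score higher; purely illustrative.
    return min(len(text) / 30, 1.0)

print(rejection_sample(toy_generate, toy_score))
```

Conversations that return `None` are simply dropped, so only accepted samples end up in the fine-tuning set.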
A major challenge in training talkie is avoiding contamination: either accidentally including texts from 1931 or later, or introducing anachronistic knowledge through the modern LLMs used in the fine-tuning process.
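The first kind of contamination can be partly addressed with a strict date filter over document metadata. The sketch below is an illustration under assumptions: the `Document` structure and its fields are hypothetical, and the article does not describe how the talkie team actually filtered their corpus.

```python
from dataclasses import dataclass
from typing import Optional

CUTOFF_YEAR = 1931  # keep only texts published before this year

@dataclass
class Document:
    title: str
    year: Optional[int]  # None when the publication year is unknown

def is_era_appropriate(doc: Document) -> bool:
    """Keep a document only if its year is known and before the cutoff."""
    return doc.year is not None and doc.year < CUTOFF_YEAR

corpus = [
    Document("Etiquette manual", 1922),
    Document("Undated pamphlet", None),
    Document("Later reprint with new preface", 1954),
]
kept = [d.title for d in corpus if is_era_appropriate(d)]
print(kept)  # ['Etiquette manual']
```

Dropping undated documents is the conservative choice. A metadata filter alone cannot catch mislabeled dates or modern text embedded in an old work, such as a later preface or editorial notes.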
Towards Total Autonomy
The team behind talkie hopes to overcome these limitations. Reinforcement learning from AI feedback inevitably shapes the model in anachronistic ways when the judge is a modern LLM, so they hope to use their vintage base models as judges instead, for a fully autonomous and era-appropriate post-training pipeline.
Practical Test and Curiosity
As a test, the talkie demo was given a classic prompt: Generate an SVG of a pelican riding a bicycle. The model's reply described an image supposedly dating from 1860, showing a pelican seated on a saddle with its beak pointing forward and its feet on the handlebars, said to be inspired by observations of pelicans fishing while riding along the banks of the Rhine.