Brief IA

Bankrupt Startups: The Digital Gold of Internal Archives for AI

🤖 Models & LLM · Tom Levy

Key Takeaways
1. Startups in liquidation are selling their digital archives, including emails and Slack messages, for up to $100,000.
2. SimpleClosure and Sunset are capitalizing on this market by turning this data into resources for AI labs.
3. The shortage of public data is driving labs to seek real-world examples of work to train their AIs.
💡 Why it matters: The resale of internal data raises ethical and legal questions about privacy and the rights of former employees.

Full Analysis

A Lucrative Market for Data from Bankrupt Startups

Startups in liquidation are finding a new source of revenue in their digital assets, such as Slack message archives, Jira tickets, and emails. A single archive can fetch up to $100,000, and companies that handle startup wind-downs now package this data as raw material for artificial intelligence labs.

SimpleClosure, a company that manages the dissolution of startups, launched a platform called Asset Hub in April 2026. The platform lets founders license their digital archives, including source code and internal communication histories. According to Dori Yona, CEO of SimpleClosure, this activity represents a true "gold rush." In one year, the company processed nearly one hundred transactions and returned more than one million dollars to founders.

Sunset, a SimpleClosure competitor, operates in the same market at similar prices. It particularly values sector-specific data, especially in health and finance, as well as histories that are well linked across different platforms.

The Growing Demand for Real Data

Demand for real data has grown since the end of 2024, when former OpenAI Chief Scientist Ilya Sutskever highlighted the depletion of publicly available data on the internet. AI agents need concrete examples of work, imperfections and frictions included, to train effectively. Synthetic data tends to be too clean to calibrate models for real-world professional environments.

This situation has given rise to a sector dedicated to "reinforcement learning gyms," simulated environments built from the archives of real companies. Startups like AfterQuery sell these turnkey "worlds" to labs, with environments such as Big Tech World or Finance World. Anthropic, for instance, was considering investing up to one billion dollars in this area, according to Forbes. Other companies, such as Scale AI, Surge, and Mercor, are also entering this market.

Legal and Ethical Issues

Legally, employees generally have no rights over this data. Under Slack's terms of service, the employer, referred to as the "Customer," owns all data produced within the workspace. However, Marc Rotenberg, founder of the Center for AI and Digital Policy, argues that this data is personal and identifiable, and that transferring intellectual property rights does not settle the question of resale to third parties.

Rotenberg's organization has sent a letter to the U.S. Senate asking that the FTC strengthen its oversight of these practices. Companies purchasing this data say they take anonymization seriously, but the process remains technically complex and imperfect. A 2020 study whose authors included researchers from OpenAI and Google showed that large language models can memorize sequences of training data, which can then be extracted with suitable prompts.
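To illustrate why anonymization is hard to get right, here is a minimal Python sketch of pattern-based redaction. It is not any buyer's actual pipeline: the two regular expressions, the `redact` function, and the sample message are all hypothetical and chosen only for illustration. The point is that structured identifiers such as email addresses are easy to strip, while names and contextual details in free text pass through untouched.

```python
import re

# Hypothetical patterns for illustration only; a real pipeline would need far more.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(message: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    message = EMAIL.sub("[EMAIL]", message)
    message = PHONE.sub("[PHONE]", message)
    return message


# Invented sample message, not taken from any real archive.
sample = (
    "Ping jane.doe@example.com or call 415-555-0137. "
    "Jane from billing missed the Q3 deadline again."
)

redacted = redact(sample)
print(redacted)
```

In this sketch the address and phone number are replaced, but "Jane from billing" survives, so the message can still point to a specific person. Catching that kind of reference requires more than pattern matching, which is one reason the process stays imperfect.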

Some companies have already sold their archives. Shanna Johnson, former CEO of cielo24, said she received hundreds of thousands of dollars for thirteen years of her company's internal data. However, Bobby Samuels of Protege notes that there is no technical solution for instantly removing the personal footprint of an entire career from a dataset.
