Brief IA

Ancestry: AI Revolutionizes Family Archive Digitization

💡 Use Cases · Tom Levy

Key Takeaways
1. Ancestry uses AI to accelerate the digitization of 71 billion family records across 88 countries.
2. Since 2014, the company has been developing machine learning models to enhance transcription and facial recognition.
3. By 2025, over 50% of Ancestry's records will be generated by AI, tripling the content growth rate.
💡 Why it matters: AI enables Ancestry to efficiently manage a massive volume of data, transforming global genealogical research.

Full Analysis

Ancestry and AI: A Partnership for Archival Digitization

Since 2014, Ancestry has built machine learning models into its processes to accelerate the digitization of family archives from 88 countries. The effort, led by Chief Technology Officer Sriram Thiagarajan, includes advances in facial recognition and the transcription of handwritten records. These technologies speed up the processing of historical documents and make the information in them easier for users to reach.

A 42-Year Legacy of Data Collection

Over the past 42 years, Ancestry has amassed more than 71 billion documents, ranging from birth certificates to marriage licenses, from 88 countries. These documents underpin 148 million family trees. Collecting and organizing this data was historically slow, requiring manual entry by employees and third-party vendors. The company's international expansion, which began in 2001 with the launch of a website in the UK, incurred significant costs. According to Sriram Thiagarajan, the time required to digitize these content-rich documents was a major obstacle.

The Impact of AI Under Thiagarajan's Leadership

Since joining as Chief Information Officer in 2017, Thiagarajan has played a key role in integrating machine learning and artificial intelligence at Ancestry. The company's acquisition by Blackstone for $4.7 billion in 2020 marked a turning point, enabling the acceleration of digitization through AI. The technology has also supported new tools for users, including facial recognition and handwriting recognition systems. These tools have changed the way users interact with family archives, making the process faster and more accurate.

The Evolution of AI Models

In 2003, Jackson Reese joined Ancestry to lead digital imaging. At that time, the company had a small imaging department responsible for digitizing various historical documents. Reese quickly expanded his team to over 70 people, using technologies such as microfilm scanners. In 2014, Ancestry began developing its own machine learning and computer vision models to read paper documents. By 2016, the effort had matured enough for the company to build algorithms capable of processing complex documents efficiently.

The Integration of BERT and Model Improvement

With the introduction of BERT by Google in 2018, Ancestry was able to build more accurate data extraction tools. Experts reviewed documents before passing them to indexers for transcription. Ancestry's AI models, trained on this data, aimed for over 90% accuracy. However, several iterations were sometimes necessary to refine the models. This continuous improvement process has allowed Ancestry to optimize its systems to better meet user needs.
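
The review-and-iterate loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ancestry's pipeline: the field names, the sample records, and the helpers `field_accuracy` and `needs_another_iteration` are hypothetical, and the 0.90 threshold simply mirrors the 90% target mentioned in the text.

```python
from typing import Dict, List

# Assumption: mirrors the "over 90%" accuracy goal quoted in the article.
ACCURACY_TARGET = 0.90


def field_accuracy(predicted: List[Dict[str, str]],
                   reference: List[Dict[str, str]]) -> float:
    """Share of human-indexed fields the model transcribed exactly.

    The comparison ignores letter case and surrounding whitespace.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    total = 0
    correct = 0
    for pred, ref in zip(predicted, reference):
        for field, expected in ref.items():
            total += 1
            if pred.get(field, "").strip().lower() == expected.strip().lower():
                correct += 1
    return correct / total if total else 0.0


def needs_another_iteration(accuracy: float,
                            target: float = ACCURACY_TARGET) -> bool:
    """True while the model has not yet exceeded the target accuracy."""
    return accuracy <= target


# Hypothetical records: the reference rows stand in for indexer transcriptions.
reference = [
    {"name": "Mary Smith", "year": "1901"},
    {"name": "John Doe", "year": "1887"},
]
predicted = [
    {"name": "mary smith", "year": "1901"},
    {"name": "John Doe", "year": "1881"},  # one misread digit
]

accuracy = field_accuracy(predicted, reference)
print(f"accuracy={accuracy:.2f}, retrain={needs_another_iteration(accuracy)}")
```

In this toy run, three of four fields match, so the model falls short of the target and would go through another training round.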

The Impact of ChatGPT and New Language Models

The arrival of ChatGPT in 2022 marked a turning point, opening new possibilities for Ancestry. Large language models from OpenAI and other providers have accelerated the digitization of unstructured data. Ancestry now uses a mix of proprietary and open-source models to process nearly 200 languages with minimal iterative training. This multilingual capability is essential for a company operating on a global scale, because it lets the company handle archives from diverse cultural and linguistic contexts.

AI Features for Users

In September 2023, Ancestry integrated large language models for user-facing features. Face Match, a facial recognition tool, helps identify individuals in family photos. This innovative feature provides users with a powerful way to reconnect with their family history by identifying ancestors from old photographs.

An AI-Dominated Future

By the end of 2025, more than 50% of Ancestry's historical records will be generated by AI. AI has tripled the content growth rate: the number of new records rose from 800 million in 2021 to 5.2 billion in 2022 and 18.6 billion the following year. Ancestry continues to innovate with external AI use cases, such as adding language translation to its transcription tool in 2026. These advancements reflect Ancestry's commitment to using AI to enrich the user experience and ease access to family history.
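
To put the growth figures in perspective, the year-over-year multiples can be worked out directly from the numbers quoted above. The short Python sketch below does only that arithmetic; the variable and function names are illustrative, and "the following year" is read as 2023.

```python
# New records added per year, in billions, as quoted in the article.
# Assumption: "the following year" after 2022 is taken to mean 2023.
new_records_billions = {2021: 0.8, 2022: 5.2, 2023: 18.6}


def growth_multiples(series):
    """Return each year's volume divided by the previous year's volume."""
    years = sorted(series)
    return {
        year: series[year] / series[prev]
        for prev, year in zip(years, years[1:])
    }


multiples = growth_multiples(new_records_billions)
for year, multiple in multiples.items():
    print(f"{year}: {multiple:.1f}x the previous year")
```

On these figures, new records grew about 6.5-fold from 2021 to 2022 and about 3.6-fold from 2022 to 2023, so the "tripling" is closest to the most recent step.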
