Hugging Face Streamlines Its Releases with AI and Open Source

Key Takeaways

1. Hugging Face now ships weekly releases through an automated workflow that integrates AI and open-source tools.

2. The process combines automation with human review to keep release notes accurate and relevant.

3. The approach is designed to be adopted by other maintainers, thanks to its transparency and simplicity.

💡 Why it matters: This method could change how software releases are managed, making the process more efficient and more accessible.

A New Era for Hugging Face Releases

Hugging Face, a major player in the field of artificial intelligence, has recently overhauled its software release process. At the center of this change is huggingface_hub, the Python client that serves as the foundation of the company's ecosystem. Libraries such as transformers, datasets, diffusers, and sentence-transformers rely on it to communicate with the Hub.

Until recently, Hugging Face released new versions every four to six weeks. Now, with an automated workflow running on GitHub Actions, releases are weekly. The motivation is simple: every week without a release is a week in which fixes and features sit stalled on the main branch. The change relies on open-source tools and open-weight models, while keeping human intervention wherever judgment is crucial. The goal was a system that other developers could adopt and adapt easily, without vendor contracts or complex infrastructure.

An Initially Manual Process

Hugging Face's previous release process was largely manual, although it included some automation: pushing a tag triggered a publication on PyPI. Everything else was done by hand. That meant creating the release branch, bumping the version in the __init__.py file, committing, tagging, and pushing. The maintainer then had to monitor downstream CI runs and triage failures, read every PR merged since the last release, and write the release notes manually. These notes had to be organized by theme, with appropriate context, and not read like a plain git log. After the release candidate (RC) period, the stable version was published, followed by an internal announcement on Slack and posts on social media. Finally, a post-release PR was opened to bump the main branch to the next dev0 version. Writing the release notes was the most time-consuming part: it meant working through dozens of PRs on unrelated topics, which added up to about half a day of work spread over several days.

Automation and Human Intervention

To streamline this process, Hugging Face separated the work into two kinds of tasks. The purely mechanical steps could be automated: bumping the version, committing, tagging, pushing, opening downstream test branches, and opening the post-release PR. These tasks require no thought and simply have to run in the correct order, which is exactly what a CI workflow is good at. Other tasks, such as writing release notes and drafting announcements, require human judgment. This is where AI comes in, turning a blank page into a solid first draft in seconds. The draft still has to be checked carefully, because an incorrect draft can cause more trouble than no draft at all.
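As an illustration of how mechanical the version bump is, here is a minimal Python sketch. It assumes the version is stored as a `__version__ = "..."` assignment in the package's __init__.py. The helper name `bump_version` and the sample version strings are hypothetical and are not taken from the Hugging Face workflow.

```python
import re

# Assumption: the version lives on a line such as  __version__ = "1.2.0.dev0"
VERSION_LINE = re.compile(r'^__version__\s*=\s*["\']([^"\']+)["\']', re.MULTILINE)


def bump_version(source: str, new_version: str) -> str:
    """Return `source` with its __version__ assignment set to `new_version`."""
    if VERSION_LINE.search(source) is None:
        raise ValueError("no __version__ assignment found")
    # A callable replacement avoids any backslash interpretation in the new text.
    return VERSION_LINE.sub(lambda _: f'__version__ = "{new_version}"', source, count=1)


example = 'import os\n__version__ = "1.2.0.dev0"\n'
bumped = bump_version(example, "1.2.0.rc0")
print(bumped)
```

In a CI job, the same function would be applied to the file contents before the commit, tag, and push steps, so the order of operations is fixed by the workflow rather than by memory.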

An Open and Reusable Design

During the redesign, Hugging Face set a clear constraint: every part of the system had to be accessible to and reusable by any maintainer. No closed models or proprietary platforms. The stack covers orchestration of the release, the agent runtime that drives the model, an open-weight model (currently GLM-5.2 from Z.ai) that writes the release notes and the Slack announcement, HF Inference Providers to serve the model, and publication of the package on PyPI. A key principle is that the model writes, but a human decides. Language models are good at turning PR titles into readable notes, but they should not be trusted blindly. The workflow is therefore supervised: the model produces a first draft, a script checks its work, and a human reviews the result before anything is published.

Overview of the Pipeline

The complete workflow is encapsulated in a single file, .github/workflows/release.yml, manually triggered from the Actions interface. It requires a single input:

workflow_dispatch:
- minor-prerelease

The tasks then proceed in the following order:

Preparation: Computing the next version, creating or reusing the release branch, bumping the version, committing, tagging, and pushing.
Publication on PyPI: Building and uploading huggingface_hub. In parallel, building and uploading the hf CLI as a separate PyPI package.
Release Notes: Determining the commit range since the last tag, extracting PR metadata from the GitHub API, and having the model write a structured changelog. The draft is saved as a draft GitHub release.
Downstream Test Branches: For RCs, opening branches in transformers, datasets, diffusers, and sentence-transformers with the RC pinned, to check quickly whether anything has broken.
Slack Announcement: Reading the notes and producing an internal announcement in the team's tone.
Archiving Notes: Uploading the raw AI draft and the human-edited version to a Hugging Face Bucket.
Post-Release Bump: After a stable release, opening a PR on the main branch to bump it to the next dev0 version.
Commenting on Shipped PRs: Leaving a comment "this has been shipped in vX.Y.Z" on each PR in the release.
Synchronizing CLI Documentation: Opening a PR on the skills repository with regenerated hf CLI skills documentation.
Reporting to Slack: Each step posts its status as a thread reply; a final task updates the root message with ✅ or ❌.
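The version arithmetic behind the preparation and post-release steps can be sketched as follows. This is only an illustration under stated assumptions: the tag format (`v1.2.0`, `v1.3.0.rc0`), the `minor-release` input value, and the function names are hypothetical. Only `minor-prerelease` and the dev0 suffix appear in the description above.

```python
import re

# Assumption: stable tags look like "v1.2.0" and candidates like "v1.3.0.rc0".
TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:\.rc(\d+))?$")


def next_version(latest_tag: str, release_type: str) -> str:
    """Compute the version to release, given the latest tag and the workflow input."""
    match = TAG.match(latest_tag)
    if match is None:
        raise ValueError(f"unrecognized tag: {latest_tag}")
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    rc = match.group(4)

    if release_type == "minor-prerelease":
        if rc is not None:
            # Already inside an RC cycle: publish the next candidate.
            return f"{major}.{minor}.{patch}.rc{int(rc) + 1}"
        # Start a new RC cycle for the next minor version.
        return f"{major}.{minor + 1}.0.rc0"
    if release_type == "minor-release":  # hypothetical input name
        if rc is None:
            raise ValueError("a stable release must promote an existing RC")
        return f"{major}.{minor}.{patch}"
    raise ValueError(f"unknown release type: {release_type}")


def next_dev_version(stable: str) -> str:
    """Version that the main branch moves to after a stable release."""
    major, minor, _patch = stable.split(".")
    return f"{major}.{int(minor) + 1}.0.dev0"


print(next_version("v1.2.0", "minor-prerelease"))
print(next_version("v1.3.0.rc0", "minor-release"))
print(next_dev_version("1.3.0"))
```

Keeping this logic in one small function means the only decision left to the person triggering the workflow is the release type.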

Two steps remain manual and require human intervention: reviewing and publishing the draft release notes, and reviewing and posting the internal message on Slack.

Human Verification: An Essential Element

A major risk with AI-generated release notes is that the model omits PRs or invents them. A nearly correct changelog is worse than no changelog, because it reads as reliable while remaining unverified. Hugging Face therefore does not trust the generated notes to be complete on the first attempt. A Python script retrieves all relevant PRs and stores them as the reference list. The model then writes the notes from these PRs. Once it finishes, the output is checked against the reference list. If PRs are missing or extra PRs appear, the workflow does not publish the incorrect file; the discrepancy is sent back to the agent for correction.
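The check described above amounts to a set comparison. The following Python sketch assumes that the notes cite each PR as `#<number>`; the function name, the sample PR numbers, and the feedback wording are hypothetical and do not come from the actual script.

```python
import re

# Assumption: the draft cites every PR as "#<number>".
PR_REF = re.compile(r"#(\d+)")


def check_coverage(expected_prs: set[int], draft: str) -> tuple[set[int], set[int]]:
    """Return (missing, unexpected) PR numbers for a draft changelog."""
    mentioned = {int(number) for number in PR_REF.findall(draft)}
    missing = expected_prs - mentioned      # merged PRs the draft forgot
    unexpected = mentioned - expected_prs   # PRs the draft cites but that are not in the release
    return missing, unexpected


expected = {101, 102, 103}
draft = (
    "## Fixes\n"
    "- Fix upload retry (#101)\n"
    "- Faster cache lookup (#102)\n"
    "- New flag (#250)\n"
)
missing, unexpected = check_coverage(expected, draft)
if missing or unexpected:
    # In the workflow, a message like this would go back to the agent for another pass.
    feedback = f"Missing PRs: {sorted(missing)}; unexpected PRs: {sorted(unexpected)}"
    print(feedback)  # prints: Missing PRs: [103]; unexpected PRs: [250]
```

The comparison is deliberately independent of the model: the reference set is built before the model writes anything, so the model cannot influence what counts as complete.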

Anchoring the Model to Avoid Errors

Completeness is not enough; the notes also have to be accurate. A model that summarizes a PR from its title alone may invent incorrect code examples. To prevent this, the PR metadata also includes the documentation diff of each PR. The diff is added to the model's context so that any example it cites matches the actual documentation. The prompts are stored as skills, checked into the repository, and explain how the release notes should be structured.
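One way to picture this grounding step is a function that assembles the model's context from PR metadata. The sketch below is an assumption-laden illustration: the `PullRequest` fields, the instruction wording, and the truncation limit are all hypothetical and do not reflect the actual skills or prompts.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    number: int
    title: str
    docs_diff: str = ""  # empty when the PR did not touch the documentation


def build_context(prs: list[PullRequest], max_diff_chars: int = 2000) -> str:
    """Assemble a text block per PR, including its docs diff when one exists."""
    sections = []
    for pr in prs:
        lines = [f"PR #{pr.number}: {pr.title}"]
        if pr.docs_diff:
            lines.append("Documentation diff (take code examples only from here):")
            lines.append(pr.docs_diff[:max_diff_chars])  # keep the context bounded
        else:
            lines.append("No documentation changes: do not include a code example.")
        sections.append("\n".join(lines))
    return "\n\n".join(sections)


context = build_context([
    PullRequest(12, "Add retries to uploads", docs_diff="+ upload(path, retries=3)"),
    PullRequest(13, "Fix typo in error message"),
])
print(context)
```

The point of the design is that the model never needs to guess an API: it either has the documented example in front of it, or it is told not to produce one.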

The Crucial Role of Humans

After the RC is published, the draft GitHub release remains available with the AI's first pass. A reviewer reads the draft, edits it for tone and emphasis, and corrects the model's errors. Only after this review is the minor release triggered, promoting the RC to the final version. The reviewer's time is spent where it matters: half a day of writing becomes a fifteen-minute editing session. Two files are archived each week, the raw AI draft and the human-edited version, so that the two can be compared and the process improved over time.

Security and Transparency

The redesign has also strengthened security, particularly against supply chain attacks. Publication uses Trusted Publishing, which removes the need for long-lived PyPI tokens. PyPI verifies a short-lived OIDC token issued by GitHub for this exact workflow, and each artifact is published with PEP 740 provenance attestations signed through Sigstore. The agent's runtime is pinned and verified, which protects the integrity of the process. Hugging Face has shown that efficiency, security, and transparency can be combined in an automated release process.

Minimal Resource Impact

The cost of this transformation is negligible. A complete release covering 20 to 40 PRs, including the notes and the Slack announcement, requires less than one hour of human work. The time saved lets Hugging Face focus on innovation and on the continuous improvement of its products.