OpenAI and Google: Strengthening AI Content Traceability

Key Takeaways

1. OpenAI and Google are collaborating to enhance the traceability of AI content through SynthID, an invisible watermark.

2. C2PA compliance allows platforms to preserve and transmit provenance information for content generated by OpenAI.

3. A new public verification tool will enable the detection of AI-generated content using signals like Content Credentials and SynthID.

💡 Why it matters: Ensuring the traceability of AI content strengthens user trust and protects against misinformation.

Understanding the Provenance of AI Content with OpenAI and Google

As artificial intelligence tools such as those developed by OpenAI are used daily to create and modify images and audio files, the question of where this content comes from has become crucial. Content Credentials and SynthID are two technologies that help answer it. Provenance signals describe the origin of a piece of content, how it was created or modified, and whether it is what it claims to be. This context about how media was made and changed helps strengthen user trust.

OpenAI recently announced several updates in this area as part of a multi-level provenance model. The model aims to strengthen online trust by making provenance signals easier for other tools and platforms to recognize. The updates are threefold: OpenAI has become compliant with the C2PA standard, is adding a durable SynthID watermark to images in partnership with Google, and is offering a preview of a public verification tool for users. They build on OpenAI's previous work to support open standards and facilitate the identification of content generated by its tools.

C2PA Compliance: A Pillar for the Trust Ecosystem

Since 2024, OpenAI has been actively engaged in the development and adoption of provenance standards. The company began by integrating Content Credentials into images generated by DALL·E 3, followed by ImageGen and Sora. By joining the Steering Committee of the Coalition for Content Provenance and Authenticity (C2PA), OpenAI is participating in the development of an open technical standard for content provenance. The C2PA standard uses metadata and cryptographic signatures to ensure that information about a media file travels securely with the content itself. This information is crucial for journalists, platforms, and users, as it provides context about the source and integrity of online content.
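
The binding idea behind signed metadata can be illustrated with a small sketch. This is not the C2PA format: real Content Credentials rely on certificate-based signatures and a structured manifest, whereas the example below substitutes an HMAC with a shared demo key from Python's standard library. The names `make_manifest`, `verify_manifest`, and `DEMO_KEY` are hypothetical. The sketch only shows why a signed record that includes a hash of the media lets a reader detect changes to either the content or the claims.

```python
import hashlib
import hmac
import json

# Hypothetical demo key. Assumption: real C2PA signing uses
# certificate-based signatures, not a shared secret like this one.
DEMO_KEY = b"demo-signing-key"


def make_manifest(content: bytes, claims: dict) -> dict:
    """Build a toy provenance record bound to the content by its hash."""
    payload = {
        "claims": claims,
        "content_sha256": hashlib.sha256(content).hexdigest(),
    }
    serialized = json.dumps(payload, sort_keys=True).encode("utf-8")
    signature = hmac.new(DEMO_KEY, serialized, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_manifest(content: bytes, manifest: dict) -> bool:
    """Return True only if the signature is valid and the hash matches the content."""
    serialized = json.dumps(manifest["payload"], sort_keys=True).encode("utf-8")
    expected = hmac.new(DEMO_KEY, serialized, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False  # the claims were altered after signing
    actual_hash = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(actual_hash, manifest["payload"]["content_sha256"])


image = b"example image bytes"
manifest = make_manifest(image, {"generator": "example-tool"})
print(verify_manifest(image, manifest))            # unchanged content
print(verify_manifest(image + b"edit", manifest))  # modified content
```

In this toy model the check fails if either the media bytes or the signed claims change. It also makes the limitation of metadata visible: if the manifest is dropped entirely, there is nothing left to verify.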

OpenAI recently reached an important milestone by becoming C2PA compliant. This compliance allows platforms to read, preserve, and transmit the provenance information attached to content generated by OpenAI. This matters because provenance only works if it survives beyond the first platform where the content is created. C2PA compliance supports this continuity, helping provenance information remain intact and accessible as content moves between platforms.

The Multi-Level Approach with Google’s SynthID

C2PA metadata is essential for provenance, but it is not infallible. It can be removed or altered during transformations such as file format changes or screenshots. To enhance the resilience of provenance, OpenAI adopts a multi-level approach by integrating Google DeepMind's SynthID watermark. This invisible watermark is applied to images generated by ChatGPT, Codex, or the OpenAI API. SynthID complements C2PA metadata by providing an additional layer of protection that is more resistant to transformations.

OpenAI has been working toward this integration for some time, having already used visible watermarks in Sora and an audio watermark in Voice Engine. The two systems complement each other: C2PA metadata carries more detailed context than a watermark alone can, while the SynthID watermark is more likely to survive transformations such as screenshots. Together, they make provenance more robust than either method on its own.

A Public Verification Tool to Detect AI Content

For provenance signals to be useful, it is essential that users can detect them. OpenAI is offering a preview of a public verification tool that will allow users to check if an image was generated by its tools by detecting the presence of Content Credentials and SynthID. This tool aims to simplify the verification and interpretation of provenance by integrating multiple signals to answer the question: "Was this content generated by AI?"

This tool builds on insights gained from the first research preview of OpenAI's image detection classifier in 2024. Because detection is not infallible, OpenAI takes a cautious approach and avoids definitive conclusions in the absence of signals. If no metadata or watermark is detected, the tool will not conclude that the image was not generated by OpenAI. Initially limited to content generated by OpenAI, the tool may eventually expand to other platforms and types of content.
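
The cautious decision rule described here can be sketched in a few lines. This is a hypothetical illustration, not OpenAI's tool or its API: the names `Signals`, `Verdict`, and `assess` are assumptions, and the two boolean inputs stand in for whatever detectors a real verifier would run.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SIGNAL_DETECTED = "signal_detected"
    INCONCLUSIVE = "inconclusive"
    # Deliberately no "not AI-generated" verdict: missing signals prove nothing.


@dataclass(frozen=True)
class Signals:
    content_credentials: bool  # valid provenance metadata was found
    watermark: bool            # an invisible watermark was detected


def assess(signals: Signals) -> Verdict:
    """Combine provenance signals without treating absence as evidence."""
    if signals.content_credentials or signals.watermark:
        return Verdict.SIGNAL_DETECTED
    # Metadata may have been stripped and a watermark may have been missed,
    # so the only honest answer here is "unknown".
    return Verdict.INCONCLUSIVE


# A screenshot that lost its metadata but kept the watermark.
print(assess(Signals(content_credentials=False, watermark=True)).value)
# An image with neither signal.
print(assess(Signals(content_credentials=False, watermark=False)).value)
```

Either signal is enough for a positive result, which reflects the multi-level idea, while the absence of both yields only an inconclusive result rather than a negative one.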

Towards an Interoperable Provenance Future

No single provenance technique is sufficient on its own. OpenAI is banking on a combination of shared standards, durable watermarks, and public verification to build a more interoperable provenance ecosystem. By strengthening Content Credentials, adopting C2PA compliance, integrating SynthID, and developing public verification tools, OpenAI hopes to contribute over the long term to an information ecosystem in which users can trust the origin and integrity of the content they encounter online.