
Posted by Charlie Halford

Within one week of launch, ChatGPT had reached more than a million users. By comparison, earlier consumer services took months after they first launched to reach that same milestone. Tools like ChatGPT use a technique called generative AI, which can produce new text, images and other media by running a machine learning model trained on billions of existing pieces of content from across the web and elsewhere. It is now possible to input a few lines of descriptive text (a "prompt") and have tools like Stable Diffusion or Midjourney create an image with impressive fidelity and visual style. Many casual observers would not be able to tell that it was generated by AI. Recently, someone created an AI-generated image that many were convinced was genuine.

Théâtre D’opéra Spatial by Jason Allen

AI-generated media impacts our confidence in the authenticity of the media we now consume online. Now, more than ever before, we need to be able to answer the question: "Is this genuine?". Answering this question means knowing where an image came from and what has happened to it after the picture was created. For the past four years, the BBC has been working with partners from across industries to produce an open provenance specification as part of the C2PA (the Coalition for Content Provenance and Authenticity). Its latest version, 1.3, also includes tools to help identify the origin of AI-generated images. Let's dig into how it could help.

Théâtre D’opéra Spatial: An AI-generated example

Jason Allen submitted the AI-generated image above to the 'digital arts' category of the 2022 Colorado State Fair, and won.

The image was generated from a text prompt submitted to Midjourney, and was subsequently retouched in Photoshop and then upscaled in another service called Gigapixel AI. However, without this accompanying explanation, people might assume it was a piece of art created by a human. What the latest edition of the C2PA specification allows us to do is embed, in an image like this one, the details of its provenance and the series of steps the image went through before it was completed and submitted to the Colorado State Fair. The specification does this in a way that is cryptographically verifiable via a digital signature and tied to the image, so that the details of its origin apply only to that exact image: they cannot be hijacked and copied to another image, and the image itself cannot be modified without invalidating the signature. That data can then be made available at the point the image is viewed, giving consumers an immediate signal of where the image comes from. We'll examine how this can happen by looking at the adoption of C2PA in each step.
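To illustrate that binding in miniature, here's a minimal sketch in Python. It isn't the C2PA wire format, and the file name is hypothetical; it just shows why a hash bound to the exact image bytes stops the provenance being transplanted onto another image.

```python
import hashlib

# Minimal sketch of C2PA-style 'hard binding': the signed claim records a
# cryptographic hash of the exact image bytes. The file name is hypothetical.
with open("theatre_dopera_spatial.jpg", "rb") as f:
    image_bytes = f.read()

# This digest would sit inside the claim and be covered by the signature.
bound_hash = hashlib.sha256(image_bytes).hexdigest()

def binding_matches(received_bytes: bytes, signed_digest: str) -> bool:
    """A validator recomputes the hash of the bytes it actually received.
    Any edit to the image, or copying the manifest onto a different image,
    changes the digest and breaks the binding."""
    return hashlib.sha256(received_bytes).hexdigest() == signed_digest
```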

Step 1 - Generate the initial image

In our example, Midjourney was used to generate the initial image. The user provides a 'prompt': a written description of the image they want to generate. We'll use this prompt for the sake of our example: "futuristic figures look through a circular opening to a world outside, and are lit by it, in the style of a realistic oil painting". The actual prompt used by Mr Allen has not been made public.

When Midjourney prepares the images for display to Mr Allen, it could choose to use C2PA to embed some important data about how the image was generated and who generated it, as sketched below:

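This is an illustrative sketch rather than Midjourney's real output: actual C2PA manifests are signed CBOR/JUMBF structures embedded in the image, but we can show the shape of the key assertions as a Python dict. The action label and digital source type come from the C2PA specification; the claim generator string, prompt parameter and signature fields are our assumptions.

```python
# Hedged sketch of the provenance data Midjourney might embed at generation
# time. Real manifests are signed binary structures bound to the image; this
# dict just illustrates the shape of the key assertions.
midjourney_manifest = {
    "claim_generator": "Midjourney",  # assumed generator string
    "assertions": [
        {
            "label": "c2pa.actions",
            "data": {
                "actions": [
                    {
                        # 'c2pa.created' is a standard action in the spec
                        "action": "c2pa.created",
                        # C2PA 1.3 uses IPTC digital source types to flag
                        # AI-generated media
                        "digitalSourceType": (
                            "http://cv.iptc.org/newscodes/digitalsourcetype/"
                            "trainedAlgorithmicMedia"
                        ),
                        # Hypothetical parameter recording the user's prompt
                        "parameters": {
                            "prompt": (
                                "futuristic figures look through a circular "
                                "opening to a world outside, and are lit by "
                                "it, in the style of a realistic oil painting"
                            )
                        },
                    }
                ]
            },
        }
    ],
    # The whole claim would be signed with Midjourney's certificate
    "signature": {"issuer": "Midjourney", "algorithm": "ps256"},
}
```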

Step 2 - Edit in Photoshop

We've now got the critical information we need about where the original image came from: the source was an AI model, and the signature comes from Midjourney. Mr Allen's next step was to retouch the image in Photoshop. Photoshop already supports C2PA through its Content Credentials feature (in beta), and when that is enabled it records the details of the edits. Photoshop reads the C2PA data from step 1, keeps it as part of the image, and links to it as an input to the Photoshop step. This is important, as it preserves the origin and provenance of the first step, allowing consumers to see where the image was originally generated.

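In the same illustrative style (the manifest identifier, generator string and signer are assumptions), the new claim Photoshop appends might look like this, with the step-1 manifest referenced as an 'ingredient':

```python
# Sketch of the manifest Photoshop might append. The step-1 manifest stays
# inside the file; the new claim references it as an ingredient, linking the
# edit back to the original AI generation.
photoshop_manifest = {
    "claim_generator": "Adobe Photoshop",
    "ingredients": [
        {
            # 'parentOf' marks the Midjourney output as the asset being edited
            "relationship": "parentOf",
            # Hypothetical reference to the step-1 manifest by its identifier
            "active_manifest": "urn:uuid:midjourney-claim-id",
        }
    ],
    "assertions": [
        {
            "label": "c2pa.actions",
            "data": {
                "actions": [
                    {"action": "c2pa.opened"},  # opened the Midjourney image
                    {"action": "c2pa.edited"},  # Mr Allen's retouching
                ]
            },
        }
    ],
    "signature": {"issuer": "Adobe", "algorithm": "ps256"},
}
```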

Step 3 - Enhance with Gigapixel

The last step of the creation process was to 'upscale' the image: to give it more detail (increased resolution) than the original. Upscaling is not as simple as downscaling, as the extra detail is not present in the image and must be extrapolated. The tool Mr Allen chose for this was Gigapixel AI, another generative AI service, but this time one that takes an image as input and outputs an upscaled version, with the extra detail inferred from its model.

In a C2PA-enabled version, Gigapixel could choose to record its own edits to the image: this time a 'resize' action performed by an AI model, along with some parameters that could have been chosen by Mr Allen when he ran the process. Here we've added the scaling factor of the image: 6x. Like Photoshop in the previous step, Gigapixel preserves the data from the previous two steps and links to the last one, forming a three-step 'chain' of provenance:

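Once more as a hedged sketch (the generator string, manifest identifier and signer are assumptions; 'c2pa.resized' is a standard action label in the specification):

```python
# Sketch of a third manifest Gigapixel AI could add for the upscaling step.
gigapixel_manifest = {
    "claim_generator": "Topaz Gigapixel AI",
    "ingredients": [
        {
            "relationship": "parentOf",
            # Hypothetical reference to the step-2 (Photoshop) manifest
            "active_manifest": "urn:uuid:photoshop-claim-id",
        }
    ],
    "assertions": [
        {
            "label": "c2pa.actions",
            "data": {
                "actions": [
                    {
                        "action": "c2pa.resized",
                        # The extra detail was inferred by a model, so this
                        # step can be flagged as algorithmic media too
                        "digitalSourceType": (
                            "http://cv.iptc.org/newscodes/digitalsourcetype/"
                            "trainedAlgorithmicMedia"
                        ),
                        # Scaling factor chosen by Mr Allen
                        "parameters": {"scale": "6x"},
                    }
                ]
            },
        }
    ],
    "signature": {"issuer": "Topaz Labs", "algorithm": "ps256"},
}
```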

Step 4 - Show the provenance

Finally, we can show users where the image comes from (and other provenance data). At this point, we have a very rich picture of the editing steps and the original source of the image, but it's hidden away inside the image's metadata and needs to be presented to users. For this, we need what the C2PA specification calls a 'validator': a piece of software that can ensure that all the technical provenance data (including the signatures and the binding to the image) is valid, and that each organisation signing the data has a valid certificate.
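To make that concrete, here is a toy validator in the same illustrative style as the manifests above. All of the helper names and data shapes are assumptions; a real C2PA validator performs COSE signature verification and JUMBF parsing rather than anything this simple.

```python
import hashlib

def verify_signature(claim, signature):
    # Placeholder: a real validator performs COSE signature verification
    # against the certificate carried in the manifest.
    return True

def validate_chain(manifests, image_bytes, trusted_issuers):
    """Walk the provenance chain from oldest to newest, checking that each
    claim is properly signed by a certificate we trust, and that the active
    (newest) manifest is bound to the exact bytes we received."""
    # Hash binding: the active manifest must match these exact image bytes.
    digest = hashlib.sha256(image_bytes).hexdigest()
    if manifests[-1]["bound_hash"] != digest:
        return False, "image bytes do not match the signed hash binding"

    for manifest in manifests:
        if not verify_signature(manifest["claim"], manifest["signature"]):
            return False, f"bad signature from {manifest['signer']}"
        if manifest["signer"] not in trusted_issuers:
            return False, f"untrusted signer: {manifest['signer']}"

    return True, "provenance chain is valid"
```

A display layer would then take the validated chain (Midjourney, then Photoshop, then Gigapixel) and render it for the user.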

The validator is paired with a way to display this data to the user, typically as an overlay on the image, wherever the user views it. At this point, care needs to be taken and experimentation needs to be done to determine exactly how much of this information to show. Too much, and users might ignore it. Too little, and we've not helped them answer the "Is this real?" question.


There are already a number of examples of this kind of user interface available. TruePic, for example, has used the same set of products to produce a provenance display that provides a clear signal to consumers of the origin of the media.


Limitations

C2PA provenance is a technology that can help media consumers differentiate between real and 'fake' images. However, it is not without its limitations. For C2PA to work at scale, it will need widespread adoption, so that media consumers come to expect to see provenance data as part of their images. As more and more people integrate media provenance into their content creation, users should become increasingly confident that they know where a piece of content came from, and more wary when it's missing.

C2PA provenance was designed specifically with openness and choice in mind, and so the specification makes no effort to prevent malicious actors from stripping or removing existing provenance from media. The presence of provenance data can only prove that it was signed by a given person or organisation; it can't prove that a given signer had permission to sign it, or that they are the only party to have signed it. Users will need to develop trust relationships with the organisations doing the signing, to decide whether they believe their statements (or "assertions" in C2PA vocabulary). C2PA cannot prove truth; it is a mechanism for verifying who said what.

As an example of this, you could imagine a generative AI provider continuing to produce images without C2PA provenance, and a malicious person or organisation picking them up, signing the images themselves as the creator, and in the process asserting that the images are genuine. One way of counteracting this is to assume that consumers' trust relationship with that malicious person or organisation would suffer if something like this happened, and that they'd report it. Once the deception was uncovered, the platform that hosted the content would then have extra tools to warn future consumers of the potential trust issue. So, if a social media platform receives reports that a particular creator (or signer, in C2PA terms) has been making false assertions, it could warn users every time that signer posts content, or potentially even restrict or ban that content.

The C2PA specification is ready for use now and is being actively improved. It is deliberately flexible and extensible, and can enable both free and commercial solutions to the limitations above. We really encourage contribution and discussion here!

With adoption continuing to grow, we could soon have an ecosystem where provenance helps reassure users that the content they see comes from the place it purports to and, more importantly, shows whether it has been manipulated in some way. In a new world of AI-generated media, this is one of the most important tools we have in helping users understand who made their content, and how.
