BBC Research & Development

Posted by Laura Ellis

The environment in which our news exists has changed. What used to be secure, immutable, and simple is now open and chaotic. Imagine, if you will, a plate of spaghetti. It might look complex, but each strand of spaghetti is a linear entity. You can see where each element starts and ends. Our flow of news used to be like that. Now, think of a bowl of trifle. A mass of different things all coexisting alongside each other: the jelly, the sponge, the fruit, the custard. Is that a piece of fruit? Or a piece of cake? Is it soaked in sherry? Is that real whipped cream or is it from a can? This is the social media confection into which we distribute some of our most precious content. It's big, it's flamboyant… and it may not always be good for us.

The BBC, and others with an interest in preserving access to reliable news, began to worry about this a few years ago. We joined Microsoft, CBC Radio-Canada, and the New York Times to form Project Origin, a body that aims to secure trust in news through technology. As the project progressed, we found fellow travellers and together set up the Coalition for Content Provenance and Authenticity (C2PA) to work on a set of open standards that would allow content to carry provenance details. The work brings together a range of organisations to develop and use media provenance signals that help audiences determine what they choose to trust.

Think of a piece of content you encounter 'in the wild'. It might possess all the branding and properties you'd expect to see from a media organisation, but it could have been falsified. It could have been completely synthesised, even to the extent of a digitally originated human in a fabricated setting. Or it might be a mix of synthetic and 'real' material, delivered in a context and with an intent set by whoever created it. Many news providers have seen attempts to spoof their content. So how can we, as media organisations, protect ourselves and our content, and help our audiences understand what they're seeing?

There are two main ways a piece of content might tell you about itself:

  • It might contain signals that its originator has put into the content - stating its origin and possibly offering other metadata. This is 'declared' provenance (a short sketch of reading declared metadata follows this list).
  • It might contain artefacts that can be detected or 'inferred'; deepfake detection has always been a somewhat inexact science, but techniques do exist.
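
To make the 'declared' route concrete, here is a minimal sketch of reading the metadata an image file declares about itself. Plain EXIF stands in for richer provenance standards here, and the file name is purely illustrative; real C2PA manifests need dedicated tooling to parse.

```python
# A minimal sketch of reading 'declared' provenance: the metadata a file
# carries about itself. Plain EXIF stands in for richer standards here;
# real C2PA manifests need dedicated tooling to parse.
from PIL import Image
from PIL.ExifTags import TAGS

def read_declared_metadata(path: str) -> dict:
    """Return the EXIF tags an image declares about itself, if any."""
    with Image.open(path) as img:
        return {TAGS.get(tag_id, tag_id): value
                for tag_id, value in img.getexif().items()}

# Fields such as 'Artist', 'Software' or 'DateTime' are set by whoever
# produced the file - informative, but trivially editable.
print(read_declared_metadata("photo.jpg"))  # 'photo.jpg' is illustrative
```

Ordinary metadata like this can be edited by anyone, of course; that is exactly the gap the cryptographic approaches below set out to close.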

Broadly, there are three technologies that can support signalling the provenance of media:

Watermarks began life as visible 'stamps' applied to an image to show ownership - think of the fun-run or wedding photos you were sent, or the little rainbow in the bottom left-hand corner of a DALL-E image. Visible marks like these are relatively easy to edit or crop out. More sophisticated watermarks can now be embedded invisibly (to the human eye) into the pixels of an image - Google's new SynthID is an example of this.
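
As a toy illustration of the invisible kind, the sketch below hides a short message in the least-significant bit of each pixel value. It is a minimal sketch, not a production watermark: unlike systems such as SynthID, it survives neither re-compression nor cropping, and the file names are illustrative.

```python
# A toy least-significant-bit (LSB) watermark: hide a short message in the
# lowest bit of each pixel value. Unlike production watermarks, it survives
# neither re-compression nor cropping - illustration only.
import numpy as np
from PIL import Image

def embed(image_path: str, message: str, out_path: str) -> None:
    """Overwrite the LSB of the first len(message)*8 pixel values."""
    pixels = np.array(Image.open(image_path).convert("RGB"))
    bits = np.array(
        [int(b) for byte in message.encode() for b in f"{byte:08b}"],
        dtype=np.uint8,
    )
    flat = pixels.flatten()
    if bits.size > flat.size:
        raise ValueError("message too long for this image")
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
    # PNG is lossless, so the hidden bits survive saving.
    Image.fromarray(flat.reshape(pixels.shape)).save(out_path, format="PNG")

def extract(image_path: str, n_chars: int) -> str:
    """Read the LSBs back and reassemble them into characters."""
    flat = np.array(Image.open(image_path).convert("RGB")).flatten()
    bits = flat[: n_chars * 8] & 1
    return bytes(
        int("".join(map(str, chunk)), 2) for chunk in bits.reshape(-1, 8)
    ).decode()

embed("original.png", "bbc-rd", "marked.png")
print(extract("marked.png", 6))  # -> 'bbc-rd'
```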

Fingerprints represent the nature of a piece of content in an approximate, more searchable form. One example is YouTube's Content ID, which uses fingerprints to match audio to a database of registered content. If you've ever had a message to say “No, you can't put a Taylor Swift backing track on your holiday video”, this is how that has come about. Much of the digital rights management we encounter on the internet uses this kind of technology.
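
A toy version of fingerprinting is the perceptual 'average hash': shrink the image to a small greyscale grid and record which cells are brighter than the average, so that visually similar images yield similar hashes. This is a minimal sketch of the general idea, not how Content ID actually works, and the file names are illustrative.

```python
# A toy perceptual fingerprint ('average hash'): shrink the image to an
# 8x8 greyscale grid and record which cells are brighter than the mean.
# Visually similar images produce similar hashes, so matching can tolerate
# re-encoding and small edits - unlike an exact cryptographic hash.
from PIL import Image

def average_hash(path: str, size: int = 8) -> int:
    img = Image.open(path).convert("L").resize((size, size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (p > mean)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

# Two renditions of the same picture should differ in only a few bits;
# unrelated images differ in roughly half of the 64.
d = hamming(average_hash("original.jpg"), average_hash("reencoded.jpg"))
print(f"{d} of 64 bits differ")
```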

Cryptographically Hashed Metadata is a less flexible but more secure method of fingerprinting, in that it forms a small, unique representation of the underlying data. If the data changes in any way, even by a single bit, the hash will no longer match the data. Protecting the hash with a cryptographic signature then protects the integrity of the whole file: a verifier can prove the signature vouches for the hash, and check that the data still generates the same hash value. It's the basis of the Content Credentials icon, which has recently been adopted by major brands and industry leaders including Adobe, Microsoft, Publicis Groupe, Leica, Nikon, Truepic, and many more, bringing a new level of recognisable digital content transparency from creation to consumption.
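
That flow can be made concrete in a few lines: hash the content, sign the hash, and later verify both. The sketch below uses SHA-256 with an Ed25519 key from the Python `cryptography` package; it is a minimal sketch of the principle behind Content Credentials, not the actual C2PA manifest format, and the file name is illustrative.

```python
# A minimal sketch of cryptographically hashed metadata: hash the content,
# sign the hash, and verify both later. Illustrates the principle behind
# Content Credentials, not the actual C2PA manifest format.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# The publisher signs the hash of the content at publication time.
content = open("article_image.jpg", "rb").read()  # illustrative file name
private_key = Ed25519PrivateKey.generate()
signature = private_key.sign(sha256(content))
public_key = private_key.public_key()

# A consumer re-hashes what they received and checks the signature.
received = open("article_image.jpg", "rb").read()
try:
    public_key.verify(signature, sha256(received))
    print("hash matches and signature is valid - content is unchanged")
except InvalidSignature:
    print("content or signature has been tampered with")
```

Change even one bit of the received file and the re-computed hash no longer matches what was signed, so verification fails.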

The truth is that effective provenance, in this complex ecosystem, may well draw on a combination of these. Practitioners will want to be proactive in distinguishing their content by ensuring audiences can understand what they're seeing, and as generative AI starts to take hold there's some excellent work on disclosure, such as the 'Responsible Practices for Synthetic Media' framework from the Partnership on AI - work to which the BBC contributed.

There's a strong appetite amongst news providers to protect their content, and there is evidence that adding provenance has a positive impact on trust in our content online. Work done by BBC Research & Development and BBC News to use C2PA signals is now being picked up and developed as a pilot by Origin partner Media City Bergen, which joined the Origin consortium as a member this year. The idea is to build a range of options through which organisations can employ provenance signals - both directly at the point content is published and using functionality offered by manufacturers. Our in-house research team has found evidence that adding provenance to images increases trust in content amongst those who don't typically consume our content. We also found evidence that provenance evens out trust across the range of images we use (editorial, stock and user-generated content).

One of the benefits of using provenance tools will be enhanced transparency. We can not only offer reassurance as to where something has come from, but also share details of how it has been edited or put together. Trust is central to everything we do, and being able to be even more transparent about where content has come from can go a long way towards helping us earn it.
