BBC Research & Development

Posted by Matt Firth and David Marston

The Eurovision Song Contest is one of those high-profile annual live events that demands the highest quality production, delivered using the latest technology. So it felt appropriate to continue this tradition by using the event to experiment with the latest audio delivery technology. Having it hosted in the UK also made it an attractive event for performing some internal technical trials.

Image of Eurovision winner Loreen performing at Eurovision 2023 - Sarah Louise Bennett/EBU

Interactive, personalised, and immersive content

Over the past few years, we have been working with the wider industry to improve audio experiences for our listeners through interaction and personalisation of the audio presentation. For example, you may wish to listen to the narration in a different language, or reduce the level of background music. The experience can also be improved by using immersive audio, which is a step up from surround sound: the audio adapts to the available speakers on the output device and can envelop the listener in sound from all three dimensions.

To enable these features, we need metadata to describe the audio so that it can be correctly delivered and processed at the receiver end. To allow this metadata to be read by all systems in the broadcast chain, it needs to conform to a standard, and the standard used is called the Audio Definition Model (ADM). ADM metadata can drive Next Generation Audio (NGA) encoders such as AC-4 and MPEG-H, allowing delivery of audio to the home with all the features we want.

The challenge of live broadcasting and the S-ADM

ADM metadata is usually carried with the audio, typically in the BW64 file format. This file-based approach is suited to non-live production, where the whole programme is produced in advance and delivered as a complete file. However, for live broadcasting, a file-based approach is not appropriate since we need to generate the ADM metadata in real-time for live distribution. This is where the Serial ADM (S-ADM) comes in. It is a frame-based version of the model, where the metadata is chopped into time-delimited chunks and delivered in sequence alongside the audio.
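
To make the frame-based idea concrete, here is a minimal Python sketch of the chunking. The structures and field names are our own simplification for illustration; real S-ADM frames are XML documents defined by ITU-R BS.2125.

```python
from dataclasses import dataclass

@dataclass
class SadmFrame:
    frame_id: int    # sequence number of the frame
    start: float     # frame start time in seconds
    duration: float  # frame duration in seconds
    metadata: str    # the ADM metadata covering this time span

def frame_metadata(adm_snapshots, frame_duration=0.04):
    """Chop a stream of ADM metadata snapshots into sequential,
    time-delimited frames (e.g. 40 ms each) for live delivery."""
    for i, snapshot in enumerate(adm_snapshots):
        yield SadmFrame(
            frame_id=i,
            start=i * frame_duration,
            duration=frame_duration,
            metadata=snapshot,
        )
```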

Until now, S-ADM has rarely been used in the production and broadcast chain because the supporting technology is not yet fully developed. The aim of this trial was therefore to use S-ADM in a live scenario and to test interoperability with NGA encoders and production tools.

The test chain

We trialled the live production of audio using raw audio feeds from Eurovision in Liverpool, all the way through to domestic output devices in our lab. We performed the trial in our immersive listening room in Salford, and getting audio feeds from Liverpool was possible (but not trivial!).

The diagram shows audio being sent from the venue's mixing desk in Liverpool via London to Media City in Salford. We set up everything else in the diagram in our Media City lab. Dashed boxes show whether BBC, L-Acoustics, or Dolby technology was in use.

A diagram showing the test chain, explained in detail below.

The test setup

The audio was received in the listening room over MADI (Multichannel Audio Digital Interface) and consisted of 62 channels carrying a variety of signals from the mixing desk at the venue. These included the music, presenters on stage, pre-recorded effects and an array of audience microphone feeds to give atmosphere. We generated S-ADM metadata to identify the purpose of each feed (or channel) and how to render it in the final user experience. This included gain information to describe how loud each feed should be in the mix, and position information to describe where each feed should be perceptually placed in the mix. This S-ADM metadata is then carried to an encoder as data in an additional audio channel.
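
As an illustration of the kind of information this metadata carries, the sketch below describes a few feeds; the channel numbers, purposes, gains and positions are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class FeedDescription:
    channel: int      # input channel number (1-62)
    purpose: str      # what the feed contains
    gain: float       # linear gain of the feed in the mix
    position: tuple   # (azimuth deg, elevation deg, distance)

feeds = [
    FeedDescription(1, "music left", 1.0, (-30.0, 0.0, 1.0)),
    FeedDescription(2, "music right", 1.0, (30.0, 0.0, 1.0)),
    FeedDescription(3, "stage presenter", 0.8, (0.0, 0.0, 1.0)),
    FeedDescription(4, "audience mic", 0.5, (-110.0, 30.0, 1.0)),
    # ... one entry per feed, up to channel 62
]
```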

The NGA emission encoder delivers the content to consumer devices. Our 62 channels of audio with S-ADM metadata are too many for an NGA emission encoder, which supports a more limited number of channels. Therefore, the channels had to be 'squeezed' down to no more than 15 audio channels, along with S-ADM metadata containing the correct information for the emission encoder. We developed a squeezing processor to do this.
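
The sketch below shows the squeezing idea only, not our actual processor: most objects are rendered into a fixed channel bed, while a few interactive objects (such as commentary) pass through untouched. The render_to_bed panner is a placeholder; a real implementation might use VBAP or the EAR.

```python
import numpy as np
from dataclasses import dataclass

BED_CHANNELS = 10  # e.g. a 5.1.4 bed
MAX_OUTPUT = 15    # emission encoder channel limit

@dataclass
class AudioObject:
    obj_id: int
    position: tuple    # (azimuth deg, elevation deg, distance)
    audio: np.ndarray  # 1-D array of samples

def render_to_bed(position):
    """Placeholder panner mapping a position to per-bed-channel gains."""
    gains = np.zeros(BED_CHANNELS)
    gains[0] = 1.0  # crude stand-in: everything to the first channel
    return gains

def squeeze(objects, passthrough_ids, num_samples):
    bed = np.zeros((BED_CHANNELS, num_samples))
    kept = []
    for obj in objects:
        if obj.obj_id in passthrough_ids:
            kept.append(obj.audio)  # e.g. commentary, kept selectable
        else:
            gains = render_to_bed(obj.position)
            bed += gains[:, None] * obj.audio[None, :]
    out = np.vstack([bed] + [a[None, :] for a in kept])
    assert out.shape[0] <= MAX_OUTPUT
    return out
```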

The squeezed-down audio and S-ADM metadata were then sent to the Dolby encoding and playout system, where the content was AC-4 encoded before being decoded by the consumer devices in our lab.

Production using ADM-OSC

The production from the 62 channels of audio was handled by L-Acoustics' L-ISA Controller and Processor software. The L-ISA Controller allows a sound producer to set position and gain parameters for each of the input audio channels (we call these audio objects) with a straightforward user interface. The Controller outputs real-time positional metadata using the Open Sound Control (OSC) protocol. We used the ADM-OSC flavour of OSC, which carries ADM-compatible messages for easy interfacing with ADM tools.

A screenshot of the L-ISA controller software, showing the placement of audio channels.
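
For illustration, here is a minimal sketch of sending such messages with the python-osc library. The /adm/obj/... address patterns follow the public ADM-OSC specification; the host, port and parameter values are invented for the example.

```python
from pythonosc.udp_client import SimpleUDPClient

# Placeholder destination for the renderer/processor.
client = SimpleUDPClient("127.0.0.1", 9001)

def send_object(obj_id, azim, elev, dist, gain):
    """Send one object's polar position (degrees) and gain."""
    client.send_message(f"/adm/obj/{obj_id}/azim", float(azim))
    client.send_message(f"/adm/obj/{obj_id}/elev", float(elev))
    client.send_message(f"/adm/obj/{obj_id}/dist", float(dist))
    client.send_message(f"/adm/obj/{obj_id}/gain", float(gain))

# e.g. place object 3 slightly left of and above the listener
send_object(3, azim=-20.0, elev=15.0, dist=1.0, gain=0.8)
```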

The L-ISA Processor receives the input audio and the ADM-OSC metadata, and can render the audio to suit a specified loudspeaker layout. This allows the producer to monitor the mix in fully immersive audio and adjust the 3D position of the audio objects in real time.

As well as the L-ISA setup, we also used our Live Binaural Production Tool. We originally built this for binaural productions of the BBC Proms and adapted it to output ADM-OSC for this trial. Like L-ISA, it allows audio objects to be positioned in 3D space.

A screenshot of BBC R&D's live binaural tool, showing the placement of audio channels.

Either tool can connect to our S-ADM generator and squeezer, showing that any software that generates ADM-OSC can be used in the production set-up. We can also render the audio to the array of speakers in our listening room via the EAR (EBU ADM Renderer), using some additional software we developed to accept ADM-OSC into the EAR.
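
The receiving side can be sketched in the same way. Our actual ADM-OSC-to-EAR bridge is more involved; the port and handler below are illustrative only.

```python
from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer

positions = {}  # obj_id -> latest {"azim": ..., "elev": ..., "dist": ...}

def on_message(address, *args):
    parts = address.strip("/").split("/")  # e.g. ["adm", "obj", "3", "azim"]
    if len(parts) == 4 and parts[:2] == ["adm", "obj"]:
        positions.setdefault(int(parts[2]), {})[parts[3]] = args[0]
        # a real bridge would push this update into the renderer here

dispatcher = Dispatcher()
dispatcher.set_default_handler(on_message)
BlockingOSCUDPServer(("0.0.0.0", 9001), dispatcher).serve_forever()
```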

Interfacing with the Dolby AC-4 emission system

We needed to send our S-ADM metadata and 15 channels of audio to the Dolby AC-4 encoder, so a method of carrying that signal was required. The standards body SMPTE has been working on several different methods of carrying audio, video and metadata. For this trial, we used the ST337 and ST2116 standards, which allow S-ADM metadata to be carried as data in an audio channel. In our setup, the first 15 channels of our output carried audio, with the 16th carrying the compressed S-ADM metadata.
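
A rough sketch of wrapping one frame of S-ADM in an ST337 data burst is shown below. The Pa/Pb sync words come from the standard, but the data type code is a placeholder (the real value is assigned by ST2116), and gzip is shown as an illustrative compression choice.

```python
import gzip
import struct

PA, PB = 0xF872, 0x4E1F  # ST337 sync words for 16-bit mode
DATA_TYPE_SADM = 0x00    # placeholder; see ST2116 for the real code

def make_burst(sadm_xml: bytes) -> bytes:
    """Wrap one S-ADM frame as an ST337 data burst for an audio channel."""
    payload = gzip.compress(sadm_xml)  # compressed S-ADM, as in our chain
    if len(payload) % 2:
        payload += b"\x00"             # pad to whole 16-bit words
    pc = DATA_TYPE_SADM                # burst info word
    pd = len(payload) * 8              # payload length in bits
    return struct.pack(">4H", PA, PB, pc, pd) + payload
```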

An Ateme Titan Live encoder converted the ST337 input into a Dolby AC-4 bitstream, which was then delivered in a DVB-T transport stream for over-the-air TV reception. It was also sent via a DASH server (Dynamic Adaptive Streaming over HTTP) for internet-based delivery to both a smartphone and to a television supporting HbbTV (Hybrid broadcast broadband TV) in our lab.

This interfacing between our S-ADM generator and Dolby encoder system had never been tried before, so it was a pleasant surprise when it worked first time!

The presented programme

Among the 62 audio channels fed into the lab were the BBC One and Radio 2 commentary feeds. This enabled us to provide the option to choose between the commentaries using the user interface on the TV or a smartphone. The rest of the audio (including the music, pre-recorded videos and audience noise) was mixed by the squeezer to a 5.1.4 channel layout, providing an immersive experience on capable devices. Since the television only had a basic pair of built-in speakers, it rendered the programme in stereo. However, by hooking up an Atmos-enabled soundbar to the television, we enjoyed a more enveloping experience, demonstrating how the content can adapt to different devices.

The on-screen display of a TV showing the surround sound options available to the viewer during the Eurovision trial.
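
Conceptually, the decoder-side choice can be pictured like the sketch below. This is an illustrative model only, not actual AC-4 bitstream syntax.

```python
BED = "5.1.4 music and atmosphere bed"

# Each presentation pairs the shared bed with one commentary object.
presentations = {
    "bbc_one": {"bed": BED, "objects": ["BBC One commentary"]},
    "radio_2": {"bed": BED, "objects": ["Radio 2 commentary"]},
}

def decode(presentation_id, speaker_layout):
    """Pick the requested commentary, then render to the speakers
    that are actually available (stereo TV, Atmos soundbar, ...)."""
    p = presentations[presentation_id]
    print(f"Rendering {p['bed']} + {p['objects']} to {speaker_layout}")

decode("radio_2", "2.0")    # the TV's built-in stereo speakers
decode("radio_2", "5.1.4")  # an Atmos-enabled soundbar
```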

Conclusions and further work

These tests show that S-ADM can be used in live productions, enabling the use of NGA for broadcasting and providing listeners with an enhanced experience. One of our main aims was to test interoperability between different systems, and this was successful.

As we captured a lot of audio from the contest, we can repeat the tests with other systems, in particular the MPEG-H NGA codec, as well as S-ADM over IP using ST2110-41 (another SMPTE standard for delivering metadata). We also aim to develop the tools that generate the S-ADM metadata and perform the squeezing processes.
