BBC Research & Development

Posted by Chris Pike

This week the 148th Convention of the Audio Engineering Society (AES) is being held online, due to the COVID-19 pandemic. These events happen twice a year, bringing together industry professionals and academics working on the latest developments in audio engineering. Our audio team in the Immersive and Interactive Content section of BBC Research & Development regularly attends these events to share the latest work we've been doing and to learn about exciting work from others. This week our team is presenting seven papers. I think it's a great set of work that captures the breadth of what we're doing, so I thought I'd give an overview in this post.

Investigating user interface preferences for controlling background-foreground balance on connected TVs

Lawrence Pardoe, Hannah Clawson, Lauren Ward, Aimée Moulson, and Chris Pike

The paper is available from the AES E-Library and also as BBC White Paper 385.

This is part of our work on "next generation audio" systems, which will allow personalisation of the audio in our programmes in future by using an object-based approach. One of the key use cases is to improve the understanding of speech in our programmes, for example when a listener has a hearing impairment or is in a loud environment, by allowing them to adjust the relative level of the speech and the other sounds in the mix. We have tested this concept previously, most recently in last year's Casualty Accessible & Enhanced episode on the BBC Taster website.
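To make the object-based idea concrete, here is a minimal sketch (my illustration, not the BBC renderer): each audio object carries a foreground/background label, and a single balance control trades the gains of the two groups before they are summed.

```python
import numpy as np

def apply_balance(objects, balance):
    """Mix audio objects with a foreground/background balance control.

    objects: list of (samples, is_foreground) pairs, where samples is a
             1-D numpy array and is_foreground marks speech/dialogue.
    balance: -1.0 (background only) .. 0.0 (original mix) .. +1.0 (foreground only)
    """
    fg_gain = 1.0 if balance >= 0 else 1.0 + balance  # duck speech when balance < 0
    bg_gain = 1.0 if balance <= 0 else 1.0 - balance  # duck background when balance > 0
    mix = np.zeros(max(len(s) for s, _ in objects))
    for samples, is_foreground in objects:
        mix[:len(samples)] += (fg_gain if is_foreground else bg_gain) * samples
    return mix

# e.g. make dialogue clearer by pulling the background down 60%:
# clearer = apply_balance([(speech, True), (ambience, False)], balance=0.6)
```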


For this paper, we investigated the user experience of controlling audio personalisation while watching television. We ran two studies. The first was a paper prototyping exercise with 11 participants (5 of whom had mild-to-moderate hearing impairment) to broadly understand the most desirable features of object-based audio. The second study assessed the preferred level of control granularity for a foreground-background control with a group of 18 normal-hearing participants. Participants tended to prefer the more granular controls, with the graduated slider significantly preferred for the documentary and drama content presented. The qualitative data highlights that ease of use and clarity of purpose in controls are vital. When the situation around COVID-19 allows, we plan to extend this work by running the second study with hard-of-hearing listeners and by investigating the user experience on mobile and web platforms too.

A mixed-methods evaluation of preferences between binaural and stereo broadcast audio with experienced and inexperienced listeners

Alice Foster, Chris Pike, and Jon Francombe

Available from the AES E-Library and also as BBC White Paper 373.

We've been working on binaural audio for several years, aiming to give headphone users a more immersive 3D listening experience. In this time we have helped the BBC to make over 150 programmes available with binaural audio. For a recent example, try this episode of Slow Radio, where producer Hugh Huddie made a binaural recording through the night in a Suffolk woodland. This year, we also launched a set of training resources with the BBC Academy to enable BBC producers to create binaural content without our direct support, including an introduction to spatial audio.
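Under the hood, binaural rendering typically works by convolving each sound source with a pair of head-related impulse responses (HRIRs) measured for its direction. A minimal static sketch, assuming the HRIRs are already loaded as numpy arrays (illustrative only, not the renderer we use in production):

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(source, hrir_left, hrir_right):
    """Place a mono source at the direction the HRIR pair was measured for.

    source, hrir_left, hrir_right: 1-D numpy arrays at the same sample rate.
    Returns a (samples, 2) stereo array for headphone playback.
    """
    left = fftconvolve(source, hrir_left)
    right = fftconvolve(source, hrir_right)
    return np.stack([left, right], axis=1)

# A full scene sums the binaural renders of all its sources; moving sources
# also require interpolating between HRIRs over time.
```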


Despite keen interest from programme makers and our audiences, formal listening experiments have not consistently shown an improvement in the quality of the listening experience. During her graduate project with our team, Alice ran online listening experiments to help us better understand people's preferences between binaural and stereo material. Participants compared binaural and stereo versions of a range of programme clips from our archives (sport, music, drama, and documentary) and gave their preference between the two. We also asked them to explain the reasons for their preferences, which has given us more insight into the pros and cons of the binaural listening experience. We ran the experiment with two listener groups: a group of 13 BBC producers with experience of binaural and a group of 20 members of the public who had no particular experience with binaural. The results showed that often there weren't clear preferences for either the stereo or binaural version. However, a thematic analysis of the participants' explanations showed that both listener groups often appreciated the improved spatial impression given by binaural. The experienced producers reported problems with the timbral quality of the binaural material, which the inexperienced listeners did not. Generally, inexperienced listeners used less specific language to describe the differences and more often could not hear any difference between the versions.

Although binaural isn't preferred by all listeners, the results suggest that the spatial enhancements of binaural are clearly appreciated. This confirms our opinion that we should ideally give listeners a choice between binaural and stereo, according to their preference, and we should use binaural where the 3D impression adds value to the storytelling. The findings have also highlighted areas for improvement in the production tools, and we are currently working on a more advanced rendering system.

Audio Device Orchestration

We are presenting three papers about our work on device orchestration, which is the concept of using multiple connected, synchronised devices to create or augment a media experience. It can be used to create interactive and immersive audio experiences for multiple listeners. We've written previously on this blog about our research into device orchestration, and in 2018 we shared an orchestrated experience called The Vostok-K Incident on BBC Taster. These three papers present some of our more recent work on the topic.
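The core engineering problem in orchestration is keeping playback on many devices aligned to a common timeline. A toy sketch of the idea (hypothetical names, greatly simplified compared with real synchronisation protocols):

```python
import time

class SharedTimeline:
    """Toy model of a synchronised playback timeline.

    All devices agree on a wall-clock start time for the experience; each
    device then derives the current media position locally, so playback
    stays aligned without continuous communication.
    """
    def __init__(self, start_wallclock):
        self.start = start_wallclock  # agreed once, via a pairing/sync service

    def media_position(self):
        """Seconds into the programme right now, on this device."""
        return time.time() - self.start

# Each device plays only the audio objects assigned to it, seeking its local
# player to media_position() and correcting small drifts as they accumulate.
```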


Exploring audio device orchestration in workshops with audio professionals

Kristian Hentschel and Jon Francombe

Available from the AES E-Library and as BBC White Paper 375.

There are challenges in producing content using device orchestration because the reproduction system is highly flexible and not known in advance. We ran a series of six co-creation workshops with audio professionals to develop creative ideas and discover workflow issues. The participants developed several working prototypes using a beta version of our orchestrated audio production tool. The paper presents a thematic analysis of the ideas generated, which revealed workflow issues as well as use cases and content proposals. Common themes included gamification and bringing listeners together. The participants also suggested applications for augmenting sports, drama, entertainment, and educational formats.


Exploring audio device orchestration in workshops with young people

Jon Francombe, Kristian Hentschel, and Suzanne Clarke

Available from the AES E-Library and as BBC White Paper 376.

When considering the development of a new technology like device orchestration, it is important to account for the wants and needs of the target audience. This second paper describes two day-long workshops that we ran with 16- to 18-year-olds to explore the concept of audio device orchestration. The workshops used a variety of ideation techniques, including warm-up exercises, idea generation exercises, and co-creation of prototypes. We performed a thematic analysis on the outputs of the workshops to explore the participants' attitudes to audio technology and device orchestration. The results suggested a strong desire for the positive application of technology and content, focusing on issues such as wellbeing and togetherness. The results match closely with previous BBC R&D research on values for digital wellbeing and correlate well with the themes from the workshops with audio professionals.


Understanding users' choices and constraints when positioning loudspeakers in living rooms

Craig Cieciura (University of Surrey), Russell Mason (University of Surrey), Philip Coleman (University of Surrey), and Jon Francombe

Available from the AES E-Library and also as BBC White Paper 374.

Craig is a PhD student at the University of Surrey, and BBC R&D sponsors his project on audio device orchestration through an EPSRC ICASE award. This paper presents a study conducted in participants' homes to investigate how they would choose to enhance their existing audio system by positioning one to eight additional compact wireless loudspeakers. The eleven participants' choices reflected three key themes: creating an arrangement that was spatially balanced and evenly distributed, that maintained the room's aesthetics, and that maintained the room's functionality. In practice, participants prioritised aesthetics and functionality, often at the expense of balance. Craig concluded that a hierarchy of preferred positions exists in each space, as the same positions were reused when positioning differing numbers of loudspeakers, and by different participants in each location. There were consistent patterns across different rooms, which suggests that we can estimate likely loudspeaker positions for a given living-room layout. Following on from this study, Craig is currently investigating how best to assign audio objects to connected devices in an orchestrated system.

Some of our device orchestration team have also recently been using similar technology to implement BBC Together, which lets people watch or listen to BBC on-demand programmes at the same time as family and friends. We are currently preparing our orchestrated audio production tool for release on the BBC MakerBox platform, to allow others outside the BBC to experiment with the audio device orchestration concepts that we have developed. Keep an eye on BBC MakerBox to find out more.


Augmented Reality for DAW-Based Spatial Audio Creation Using Smartphones

Adrià M. Cassorla (University of York), Gavin Kearney (University of York), Andy Hunt (University of York), Hashim Riaz (Abbey Road Studios), Mirek Stiles (Abbey Road Studios), and Damian T. Murphy (University of York)

The paper is available from the AES E-Library.

This paper describes work done by Adrià during his MSc at the University of York, before he joined our team to work on the creative practice of audio device orchestration (in collaboration with York's Audio Lab). The work was a collaboration with Abbey Road Studios. He investigated the use of augmented reality (AR) capabilities on smartphones as a tool for the production of 3D audio. Using a custom iOS app built on the ARKit framework that sends control signals to a digital audio workstation, he compared three methods of positioning sound sources in 3D space using an iPhone. User testing showed that the use of AR to control spatial audio is worth exploring further. However, for many users, practicality and usability are more important than the immersive aspects of the AR tools.
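The paper doesn't specify the control protocol, but Open Sound Control (OSC) is a common way for an external app to drive a DAW. A hypothetical sketch using the python-osc package; the address pattern /source/1/pos is made up and would depend on the DAW or spatial panner being controlled:

```python
from pythonosc.udp_client import SimpleUDPClient

# Send the AR-tracked position of a sound source to a panner in the DAW.
client = SimpleUDPClient("127.0.0.1", 9000)  # DAW host and OSC port (assumed)

def send_source_position(x, y, z):
    """Forward a 3D position (scene coordinates, metres) to the DAW."""
    client.send_message("/source/1/pos", [float(x), float(y), float(z)])

# e.g. called every frame with the phone's tracked position or a raycast hit:
# send_source_position(0.5, 1.2, -2.0)
```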


Content matching for sound-generating objects within a visual scene using a computer vision approach

Daniel Turner (University of York), Damian Murphy (University of York), and Chris Pike

The paper is available from the AES E-Library.

Like Adrià, Dan is a PhD student at the University of York's Audio Lab, and BBC R&D sponsors his project on intelligent sound design through an EPSRC ICASE award. This paper presents an initial feasibility study. Production for immersive extended reality (XR) experiences places additional demands on sound design teams; scenes are often complex, with a large number of sound-producing objects that need positioning in 3D space. Dan investigated the use of visual object detection within a simple 2D scene to detect, track, and match content for sound-generating objects. The approach worked successfully for a single moving object, automatically generating appropriate panning data, but there were limitations with the computer vision system for scenes with multiple objects. The system also recommended candidate sound effects to the user, based on matching the labels from the visual object detector to the metadata of a sound effects library.
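As a simple illustration of turning tracking output into panning data (my sketch, not the system from the paper): the horizontal centre of a detected object's bounding box can be mapped to a constant-power stereo pan.

```python
import math

def bbox_to_pan_gains(x_min, x_max, frame_width):
    """Constant-power stereo pan gains from a tracked bounding box.

    The box centre is mapped to a pan angle: left edge of frame -> fully
    left, right edge -> fully right. Returns (left_gain, right_gain).
    """
    centre = (x_min + x_max) / 2.0
    pan = centre / frame_width       # 0.0 (hard left) .. 1.0 (hard right)
    angle = pan * math.pi / 2.0      # 0 .. pi/2 along the constant-power arc
    return math.cos(angle), math.sin(angle)

# e.g. an object centred at x = 960 in a 1920-px frame pans dead centre:
# bbox_to_pan_gains(940, 980, 1920)  ->  (~0.707, ~0.707)
```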

Following the study reported in this paper, Dan is now running surveys and ethnographic studies with professional sound designers to identify challenges in their working processes. He is also investigating the use of machine learning and advanced signal processing to create intelligent sound design tools that bring new creative possibilities to sound designers.

Both Dan and Adrià are supervised by Professor Damian Murphy, head of the Audio Lab at the University of York. Their projects are connected to an exciting £15M initiative, funded as part of the AHRC's Creative Industries Clusters programme, which aims to advance immersive and interactive digital storytelling.

Summary

I hope this gives insight into some of the work that we are doing in the Immersive and Interactive Audio Team. We will publish these papers as open-access BBC White Papers in due course. We will also be updating our project pages on the website in the coming weeks so that they reflect our current work, as they've become a bit outdated.


BBC Taster - Casualty: A&E Audio

BBC R&D - 5 live Football Experiment: What We Learned


Immersive Audio Training and Skills from the BBC Academy including:

Sound Bites - An Immersive Masterclass

Sounds Amazing - audio gurus share tips

This post is part of the Immersive and Interactive Content section
