
An Audio-Visual System for Object-Based Audio: From Recording to Listening

IEEE Transactions on Multimedia, Volume 20, Issue 8

Published: 17 January 2018

Authors

  • Jon Francombe (BMus, PhD), Lead Research & Development Engineer
  • Chris Pike (MEng, PhD), Lead R&D Engineer, Audio
  • Frank Melchior, previous Head of Audio Research and audio research partner

Object-based audio is an emerging representation for audio content, where content is represented in a reproduction-format-agnostic way and, thus, produced once for consumption on many different kinds of devices. This affords new opportunities for immersive, personalized, and interactive listening experiences. This paper introduces an end-to-end object-based spatial audio pipeline, from sound recording to listening. A high-level system architecture is proposed, which includes novel audio-visual interfaces to support object-based capture and listener-tracked rendering, and incorporates a proposed component for objectification, that is, recording content directly into an object-based form. Text-based and extensible metadata enable communication between the system components. An open architecture for object rendering is also proposed. The system's capabilities are evaluated in two parts. First, listener-tracked reproduction of metadata automatically estimated from two moving talkers is evaluated using an objective binaural localization model. Second, object-based scene capture with audio extracted using blind source separation (to remix between two talkers) and beamforming (to remix a recording of a jazz group) is evaluated with perceptually motivated objective and subjective experiments. These experiments demonstrate that the novel components of the system add capabilities beyond the state of the art. Finally, we discuss challenges and future perspectives for object-based audio workflows.
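The central ideas in the abstract, a scene represented as discrete audio objects carrying text-based, extensible metadata that a downstream renderer maps onto whatever reproduction system is available, can be sketched in a few lines of code. The Python below is a hypothetical illustration for this page only: the `AudioObject` fields, the JSON layout, and the constant-power stereo pan are our assumptions, not the metadata schema or the open renderer architecture proposed in the paper (which targets arbitrary loudspeaker layouts and listener-tracked binaural reproduction).

```python
import json
import math
from dataclasses import asdict, dataclass, field

@dataclass
class AudioObject:
    """One sound source in a reproduction-format-agnostic scene (illustrative only)."""
    object_id: str
    azimuth_deg: float            # source direction; negative values pan left here
    elevation_deg: float = 0.0
    distance_m: float = 1.0
    gain: float = 1.0
    extra: dict = field(default_factory=dict)   # extensible key/value metadata

def scene_to_json(objects: list) -> str:
    """Serialise the scene as text so system components can exchange it."""
    return json.dumps({"objects": [asdict(o) for o in objects]}, indent=2)

def render_stereo_gains(obj: AudioObject, width_deg: float = 60.0) -> tuple:
    """Constant-power pan of one object onto a two-loudspeaker layout.

    A full object renderer would address an arbitrary loudspeaker array or
    listener-tracked binaural output; stereo is just the simplest case.
    """
    x = max(-1.0, min(1.0, obj.azimuth_deg / width_deg))  # clamp to the stereo arc
    theta = (x + 1.0) * math.pi / 4.0                     # 0 (hard left) .. pi/2 (hard right)
    return obj.gain * math.cos(theta), obj.gain * math.sin(theta)  # (left, right)

if __name__ == "__main__":
    talker = AudioObject("talker-1", azimuth_deg=-30.0,
                         extra={"label": "dialogue", "interactive": True})
    print(scene_to_json([talker]))
    print("stereo gains (L, R):", render_stereo_gains(talker))
```

Because the scene travels as text, a producer can add new fields (for example, a dialogue-importance flag for personalization) without breaking renderers that simply ignore them.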

Published in: IEEE Transactions on Multimedia (Volume 20, Issue 8, Aug. 2018).

This paper was authored collaboratively with researchers from the S3A project. The full list of authors is: Philip Coleman (University of Surrey), Andreas Franck (University of Southampton), Jon Francombe (University of Surrey), Qingju Liu (University of Surrey), Teo de Campos (University of Surrey), Richard J. Hughes (University of Salford), Dylan Menzies (University of Southampton), Marcos Simón Gálvez (University of Southampton), Yan Tang (University of Salford), James Woodcock (University of Salford), Philip J. B. Jackson (University of Surrey), Frank Melchior (BBC R&D), Chris Pike (BBC R&D), Filippo M. Fazi (University of Southampton), Trevor J. Cox (University of Salford), and Adrian Hilton (University of Surrey).

White Paper copyright

© BBC. All rights reserved. Except as provided below, no part of a White Paper may be reproduced in any material form (including photocopying or storing it in any medium by electronic means) without the prior written permission of BBC Research except in accordance with the provisions of the (UK) Copyright, Designs and Patents Act 1988.

The BBC grants permission to individuals and organisations to make copies of any White Paper as a complete document (including the copyright notice) for their own internal use. No copies may be published, distributed or made available to third parties whether by paper, electronic or other means without the BBC's prior written permission.
