
Roar To Explore

Voice Classification for Children's Content Navigation

Published: 1 January 2011

Allowing young children to explore natural history content by making animal sounds

What we are doing

We developed prototype software for a user-experience concept that allows young children to explore online content by making sounds with their voices.

Why it matters

Very young children are unable to navigate through BBC content online without the help of adults. Speech recognition technologies are improving and can provide an accessible control interface for those who cannot use a keyboard, but these systems do not work well for young children. Roar To Explore allows a child to find information about an animal by making the sound of that animal; for example, to get pictures and videos of lions, the child roars into the microphone.

Our Goals

The objective was to build a prototype that proved the technical concept and gave an idea of the user experience.

Outcomes

A prototype web application and mobile application were built to demonstrate the technical concept and stimulate ideas about the user experience. Several algorithms were tested objectively and part of the work was published in a paper at the 132nd Convention of the Audio Engineering Society.

How it works

Software was written to classify a sound recording of a child making an animal noise. A library of example animal sounds made by children was created and labelled with the correct animal. This was then analysed using VAMP audio analysis plug-ins to extract a set of features describing the range of sounds to expect for each animal. Each recording was represented by a Gaussian model of the distribution of its Mel-frequency cepstral coefficients (MFCCs). Using this data, a support vector machine (SVM) was trained to predict which animal is being impersonated. During the project we compared training the model on data from a single user with using a generic model trained on data from many users. We also considered a variety of audio signal features and classification algorithms, as well as the effect of choosing different sets of animals on the performance of the software. A minimal sketch of this pipeline is given below.
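The sketch below illustrates the pipeline in broad strokes only: the project used VAMP audio analysis plug-ins, whereas here librosa stands in for MFCC extraction and scikit-learn for the SVM, and the folder layout sounds/<animal>/<clip>.wav is a hypothetical assumption. Each clip is summarised by a Gaussian model of its MFCC distribution (per-coefficient mean and variance), and an SVM is trained on those vectors.

# Minimal sketch of the classification pipeline described above.
# librosa and scikit-learn are stand-ins for the project's VAMP-based
# feature extraction; the file layout sounds/<animal>/<clip>.wav is assumed.
import glob
import os

import librosa
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def gaussian_mfcc_features(path, n_mfcc=13):
    """Represent one recording by a Gaussian model of its MFCC distribution
    (per-coefficient means and variances), flattened into a single vector."""
    y, sr = librosa.load(path, sr=None, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.var(axis=1)])


# Build the labelled library: one feature vector and one animal label per clip.
X, labels = [], []
for path in glob.glob("sounds/*/*.wav"):
    X.append(gaussian_mfcc_features(path))
    labels.append(os.path.basename(os.path.dirname(path)))  # folder name = animal

X_train, X_test, y_train, y_test = train_test_split(
    np.array(X), np.array(labels), test_size=0.25, random_state=0
)

# Train an SVM to predict which animal is being impersonated,
# then check accuracy on held-out recordings.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))

Swapping the feature extraction or classifier in this sketch is how one would explore the comparisons described above, for example per-user versus generic training data, or different feature sets and animal vocabularies.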

Project Team

  • Chris Lowis (PhD)

    Senior Research Engineer
  • Yves Raimond (PhD)

    Senior R&D Engineer
  • Vicky Spengler

    Senior Creative Director
  • Chris Pike (MEng PhD)

    Lead R&D Engineer - Audio
  • Immersive and Interactive Content section

    The IIC section is a group of around 25 researchers investigating ways of capturing and creating new kinds of audio-visual content, with a particular focus on immersion and interactivity.
