
Multimedia Classification


Sam Davies | 07:45 UK time, Tuesday, 11 October 2011

As John Zubrzycki mentioned yesterday, this project, which is running as part of BBC R&D's Archive Research Section, is developing new ways to open up the BBC's archive to the public. The aim of the project is to allow people to search and browse the archive to find content that they want to watch or listen to but didn't know existed.

Currently, the majority of people search for programmes on BBC iPlayer by programme title – they know the name of the show they've missed and stand a reasonably good chance of finding it. If they don't have a specific programme in mind, they can browse by channel or genre and pick something from there. However, once the archives are digitised, browsing in this manner would mean wading through thousands of programmes: there would be so many sitcoms or documentaries that finding a programme of interest would be a real challenge.

To allow people to search for and find programmes more effectively, we need what is known as metadata about a programme – information that can tell you something about it, such as who is in it, what it's about, and where the different scenes or sections are.

Currently we have metadata that would allow you to find a programme if you knew the title, when it was first broadcast or, in some instances, the main presenters or actors. But what if you didn't want to search by these terms? What if you wanted an archived programme that was an exciting spy drama similar to Spooks, or a satirical comedy show about politics? To answer these types of questions we need new metadata.

In R&D, we are currently developing systems that can watch and listen to a programme in a similar way to people. We are developing systems that can recognise, and more importantly understand, what is in a programme (e.g. people, places, and objects such as cars or Daleks), what these are doing (e.g. are characters talking or shouting at each other? Is someone running? What are the characters saying to each other?) and what the mood, or feeling, of the programme is. The mood element helps people find the programmes they want in order to be entertained – matching the mood of the programme to their current or desired mood.

To do this we are focussing on three main areas. The first is what we've termed characteristics extraction. This is where we analyse the audio and video signals and try to identify their key properties – such as cuts, motion, luminance, faces, key audio frequencies or frequency combinations, and especially strong or weak parts of the signal – using signal processing techniques such as computing the power spectral density or taking a Fast Fourier Transform of a section of the signal. This gives us a set of numbers, or vectors, which represent the audio and video signals based on their key properties.
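To make this concrete, here is a minimal sketch of the kind of audio analysis involved, using the power spectral density to summarise one window of audio as a small characteristic vector. The windowing scheme and the particular properties chosen are illustrative assumptions, not our actual pipeline:

```python
# Illustrative characteristics extraction for one window of audio:
# compute a power spectral density and summarise it as a small vector.
import numpy as np
from scipy import signal

def extract_characteristics(samples, sample_rate=44100):
    """Return a vector of spectral properties for one audio window."""
    # Power spectral density via Welch's method
    freqs, psd = signal.welch(samples, fs=sample_rate, nperseg=1024)
    # Frequency carrying the most energy
    dominant_freq = freqs[np.argmax(psd)]
    # Spectral centroid: the "centre of mass" of the spectrum
    centroid = np.sum(freqs * psd) / np.sum(psd)
    # Overall signal strength
    rms = np.sqrt(np.mean(samples ** 2))
    return np.array([dominant_freq, centroid, rms])

# One second of a synthetic 440 Hz tone as a stand-in for real programme audio
t = np.linspace(0, 1, 44100, endpoint=False)
window = 0.5 * np.sin(2 * np.pi * 440 * t)
print(extract_characteristics(window))
```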

The next stage is what we've termed feature extraction. Using the extracted characteristics, we aim to identify key features, or objects, in the programme. We do this using machine intelligence techniques which map the extracted characteristics to a library of characteristics taken from known features. These systems then make a decision as to which of the known features the extracted characteristics most closely match. For example, one of the initial systems we developed aimed to find studio laughter in a programme, which would initially help us identify which programmes were comedies and where the jokes were. To do this, we extracted the characteristics of hundreds of clips of studio laughter from different BBC comedies, creating what is known as a ground truth data set. We then extracted the characteristics from other programmes and matched them to this ground truth data set (using a technique called Support Vector Machines), and were able to identify how much laughter a programme had and where it was. We can use similar techniques to identify other audio features, such as explosions, shouting and cheering, and broadly similar techniques to identify objects in the video, such as people, places and Daleks – by showing the systems we've developed hundreds or thousands of examples of what we are looking for, they are then able to recognise those examples in any other programme we show them.
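A hedged sketch of the laughter-detection idea follows: train a Support Vector Machine on characteristic vectors from labelled clips (the ground truth), then classify windows of an unseen programme. The feature values are invented for illustration; a real system would use vectors like the ones extracted above:

```python
# Sketch: Support Vector Machine trained on labelled characteristic vectors,
# then used to flag laughter in windows of a new programme.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Pretend ground truth: characteristic vectors for laughter / non-laughter clips
laughter = rng.normal(loc=[300.0, 1200.0, 0.40], scale=[50, 150, 0.05], size=(200, 3))
other = rng.normal(loc=[150.0, 600.0, 0.15], scale=[50, 150, 0.05], size=(200, 3))
X = np.vstack([laughter, other])
y = np.array([1] * 200 + [0] * 200)   # 1 = laughter, 0 = anything else

clf = SVC(kernel="rbf").fit(X, y)

# Classify each analysis window of a new programme; the windows flagged
# as laughter tell us roughly where the jokes are.
new_windows = rng.normal(loc=[280.0, 1150.0, 0.38], scale=[60, 160, 0.06], size=(10, 3))
print(clf.predict(new_windows))
```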

This helps us identify what is in a programme. The next stage of our research is to identify what types of mood and emotion a programme contains, which helps us classify programmes as, say, exciting, tearful or happy. To do this we take a similar approach. We collected our ground truth by getting hundreds of people to watch hundreds of clips from the archive and tell us what the mood was at different points in the programme. We can then analyse the sections of programme which carry powerful emotions and identify the key characteristics, which we use to train our systems. We can also use the extracted features to help identify a mood: in the example above, where we found lots of laughter we could infer the programme was happy and light-hearted; if instead there was lots of screaming and shouting, we could infer the opposite.
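As a toy illustration of that last inference step (not our trained mood classifier, which learns from the crowd-sourced ground truth), counts of detected audio features can already suggest a coarse mood:

```python
# Illustrative rule mapping counts of detected audio events to a rough mood.
from collections import Counter

def infer_mood(detected_features):
    """Map counts of detected audio events to a coarse mood label."""
    counts = Counter(detected_features)
    if counts["laughter"] > counts["screaming"] + counts["shouting"]:
        return "light-hearted"
    if counts["explosion"] + counts["shouting"] > counts["laughter"]:
        return "tense"
    return "neutral"

print(infer_mood(["laughter", "laughter", "cheering"]))    # light-hearted
print(infer_mood(["screaming", "shouting", "explosion"]))  # tense
```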

We also look at other aspects of the programme. Music is an inherently important part of productions, helping set the scene and reinforce its emotion. Working with the University of Salford and the British Science Association, we ran an experiment called MusicalMoods, which asked people to listen to TV theme tunes and rate their mood and emotion. We then use this data to identify the key elements of music which reflect those emotions (e.g. the key or tempo) so we can analyse other music. You can see a video we made about the experiment on YouTube.
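One such musical element, tempo, can be estimated automatically. Here is a sketch using the librosa library; the file name is hypothetical and the tempo-to-mood thresholds are illustrative guesses, not values derived from the MusicalMoods data:

```python
# Sketch: estimate the tempo of a theme tune and map it to a rough mood label.
import numpy as np
import librosa

y, sr = librosa.load("theme_tune.wav")   # hypothetical audio file
tempo, _beats = librosa.beat.beat_track(y=y, sr=sr)
bpm = float(np.atleast_1d(tempo)[0])     # beat_track may return a scalar or array

if bpm > 120:
    mood = "energetic"
elif bpm > 90:
    mood = "upbeat"
else:
    mood = "calm"
print(f"{bpm:.0f} BPM -> {mood}")
```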

In addition we are very interested in what is being spoken in the programme. In some instances we can use the subtitles or any available scripts. In conjunction with Dr. Andrew MacFarlane at City University London we are developing systems that can analyse these, identifying what people are talking about from the actual words said and also their emotional content. We are also interested in how people are speaking – for example, are they shouting, whispering or laughing as they speak? In many cases we have neither the subtitles nor the scripts available; in these instances we are part of a research project run by the Universities of Cambridge, Sheffield and Edinburgh which aims to create new methods for automatically transcribing the speech of a programme into text.
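To give a flavour of analysing subtitle text for emotional content, here is a deliberately simple lexicon-based sketch. The actual systems use far richer models; the word lists and scoring here are placeholders:

```python
# Toy sentiment scoring over subtitle lines using tiny placeholder lexicons.
POSITIVE = {"wonderful", "love", "brilliant", "happy", "laugh"}
NEGATIVE = {"terrible", "hate", "afraid", "angry", "cry"}

def subtitle_sentiment(line):
    """Score one subtitle line as positive, negative or neutral."""
    words = {w.strip(".,!?").lower() for w in line.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(subtitle_sentiment("What a wonderful, brilliant result!"))      # positive
print(subtitle_sentiment("I'm afraid something terrible happened."))  # negative
```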

Once we have collected all of this metadata about a programme – what's going on in it and its mood – the next area of our research is to develop systems that store and index it in a useful way, and ways to allow people to search for what they want most effectively. Continuing our research with City University London, we are looking to develop a new type of information retrieval system. Traditional information retrieval systems aim to match a user's query with any documents the system knows about. These tend to focus on keyword matching – matching words or synonyms in a user's query with those in the indexed documents. We are developing systems that not only perform this function with BBC programmes, but take the mood of the programme into account as well.
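A minimal sketch of the idea: standard TF-IDF keyword matching over programme descriptions, then filtering by the stored mood metadata. The programme data, moods and ranking scheme are invented for illustration, not our actual retrieval system:

```python
# Sketch: keyword retrieval (TF-IDF + cosine similarity) combined with a
# mood filter over stored programme metadata.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

programmes = [
    {"title": "Spy Thriller", "text": "agents espionage chase london", "mood": "exciting"},
    {"title": "Panel Laughs", "text": "satire politics jokes panel", "mood": "light-hearted"},
    {"title": "Quiet Drama", "text": "family loss memory seaside", "mood": "sad"},
]

vectoriser = TfidfVectorizer()
doc_matrix = vectoriser.fit_transform(p["text"] for p in programmes)

def search(query, desired_mood):
    """Rank programmes by keyword similarity, keeping only the desired mood."""
    scores = cosine_similarity(vectoriser.transform([query]), doc_matrix)[0]
    ranked = sorted(zip(scores, programmes), key=lambda pair: -pair[0])
    return [p["title"] for score, p in ranked
            if score > 0 and p["mood"] == desired_mood]

print(search("politics jokes", "light-hearted"))  # ['Panel Laughs']
```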

We hope that by creating these systems, people will find content in the archive that they didn't know was there and didn't know they wanted. This will really help open up the archives and allow people to explore the programmes within them.
