
IRFS Weeknotes #213

Published: 3 February 2016
  • Tristan Ferne

    Lead Producer

This sprint in R&D's IRFS team we worked on analysing Casualty, sentiment analysis, connecting the web and TV, immersive video and atomising stories.

Most popular 6-word phrases

  • 174  can you tell me your name
  • 160  i'll see what i can do
  • 142  give you something for the pain
  • 136  what do you want me to

Most popular 5-word phrases

  • 648  going to be all right
  • 623  do you want me to
  • 514  what are you doing here
  • 465  can i have a word
  • 367  i'm going to have to

Most popular 4-word phrases

  • 1872  what are you doing
  • 1809  i don't want to
  • 1298  do you want to
  • 1218  can you hear me
  • 1216  are you all right
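For the curious, here is a minimal sketch (not the team's actual code) of how counts like those above could be produced from a subtitle corpus: tokenise the text into words and tally every 4-, 5- and 6-word sequence. The corpus file name is hypothetical.

```python
import re
from collections import Counter

def top_phrases(text, n, k=5):
    """Return the k most common n-word phrases in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    ngrams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return Counter(ngrams).most_common(k)

subtitles = open("casualty_subtitles.txt").read()  # hypothetical corpus file
for n in (6, 5, 4):
    print("Most popular %d-word phrases:" % n)
    for phrase, count in top_phrases(subtitles, n):
        print("  %5d  %s" % (count, phrase))
```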

They reminded us of and .

Update: Andrew has extrapolated these. Using a simple n-gram ‘letter’ model (i.e. just looking at the probability of the next letter given the n letters before it) he can generate random Casualty-ness, like this...

Happy birthday!  Oh, Charlie!Who is it?  It's Duffy!

CAN we win the Night Shift.  Let's shock him.  As the most senior person on duty, Gareth Davies will be supervising all the doctors in the department?  That depends on us.  Is he in his office?  I don't know, Kelly. I've given up asking.

Hi, I'm Helen.  I want your complexion.  It's nothing wrong. I think I would have been going on?  Put her over here, quick! Hurry up! ..What are we goin'? We'll see what they're doing him a favour. OK.  Sats 98%, Pulse 100, BP 140/85, resps 25, sats 96, resps are 30. Pulse irregular. Like my barber.

Ad infinitum.
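As a rough illustration of the technique, here is a minimal character n-gram sketch in Python: record which letter tends to follow each n-letter context in the subtitles, then sample from those counts to generate new text. The order (n=6) and the corpus file name are assumptions, not Andrew's actual settings.

```python
import random
from collections import Counter, defaultdict

def train_char_ngram(text, n=6):
    """Count which character follows each n-character context."""
    model = defaultdict(Counter)
    for i in range(len(text) - n):
        context, next_char = text[i:i + n], text[i + n]
        model[context][next_char] += 1
    return model

def generate(model, n=6, length=300, seed=None):
    """Sample characters one at a time, conditioned on the last n characters."""
    context = seed or random.choice(list(model.keys()))
    output = context
    for _ in range(length):
        counts = model.get(output[-n:])
        if not counts:
            break
        chars, weights = zip(*counts.items())
        output += random.choices(chars, weights=weights)[0]
    return output

subtitles = open("casualty_subtitles.txt").read()  # hypothetical corpus file
model = train_char_ngram(subtitles)
print(generate(model))
```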

 

Analysing the web

This sprint the Discovery team has been looking at ways to analyse the sentiment of relatively long articles, as basic methods better suited to short sentences or social media posts tend to yield useless sentiment values. Our initial experiments with splitting articles into sentences and assessing the distribution of non-neutral sentences are proving promising, and we are reviewing similar approaches published in the past, such as (PDF). We also updated our seriousness analyser, sorted out some infrastructural stuff and sketched some new tools.
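As a rough sketch of the sentence-level idea (assuming an off-the-shelf scorer such as NLTK's VADER, rather than whatever the Discovery team actually used): split the article into sentences, score each one, and look at the share of clearly non-neutral sentences instead of a single score for the whole article.

```python
import nltk
from nltk.tokenize import sent_tokenize
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("punkt", quiet=True)
nltk.download("vader_lexicon", quiet=True)

def sentence_sentiment_profile(article_text, threshold=0.3):
    """Return the share of positive/negative/neutral sentences in an article.

    The 0.3 compound-score threshold is an illustrative choice."""
    analyser = SentimentIntensityAnalyzer()
    sentences = sent_tokenize(article_text)
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for sentence in sentences:
        compound = analyser.polarity_scores(sentence)["compound"]
        if compound >= threshold:
            counts["positive"] += 1
        elif compound <= -threshold:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    total = max(len(sentences), 1)
    return {label: count / total for label, count in counts.items()}

profile = sentence_sentiment_profile(
    "The rescue was a triumph. But the report was harshly critical of the "
    "slow response. Officials declined to comment."
)
print(profile)  # fractions of positive, negative and neutral sentences
```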

We have been working on ways to extract text from web articles using the DOM rendered by a headless browser (the open-source PhantomJS) rather than the raw HTML source, with some success - plus the added benefit of being able to generate screenshots at various resolutions. We still, however, face issues with questionable JavaScript-based redirects. Then Tim pointed us to a possible solution from .
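For illustration, here is one way to drive PhantomJS from Python via Selenium (we scripted phantom.js itself; this is just an equivalent sketch): render the DOM so JavaScript-built content is present, pull out the visible text, and save screenshots at a couple of resolutions.

```python
from selenium import webdriver

def extract_rendered_text(url, screenshot_sizes=((1280, 800), (375, 667))):
    # Requires the phantomjs binary on PATH and an older Selenium release
    # that still ships the PhantomJS driver.
    driver = webdriver.PhantomJS()
    try:
        driver.get(url)
        # Text as rendered in the DOM, not the raw HTML source.
        text = driver.find_element_by_tag_name("body").text
        for width, height in screenshot_sizes:
            driver.set_window_size(width, height)
            driver.save_screenshot("page_%dx%d.png" % (width, height))
        return text
    finally:
        driver.quit()

print(extract_rendered_text("https://www.bbc.co.uk/rd")[:500])
```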

"All curation grows until it requires search. All search grows until it requires curation." ()

Analysing media

Jana's speaker identification work has given some very good results, with significant improvements over the previous attempts. And Matt has been debugging an error we found in our Kaldi training on the new GPU machine. We've now managed to complete training of a new model with 3 times as much training data and it yields a measurable improvement in the system's performance.

Connecting TVs and radios

Chris has been documenting the MediaScape project as it wraps up - all the work done on device discovery, pairing and authentication, as well as the overall architecture. He also joined a W3C Web and TV Interest Group conference call to kick off the “Cloud Browser” Task Force: “The Cloud Browser Task Force is a subset of the Web and TV Interest Group, whose goal is to discuss support for web browser technology within devices such as HDMI dongles and lightweight STBs (set-top boxes).” Libby is currently going around boring everyone she knows with set-top box FACTS.

VR and 360 video

We've been improving the HTML5/VR music visualiser - working on procedural terrain generation, investigating new types of visualisation and adding a new scene for performance comparison. Andrew helped facilitate some user testing in the North Lab with Middlesex University for a study looking at the experience of viewing 360 films on three different devices: laptop, phone and headset. And Zillah has been very busy running several VR pilots.

Atomising stories

Chrissy has been setting things up for the atomised news trial and working with Lei of BBC News Labs to investigate what data we can get out of BBC systems, while Lara has been tweaking the front-end of the prototype. Thomas joined the UX team and has been refreshing the design.

Andrew has been refining the design templates for our TV Story Explorer prototype and getting assets for other dramas. He has also been thinking about an -type service that could incorporate storylines and key moments. Alan also joined the team and is starting to think about how to parse scripts to extract story data.
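Purely as a sketch of where script parsing might start (assuming screenplay-style conventions such as scene headings and upper-case character cues, which real drama scripts won't always follow):

```python
import re

SCENE_RE = re.compile(r"^(INT\.|EXT\.)\s+(.+)$")   # e.g. "INT. RESUS - DAY"
CUE_RE = re.compile(r"^([A-Z][A-Z ']+)$")          # e.g. "CHARLIE"

def parse_script(lines):
    """Group dialogue lines under their scene heading and speaker."""
    scenes, current, speaker = [], None, None
    for raw in lines:
        line = raw.strip()
        if not line:
            speaker = None
            continue
        if SCENE_RE.match(line):
            current = {"heading": line, "dialogue": []}
            scenes.append(current)
        elif CUE_RE.match(line) and current is not None:
            speaker = line.title()
        elif speaker and current is not None:
            current["dialogue"].append((speaker, line))
    return scenes

with open("episode_script.txt") as f:  # hypothetical script file
    for scene in parse_script(f):
        print(scene["heading"], "-", len(scene["dialogue"]), "lines of dialogue")
```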

Also

Tristan and Libby presented at the BBC Data Day. The BBC College of Journalism .

We're hiring someone to run our software engineering team (1-year contract, based in central London).

Links

We’ve been discussing a few nice local (to us) exhibitions about data and the interwebs, such as (with  in particular), and .

Building an automated “sarcasm detector” remains one of our somewhat-jokey goals, and it looks as though

A for Python and R.

Two nice posts on  and  application design (via Ian Forrester)

 
