Google’s DeepMind AI can lip-read TV shows better than a pro
NewScientist.com reports that artificial intelligence is getting its teeth into lip reading. A project by Google’s DeepMind and the University of Oxford applied deep learning to a huge data set of BBC programmes to create a lip-reading system that leaves professionals in the dust.
The AI system was trained using some 5,000 hours of footage from six different TV programmes, including Newsnight, BBC Breakfast and Question Time. In total, the videos contained 118,000 sentences.
First, the University of Oxford and DeepMind researchers trained the AI on shows that aired between January 2010 and December 2015. Then they tested its performance on programmes broadcast between March and September 2016, a later period that does not overlap with the training footage.
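As a rough illustration of this kind of date-based split, the sketch below sorts clips into training and test sets by air date. It is not the researchers' code: the function name, the clip identifiers and the exact day boundaries are assumptions, since the article gives only months.

```python
from datetime import date

# Assumed windows based on the months named in the article;
# the exact start and end days are hypothetical.
TRAIN_START, TRAIN_END = date(2010, 1, 1), date(2015, 12, 31)
TEST_START, TEST_END = date(2016, 3, 1), date(2016, 9, 30)


def split_by_air_date(clips):
    """Partition (air_date, clip_id) pairs into train and test lists."""
    train, test = [], []
    for air_date, clip_id in clips:
        if TRAIN_START <= air_date <= TRAIN_END:
            train.append(clip_id)
        elif TEST_START <= air_date <= TEST_END:
            test.append(clip_id)
        # Clips outside both windows (e.g. January 2016) are left out.
    return train, test


# Made-up example clips, purely for illustration.
clips = [
    (date(2012, 5, 14), "clip-a"),
    (date(2016, 1, 20), "clip-b"),
    (date(2016, 4, 2), "clip-c"),
]
train, test = split_by_air_date(clips)
print(train, test)
```

Testing on a later, separate period means the system is scored on broadcasts it could not have seen while learning.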
By looking only at each speaker’s lips, the system accurately deciphered entire phrases, including “We know there will be hundreds of journalists here as well” and “According to the latest figures from the Office of National Statistics”.