Case Study: IBM’s Smarter AI Assists Anchorage’s CBS Affiliate, KTVA, With Real-Time Captioning
Denali Media was formed in 2012 by General Communication Inc. (GCI), a telecommunications corporation in Alaska. Denali Media produces full HD content and delivers it over cable, broadcast, web and mobile platforms. Denali Media acquired its flagship station, KTVA in Anchorage, rebuilding its entire operation into one of the country’s premier broadcast networks. The company has 100 employees and currently operates six stations and more than 50 ad-insertable cable networks throughout Alaska.
A high bar for speech recognition accuracy
The world of closed captioning is changing. Required by law for broadcasters to serve the needs of the deaf community, captions have found audiences well beyond that purpose. With the rise of digital content, often viewed on mobile devices in public environments, captioned information aids focus and comprehension. From restaurants to airplanes to living rooms, viewers around the world rely on captions to better understand everything from quick-paced dialogue to critical weather alerts that keep them out of harm’s way.
Known as “The Voice of Alaska,” KTVA is a CBS affiliate in Anchorage that produces live daily news and weather broadcasts as well as specialized programming that highlights the state’s people, culture and history. For several years, the station had been searching — without success — for an automated speech recognition solution that could reduce the cost and complexity of captioning for its programs.
“We were looking for an accuracy rate of 90% or higher,” says Erik Kuhlmann, Director of Engineering and Operations at Denali Media Holdings, KTVA’s parent company. “And that’s a fairly high bar for automated speech recognition technology.”
The first system the network explored reached an accuracy rate only in the mid-60% range. It also produced no punctuation or capitalization at all. Because U.S. Federal Communications Commission (FCC) regulations require accuracy in closed captioning, and because viewers depend on it, KTVA knew it had to do better or face potential fines and lawsuits.
KTVA waited a few years and then tested a second solution, hoping to take advantage of advancements in automated speech recognition technology. With the second system, the network saw an accuracy rate of about 80%, still well below its goal. To give the system the best possible chance of transcribing speech correctly, KTVA fed it clean audio with no background noise or music, but saw little improvement.
Kuhlmann says getting place names correct in a state like Alaska can be very challenging for captioning technology. The state’s diverse landscape has many indigenous villages and communities, such as Talkeetna, Wasilla and Kenai. KTVA therefore needed a solution that could be trained with a glossary of specialized, market-specific terminology and that used machine-learning technology to improve continuously over time, while also meeting KTVA’s standard of 90% accuracy or higher.
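To make the idea of a market-specific glossary concrete, here is a minimal Python sketch that snaps near-miss caption words to known place names. It is an illustration only and is not how Watson Captioning Live is trained: the `GLOSSARY` list, the `correct_terms` function and the `cutoff` value are hypothetical, and the technique shown is simple fuzzy string matching from Python's standard library, applied after transcription.

```python
import difflib

# Hypothetical glossary of place names taken from the article's examples.
GLOSSARY = ["Talkeetna", "Wasilla", "Kenai"]


def correct_terms(text, glossary=GLOSSARY, cutoff=0.8):
    """Replace words that closely resemble a glossary term with that term.

    Assumption: words are separated by whitespace and carry no attached
    punctuation; a real captioning pipeline would need proper tokenization.
    """
    # Map lowercase forms back to the canonical spelling.
    lookup = {term.lower(): term for term in glossary}
    corrected = []
    for word in text.split():
        # Return at most one candidate whose similarity ratio is >= cutoff.
        match = difflib.get_close_matches(
            word.lower(), list(lookup), n=1, cutoff=cutoff
        )
        corrected.append(lookup[match[0]] if match else word)
    return " ".join(corrected)


if __name__ == "__main__":
    print(correct_terms("Snow near Talkeatna and Wasila"))
```

A high cutoff keeps ordinary words from being pulled toward glossary entries, at the cost of missing badly garbled names. That trade-off is one reason a trainable recognizer is preferable to after-the-fact correction.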
Watson changes everything from week one
Kuhlmann heard about the Watson Captioning Live solution, which uses AI to deliver accurate and automated live closed captions. He requested a demo, and KTVA quickly discovered it had found the right solution.
The company spent a few weeks training the Watson solution to identify and learn the region’s idiosyncratic place names, many of which had to be taught phonetically. Watson started out at a kindergarten learning level, explains Kuhlmann, and within weeks had advanced to a 12th-grade or higher learning level.
“Within the first week and a half, we exceeded the accuracy of the other systems that we had tested,” says Kuhlmann. “And it continues to advance, every day — which is a huge benefit in having AI behind the captions that we’re receiving from Watson.”
The ability to train the Watson Captioning Live technology ahead of deployment is a key differentiator, as it allows the system to use context to determine content. During its initial training sessions, KTVA compared the scripts of live broadcasts against the Watson output and, after each broadcast, provided corrections and pronunciations for the words Watson had struggled with. Within a few hours the errors were corrected, Kuhlmann reports. “We saw a huge improvement after every newscast,” he says. “After the training, Watson was hitting the words every time.”
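The article does not say how KTVA calculated its accuracy figures. One common way to score captions against a reference script is word accuracy, defined as one minus the word error rate (WER). The sketch below assumes that definition. The function name `word_accuracy` and the sample sentences are hypothetical, and the comparison is case-insensitive with no punctuation handling.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - WER, where WER is the word-level edit distance
    (substitutions + insertions + deletions) divided by the number
    of words in the reference. The result is clamped at 0.0."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra caption word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr

    wer = prev[-1] / len(ref)
    return max(0.0, 1.0 - wer)


if __name__ == "__main__":
    script = "snow is expected in talkeetna and wasilla tonight"
    captions = "snow is expected in tall keetna and wasilla tonight"
    # One substitution plus one insertion against 8 reference words.
    print(f"{word_accuracy(script, captions):.2f}")
```

In this example a single misheard place name costs two errors out of eight reference words. This shows how unfamiliar vocabulary can pull a system well below a 90% target.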
One important difference between the Watson Captioning Live solution and the other systems KTVA had tested is that the previous systems relied heavily on script data—reading the broadcast scripts and inferring words at the appropriate times to create the captions. Kuhlmann says the station tried that approach and found the results were actually less accurate because of misspellings and mistakes in the scripts. “It actually hurt us more than helped us,” he says. Watson Captioning Live technology doesn’t use script data, and as Kuhlmann attests, “We rely solely on Watson’s AI abilities to contextually determine what the captions look like.”
KTVA is required by law to caption news and weather reports. The network had been using Electronic Newsroom Technique (ENT), a captioning method permitted under FCC rules that converts dialogue included on a teleprompter script into captions. Weather broadcasts pose a challenge because much of the information is ad-libbed and not included in a script, resulting in a poor closed captioning experience. For special programs without scripts, KTVA used a manual captioning service. But with that service, the network had to schedule captioning for different show segments each day. “It was cost-prohibitive for us to manually caption every weather segment in every show,” says Kuhlmann, “but we weren’t really providing a good service for our viewers with ENT.”
The Watson Captioning Live solution has completely solved this challenge as it not only provides cost efficiency but also streamlines the network’s workflow. “We no longer have to schedule manual captioning, and our weather folks don’t have to worry about creating a script for every one of their weathercasts,” says Kuhlmann. “The solution handles that perfectly every single time. Watson Captioning Live removes the onus of having to think about closed captioning altogether for the staff.”
New capabilities bring new opportunities
The Watson Captioning Live solution has provided KTVA with an automated and highly accurate captioning solution that continues to learn and improve. The IBM Cloud solution provided a seamless implementation, and within weeks, it had surpassed the accuracy levels of the other systems the network had explored. The solution has streamlined workflows for producers and staff as it alleviates the need to schedule human captioners and create scripts for weathercasts and special programming.
Importantly, training Watson was not only straightforward but also fun, says Kuhlmann. Seeing the capabilities the system provides, and its potential, has also opened new possibilities. Alaska is known for unique sporting events like the Iditarod and the signature Mount Marathon race in Seward, which attracts racers from around the world. “We provide very high-level coverage for Mount Marathon,” says Kuhlmann. “It draws a lot of people in, a lot of viewers from remote locations via our website or live stream.” In the past, KTVA hasn’t been able to provide captioning for its Mount Marathon coverage. Because the show runs several hours, Kuhlmann says captioning it didn’t make financial sense without an automated solution. But this year, with Watson Captioning Live technology, KTVA will be able to seamlessly provide closed captions for every moment, accurately and in real time.