Dialog Intelligibility Could Be the Buzzword for 2024

SVG Summit speaker highlights successful demos of the relevant technology

A presentation for the SVG Audio Production & Distribution slate of panels at last week’s annual SVG Summit suggests that dialog intelligibility and audio description concerns will be at the forefront of technology advances in the coming year. According to a representative of one of the major networks, new features to be included with NextGen TV will reflect successful field demonstrations of ways to increase dialog intelligibility while giving consumers more control over that capability.

“I think that there’s a lot of awareness around the growing use of closed captioning due to poor dialog intelligibility — difficulty understanding dialog on television, whether it’s sports or other programming — for a varied number of reasons,” the speaker explained. “Some actions are going to be taken based on new features available in next-generation systems that have been created, the tools can be there, and it’s just a matter of putting the time in to improve the situation, whether it’s on the production side or whether the adjustment control ends up on the [consumer] side. It’s all coming together like this now because we’ve realized that television audio has changed: soundtracks especially have become much more cinematic. Content creators are chasing after the cinema experience, the big-cinema experience.”

The intelligibility meter in Steinberg’s Nuendo 11, deploying algorithms developed by Fraunhofer IDMT, provides an instrumental assessment of speech intelligibility.

Dialog intelligibility is a multifaceted problem, ranging from the way contemporary consumer TV sets are designed, usually with speakers on the bottom or the rear, to how program audio is mixed, live and in postproduction. One finding cited indicates that manufacturers spend more money on the shipping cartons for flat-panel television sets than they spend on the internal audio components.

Soundbars have helped mitigate the problem, mainly by positioning speakers so that they face listeners in the home. But a better solution is one that lets consumers alter the balance of sonic elements in a mix at their end, reducing background sounds and raising dialog over music and effects so that the dialog is easier to understand.
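To make the idea of consumer-side control concrete, here is a minimal sketch of how a receiver might rebalance a mix. It assumes the dialog and the background (music and effects) arrive as separate streams of samples in the range -1.0 to 1.0. The function names and parameters are hypothetical, chosen for illustration, and are not taken from any NextGen TV specification or product.

```python
def db_to_linear(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def remix(dialog, background, dialog_boost_db=0.0, background_cut_db=0.0):
    """Rebalance dialog against background (hypothetical helper).

    dialog, background: equal-length sequences of float samples.
    dialog_boost_db:    viewer-selected boost applied to the dialog.
    background_cut_db:  viewer-selected attenuation of music and effects.
    """
    if len(dialog) != len(background):
        raise ValueError("dialog and background must have the same length")

    dialog_gain = db_to_linear(dialog_boost_db)
    # Treat the cut as an attenuation regardless of the sign passed in.
    background_gain = db_to_linear(-abs(background_cut_db))

    mixed = [d * dialog_gain + b * background_gain
             for d, b in zip(dialog, background)]

    # Scale the result down if the boost pushed any sample past full scale.
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > 1.0:
        mixed = [s / peak for s in mixed]
    return mixed
```

A viewer who finds the effects overwhelming might, for example, call `remix(dialog, background, dialog_boost_db=3.0, background_cut_db=6.0)`. A real system would apply such gains inside the decoder and would likely use a proper limiter rather than simple peak scaling.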

Solutions will include a technical document, developed with input from the AES and other organizations, that will provide guidance for audio mix engineers. Another helpful element is the ongoing development, by Fraunhofer IDMT, of a protocol and a product to objectively measure dialog intelligibility. Fraunhofer’s method is based on neural networks and enables automatic, target-group–specific measurement of speech intelligibility across applications. It can be applied both in postproduction and during production.

Another component that could contribute to this effort is Dolby’s automated immersive upmixing system, which will be able to render stereo and 5.1-surround mixes into a 5.1.4 format for broadcast, with attention paid to dialog intelligibility.

Testing of various intelligibility and audio accessibility features is currently under way in the New York City area. The speaker expressed hope that the technical document could be ready by the time of NAB 2024 in April.

Ironically, according to the speaker, programs and films on television in the 1950s and ’60s had considerably better dialog intelligibility, given less emphasis on effects or soaring soundtracks.

“Big soundtracks [are] terrific,” he said, “but being able to come up with the means to make a version of that soundtrack that’s more compatible with home listening is really what the solution comes down to. This could be a big year for finally hearing what’s going on when it comes to dialog intelligibility, so to speak.”
