NAB 2023: DTV Audio Group Meeting Tackles Loudness in Streaming, Versioning, Automation

The DTV Audio Group’s annual NAB Show meeting on April 17 dealt with a variety of issues facing broadcast audio, particularly in broadcast sports.

The gathering started with an off-the-record discussion focused on streaming-audio regulation or, more precisely, the current lack of it. The CALM Act, which putatively regulates relative loudness of commercials inserted into broadcast television, took the better part of a decade to formulate and enact (although enforcement has been another matter). On the other hand, television programming, including sports, has moved rapidly into a streaming-media environment, where technology practices are largely unmediated. Addressing the loudness issue for OTT streamcasting may be more complex than it was for OTA broadcasting, given the sheer numbers and types of distributors. However, solutions can be achieved, according to a consensus at the meeting, with input from the broadcast-audio community in partnership with federal agencies.

Accessibility, focused on captioning, subtitles, and alternative languages as part of the ATSC 3.0 initiative, was another area of discussion. A 10-minute sports clip was reportedly used (though not shown) as the basis for continued development of solutions.

Reduced Versioning Time

A presentation explored Skywalker Sound’s CODA workflow, which has been shown to reduce versioning time for feature productions from a number of weeks to a matter of hours. The need for multiple iterations of content productions is driven by massive growth in distribution outlets and language-version requirements. The legendary Bay Area audio-post facility’s new software platform — aka the Automated Media Ecosystem — is cloud-native and automates creation of soundtrack versions, cutting the time and cost of the deliverable process. The system has already been used on premium Disney+ releases The Mandalorian and Moon Knight.

Automating international and multiformat soundtrack processing from the highest original-source mix format (often the Dolby Atmos mix) can create versions for international markets with the same fidelity and detail as the original language. Because the process is automated, the need to human-QC each pass is reduced, saving time and costs for content owners. Future iterations will address automatic generation of M&E tracks and other functions. However, in the Marvel-ous universe of very loud and explosive SFX for film and television, dialog intelligibility remains a challenge.

AI for Automating Sports Audio

Salsa Sound, one of a handful of innovators that created “augmented” crowd sound for broadcasts during COVID sports-venue shutdowns, delivered a presentation on how artificial intelligence can be harnessed for mix automation for sports audio. Applied during the Qatar World Cup last year, the company’s MIXaIR system connected with Lawo’s Kick video system, which automatically tracks the ball on the screen, to do the same for the related audio. In some cases, when the ball was too deep into the pitch for the kick sounds to be captured by open microphones, MIXaIR had carefully curated samples ready to sync in, selected and triggered (also automatically) by the ball’s rotation and spin velocity.

Besides being fascinating, the technology has significant practical advantages, including providing the basis for broader use of predictive mixing. The system can algorithmically predict the action on the field seconds ahead and have the faders respond, in part by analyzing crowd sentiment based on its noise. That approach could replace the current reactive mode of mixing, which follows the action from behind. That, in turn, could lead to cost reductions, thanks to fewer deployed microphones and correspondingly lower CPU expenditure.

In fact, automation of generally manual tasks was an underlying theme of the meeting. It extended to a Sennheiser presentation on its AMBEO platform, which can automatically straddle the threshold between stereo and immersive sound. AMBEO’s two-channel spatial-audio algorithm translates an original immersive or surround mix into two channels of audio to deliver a spatial experience in a stereo-monitoring environment.

Like many things in a post-pandemic landscape, this year’s DTV-AG meeting was briefer and more compact than usual (the 8 a.m. start and its timing in the middle of the NAB Show schedule may have contributed to that).
