SVG Sit-Down: Ericsson’s Matthew Goldman on SMPTE ST 2110, the Future of IP, the HDR Format Wars

The company’s mantra at NAB 2017 was ‘Transforming Television’

Much of the talk at NAB 2017 revolved around the move to IP and HDR — and the related standards. Few people are more plugged into the standards discussion than Ericsson SVP of Technology, TV and Media, Matthew Goldman, who began his two-year term as SMPTE president on Jan. 1.

SVG sat down with Goldman during the show to get the latest updates on standards for both IP and HDR, as well as on the industry’s move toward virtualization. Among the highlights, he noted that the ST 2110 IP-transport standard is just months away from official approval, that a massive ST 2110 IP interop demo featuring 40-plus companies at NAB 2017 showed how far the IP ecosystem has come in just a few years, and that broadcasters can prepare for an IP future without sacrificing existing SDI workflows. He also discussed how the current HDR format wars may play out, where dynamic metadata factors in, and how virtualization is changing the broadcasting business model.

Matthew Goldman: “What we were trying to do is solve a broadcaster’s bandwidth problem, not an HDR-technology problem.”

Where are we in terms of a standard for live HDR production?
We’ve moved a step forward. All the baseline standards for HDR are done at this point in time. Now, as you know, there are several [formats], including several vendor-specific [formats] like Dolby Vision. Also, Samsung and Amazon recently announced what they’re calling HDR10+. These HDR mechanisms vary in how they perform tone mapping and gamut mapping and in the details of the metadata used.

However, the baseline for all these [formats] has now been codified for program exchange. The new Recommendation ITU-R BT.2100 defines two HDR transfer functions: perceptual quantization (PQ), which is SMPTE ST 2084, and hybrid log-gamma (HLG), originally developed by BBC and NHK.

All HDR systems are based on either PQ or HLG. The simplest systems combine the HDR transfer function with a wider color space for Ultra HD (wide color gamut, or WCG) and deeper sample bit depth (typically 10-bit, vs. today’s direct-to-consumer formats, which are 8-bit). The two most basic systems are known as PQ10 and HLG10, which are well-understood and deployable. HDR10, a commonly used HDR system, is PQ10 plus added static metadata, based on SMPTE ST 2086 and two other parameters defined by the Blu-ray Disc Association.
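To give a sense of what the PQ curve actually does, here is a minimal sketch of the SMPTE ST 2084 EOTF referenced in BT.2100, mapping a normalized code value to absolute luminance. The constants are the published ST 2084 values, but treat this as an illustration and verify against the standard before relying on it.

```python
# Minimal sketch of the SMPTE ST 2084 (PQ) EOTF, as referenced in ITU-R BT.2100.
# Constants are the published m1, m2, c1, c2, c3 values; verify against the
# standard before any production use.

def pq_eotf(e_prime: float) -> float:
    """Map a normalized non-linear PQ signal (0.0-1.0) to luminance in cd/m^2 (nits)."""
    m1 = 2610 / 16384          # 0.1593017578125
    m2 = 2523 / 4096 * 128     # 78.84375
    c1 = 3424 / 4096           # 0.8359375
    c2 = 2413 / 4096 * 32     # 18.8515625
    c3 = 2392 / 4096 * 32     # 18.6875

    ep_m2 = e_prime ** (1 / m2)
    num = max(ep_m2 - c1, 0.0)
    den = c2 - c3 * ep_m2
    return 10000.0 * (num / den) ** (1 / m1)

# Example: a normalized signal of ~0.508 (full-range 10-bit code of roughly 520)
# maps to about 100 nits, the reference white of traditional SDR displays.
print(round(pq_eotf(0.508), 1))
```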

What about the inclusion of dynamic metadata into these formats?
Dynamic-metadata [capabilities] are now basically done, standardized in the suite of SMPTE standards known as ST 2094-x. For example, dynamic metadata for Dolby Vision is defined in SMPTE ST 2094-10 and for Samsung’s HDR10+ in SMPTE ST 2094-40. Although dynamic metadata is used in some preproduced HDR content (such as Hollywood motion pictures), it hasn’t really been deployable by the industry in a live-production setting, such as sports.

The Ultra HD Forum Guidelines give providers guidance on early HDR deployments for live production and distribution, recommending the use of PQ10, HLG10, and HDR10. DVB and ATSC have pretty much done the same thing. The use of HDR systems with dynamic metadata is being vetted in ATSC and DVB at the moment, but the industry is a little hesitant about the added complexity, so some providers are moving forward without it. The complexity is that, unlike systems with no metadata or with static metadata, dynamic metadata may change on every frame. Dynamic-metadata HDR systems have improved over the past year, so, at some point within the next few years, many of the concerns about using an HDR system with dynamic metadata will be alleviated and well understood. Even with some of the uncertainty right now, HDR is deployable today, so, with careful planning, there is no reason a broadcaster or a content provider should be scared of HDR at all, regardless of the dynamic-metadata question.

Can a fully end-to-end live HDR playout ecosystem be delivered today?
Absolutely. For instance, at [NAB 2017], we [showed] how HDR is not a major challenge in a live workflow. We demonstrated a live end-to-end 4K HDR ecosystem in our booth, a solution that is easily deployed today. We used a playout hub in Hilversum, Netherlands, and [played] out uncompressed 2160p60 4K HDR material using PQ10. It [was] compressed to contribution-level bitrates using the latest HEVC video-compression technology in our AVP2000 contribution encoder, then sent via fiber from the Netherlands to our NAB booth. It [was] then decoded back to baseband and recompressed to direct-to-home bitrates, using our MediaFirst Encoding Live solution, then decoded and displayed on a consumer 4K HDR television.

This was the first time we had demoed a single-stream, single-slice approach to HEVC contribution-level encoding. Previously, the only option for contribution-level encoding of 4K was to separate the 4K signal into four 1080p feeds and encode it as four synchronized AVC streams; deliver the quad streams in a synchronized manner; then use four 1080p receivers with phase-aligned clocks (we call this mechanism SimulSync) to reassemble the quad streams into a single 4K signal. Now we have a single-slice, single-raster HEVC contribution-level encoder and a companion 4K professional integrated receiver-decoder.
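For readers unfamiliar with the quad approach Goldman describes, the sketch below shows the simplest version of the idea: dividing a UHD raster into four 1080p quadrants and reassembling them. It is illustrative only; the interview does not detail Ericsson’s actual SimulSync mechanism, and a real system also depends on frame-accurate, phase-aligned clocks across the four decoders.

```python
# Illustrative sketch only: splitting a 3840x2160 frame into four 1080p quadrants,
# the kind of division used when 4K is carried as four synchronized HD feeds.
# The actual SimulSync mechanism is not described in the interview.
import numpy as np

def quad_split(frame_uhd: np.ndarray) -> list[np.ndarray]:
    """Return the four 1920x1080 quadrants of a 2160x3840 frame (H x W x channels)."""
    h, w = frame_uhd.shape[:2]
    assert (h, w) == (2160, 3840), "expects a UHD raster"
    half_h, half_w = h // 2, w // 2
    return [
        frame_uhd[:half_h, :half_w],   # top-left
        frame_uhd[:half_h, half_w:],   # top-right
        frame_uhd[half_h:, :half_w],   # bottom-left
        frame_uhd[half_h:, half_w:],   # bottom-right
    ]

def quad_join(tiles: list[np.ndarray]) -> np.ndarray:
    """Reassemble the four quadrants back into a single UHD frame."""
    top = np.hstack(tiles[:2])
    bottom = np.hstack(tiles[2:])
    return np.vstack([top, bottom])
```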

The bitrates can vary, but it’s not unusual to use 40-80 Mbps for 4K HEVC contribution-level live encoding. The re-encoded direct-to-consumer bitrates are much lower, in the range of 15-25 Mbps for 4K HEVC live encoding.

There is a lot of talk about 1080p/HDR serving as a stronger near-term option than making the move to 4K for broadcasters. Would you agree with that?
For OTA broadcasters in particular, it absolutely makes sense. Ericsson actually drove this message starting three years ago, and we received some criticism from some in the industry. But what we were trying to do is solve a broadcaster’s bandwidth problem, not an HDR-technology problem. Broadcasters have a limited amount of bandwidth, particularly an over-the-air broadcaster under new spectrum-repacking schemes. Even with the best compression technology, 4K requires circa 250% more bandwidth than HD. In a compressed stream, HDR adds only 0%-20% vs. SDR. Hence, I coined the term “the best bang for the bit.” Even the average TV viewer will notice a huge image-quality improvement when comparing today’s HD (720p60 or 1080i SDR 8-bit) to a new 1080p60 HDR service.
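A rough back-of-the-envelope version of that “bang for the bit” argument, using illustrative bitrates rather than Ericsson’s own figures:

```python
# Back-of-the-envelope channel-bandwidth comparison using assumed numbers:
# an HD HEVC service at ~5 Mbps, 4K needing ~250% more bandwidth than HD,
# and HDR adding at most ~20% on top of SDR (per the interview's 0%-20% range).
hd_sdr_mbps = 5.0                      # assumed 1080p60 SDR HEVC service
uhd_sdr_mbps = hd_sdr_mbps * 3.5       # "circa 250% more bandwidth than HD"
hdr_overhead = 0.20                    # worst-case HDR overhead

hd_hdr_mbps = hd_sdr_mbps * (1 + hdr_overhead)
uhd_hdr_mbps = uhd_sdr_mbps * (1 + hdr_overhead)

print(f"1080p60 HDR: ~{hd_hdr_mbps:.1f} Mbps (vs {hd_sdr_mbps:.1f} Mbps SDR)")
print(f"2160p60 HDR: ~{uhd_hdr_mbps:.1f} Mbps (vs {uhd_sdr_mbps:.1f} Mbps SDR)")
# For a spectrum-constrained broadcaster, 1080p HDR costs roughly 1 Mbps extra,
# while the jump to 4K multiplies the channel's bandwidth several times over.
```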

And there are still facility issues with moving around uncompressed 4K signals, which require 12-Gbps interfaces and pipes vs. 1080p signals, which require only 3-Gbps interfaces and pipes.

As SMPTE president, can you provide an update on release of the final ST 2110? How do you expect this to accelerate broadcasters’ move to IP?
We are obviously seeing a lot of movement toward IT infrastructure in broadcast, and the final draft standards of SMPTE ST 2110-10/20/30 (system concerns and uncompressed video and audio streams) have already been approved. SMPTE is in the process of resolving the final comments and will then issue a final ballot to move to Standard status. This will happen over the next few months.

The basic parts of the suite of ST 2110 standards are technically stable right now. The big thing we [did] at NAB [was] showing a massive ST 2110 interop with 40+ vendors, including Ericsson. In addition, we also [demonstrated] a “mini” ST 2110 interop in our booth with Grass Valley and Tektronix.

The move to all-IP also involves another big change in the broadcast or production facility, because the facility is moving from the venerable SMPTE 12M timecode and black-burst synchronization to IEEE 1588 Precision Time Protocol (PTP) and the SMPTE ST 2059 broadcast profile for PTP. SMPTE ST 2110 audio, video, and ancillary-data streams are all synchronized using PTP timing, so they can be carried as separate streams on a standard IP network.
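The relationship Goldman describes can be sketched simply: every essence stream stamps its RTP packets against the same PTP wall clock, which is what lets a receiver realign streams that traveled independently. The clock rates below follow common ST 2110 practice (90 kHz for video, 48 kHz for audio); treat the details as an illustration, not a normative implementation.

```python
# Minimal sketch of how ST 2110 essence streams share timing: RTP timestamps for
# video and audio are both derived from the same PTP (IEEE 1588) wall clock,
# so separate IP streams can be re-aligned at the receiver.

VIDEO_CLOCK_HZ = 90_000   # typical ST 2110 video media clock
AUDIO_CLOCK_HZ = 48_000   # typical ST 2110 audio media clock

def rtp_timestamp(ptp_seconds: float, clock_hz: int) -> int:
    """RTP timestamp for a capture instant expressed as seconds since the PTP epoch."""
    return int(ptp_seconds * clock_hz) % 2**32

capture_time = 1_700_000_000.0205      # same PTP instant for both essences (example value)
print(rtp_timestamp(capture_time, VIDEO_CLOCK_HZ))  # video stream timestamp
print(rtp_timestamp(capture_time, AUDIO_CLOCK_HZ))  # audio stream timestamp
# Because both timestamps map back to the same PTP time, a receiver can lip-sync
# audio and video that arrived as independent IP streams.
```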

That opens up a lot of new possibilities. The intrafacility infrastructure can now be all-IP, which basically means that, instead of having two separate sets of switches (an SDI switch for video and Ethernet IP switches for everything else), one can have a single common data-center infrastructure. Clearly, smart [operators] will still separate traffic by priority, but the newer switches are very smart that way and will prioritize real-time media streams anyway. The capability is all there. We’re breaking new ground here.

With ST 2110 almost here, what would you say to broadcast facilities that have already committed to other IP formats, such as ASPEN, NMI, or NDI?
SMPTE ST 2110 is part of a movement toward one common IP mechanism for broadcasting. According to a survey conducted by the Alliance for IP Media Solutions, 70% of AIMS members are going to have ST 2110 equipment this year. If one is deploying an IP-based solution at the end of this year, then a SMPTE ST 2110-based solution is plausible. This is a huge change from just a year ago and why the NAB [2017] interop in the IP Showcase was so important.

Do the solutions do everything that is outlined in the JT-NM [Joint Task Force on Networked Media] open roadmap to interoperability, which includes auto discovery and connection management? Not yet, but neither does any other solution being offered. But the basic media-essence aspect of it is ready.

The industry is now behind a common method for implementing all-IP. That’s why 50 or 60 vendors and operators are now members of AIMS: to drive interoperability behind one standard so that everybody in the industry benefits.

Another big theme at NAB 2017 was virtualization. How is Ericsson looking to move toward more-virtualized workflows and away from the iron-appliance model?
We [demonstrated] MediaFirst Encoding Live [at NAB 2017], which has now been completely virtualized. Everyone has heard of network-function virtualization and software-defined networking. Well, this is media-processing–function virtualization and software-defined media processing. It’s the same concepts, applied to media processing.

The basic functions performed within encoders, transcoders, packagers, etc. are all virtualized: that is, converted to microservices, with the software written in such a way as to completely abstract it from the underlying hardware. In this fashion, instances of each microservice can be launched in the cloud (public or private) or combined into a local appliance. And we changed the licensing structure to support a service-oriented model.

We’re not leaving the appliance model behind but rather offering our customers more flexibility in how they deploy our solutions. The same virtualized code that runs in the cloud can also be run in a single physical appliance. Even if you have it in the data center somewhere, at the end of the day, software has to run on something, whether it’s a private cloud, a public cloud, or in one’s facility.

Ericsson’s theme at [NAB 2017] was “Transforming Television,” and that’s what this is really about. We have not abandoned purpose-built hardware; we’re just using it smartly. In one case, we had a soft launch of our new platform, known as MediaFirst Content Processing. The first functionality on this platform happens to be a real-time 4K professional decoder/receiver. The current state of an all-COTS [commercial off-the-shelf] server platform results in several seconds of latency when decoding 4K HEVC, whereas a hybrid approach makes sub-half-second latency possible.

Where do you believe the industry will be a year from now in terms of live IP workflows?
It all depends on how much automation vs. manual configuration one wants to do. We’re there with regard to the media-essence streams. A year from now, it will be mature. Right now, configuration must be done manually, but, once auto discovery, registration, and connection management are implemented (some of this is specified now, but some parts are still in progress), per the JT-NM roadmap, configuration will become automatic and dynamic. That’s one of the cool things about moving to all-IP.

You still have to be smart in how you manage the traffic, of course. One of the new standards in the SMPTE ST 2110 suite that is still a draft, ST 2110-21, is all about sender-traffic shaping. [With] regular IP traffic, you just burst it right out at full line rate. To have guaranteed quality of service, traffic shaping needs to be done to manage the peaks and prioritize bandwidth usage.
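To make the traffic-shaping point concrete, here is a generic packet-pacing sketch. It is not the normative ST 2110-21 sender model; it simply illustrates the idea of spreading a frame’s packets across the frame period instead of bursting them at line rate, with a hypothetical send callback.

```python
# Generic illustration of sender-side pacing (not the normative ST 2110-21 model):
# instead of bursting a frame's worth of packets at line rate, spread them evenly
# across the frame period so downstream switch buffers see a smooth stream.
import time

def paced_send(packets: list[bytes], frame_period_s: float, send) -> None:
    """Send one video frame's packets evenly spaced over the frame period."""
    gap = frame_period_s / max(len(packets), 1)
    next_send = time.monotonic()
    for pkt in packets:
        now = time.monotonic()
        if next_send > now:
            time.sleep(next_send - now)   # wait for this packet's transmit slot
        send(pkt)
        next_send += gap

# Example (hypothetical): spread a 2160p60 frame's RTP packets over 1/60 s.
# paced_send(frame_packets, 1 / 60, udp_socket_send)
```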
