SCMS 2018: Unstructured Data Rises as a Key Concern for Media-Management Professionals

With content coming in from all sources on the web, how are media managers keeping it all straight?

For major sports, media, and entertainment brands, managing a content library according to one’s own conditions using one’s own taxonomy can be challenging enough. Now, in the era of digital- and social-media production, organizations are increasingly ingesting content that has been produced and distributed by other people and/or organizations. That has led to a spike in the amount of “unstructured data” within a brand’s facility.

From left: CineSys-Oceana’s Brent Angle, Turner Sports’ Anne Graham, Zegami’s Roger Noble, and PGA Tour Entertainment’s Michael Raimono speak on the issue of unstructured data at SVG’s Sports Content Management and Storage Forum in New York City.

Why is this content making its way into facilities, and how are organizations handling it?

The answer to the first question is simple: media-asset managers want their hands on all of their brand’s best content. For an organization like the NBA, for example, loads of basketball content involving its coaches, players, and other key personalities are flying around social media on a nearly 24/7 basis. At SVG’s Sports Content Management & Storage Forum in New York City last month, Chris Halton, VP, media technology and operations, NBA, shared how the league has met these changes in the media landscape by closely monitoring digital and social feeds for content related to its property.

Once that content is ingested, however, the real work begins. Automated processes have been put in place to scan that content, make sense of the foreign data attached to it, and work it into the league’s existing metadata workflow.

“We have a very rich, fixed taxonomy and a vast archive that’s about 30 PB,” says Halton. “We log everything. We have all of the information from the start of the league, and we take full advantage of all of it. Every night, we are putting content out on Twitter, Instagram, every platform that you can imagine with the goal of using social media to drive subscription and to drive viewership on television platforms.”

When it comes to working through that unstructured data, organizations like Turner Sports have found it beneficial to develop deeper relationships with the entities producing the content that is regularly ingested by the network. From there, the relationship could be mutually beneficial.

Turner Sports’ Anne Graham is a member of SVG’s SCMS Advisory Committee.

“Really, the way that we are finding out about that,” says Anne Graham, assistant manager, media management, Turner Sports, “is good old-fashioned getting out on our feet and doing outreach with those units, talking with them, offering them our services so we can leverage an opportunity. If somebody needs some help putting together a retention schedule for how long they need to keep something, then, great, we get to inventory their content and see what they have and help them appraise that and decide if that’s going to go into the archives or how long it’s going to stay near-line or be deleted.”

Technology vendors are also playing a key role in assisting major brands with the challenges that come from unstructured data. Data integrators, such as CineSys-Oceana, help clients first analyze just how big their unstructured-data issue is by running indexing services that can scan petabytes’ worth of data in mere hours. Those services can apply existing taxonomy structures to that data and create new ones to accommodate the outside content in a usable way.

“I don’t think anybody had the payroll to have people search through millions and millions of files that end up accumulating,” says Brent Angle, CTO, CineSys-Oceana. “The first step for us is to actually find what they have and what the value is. Some of this data doesn’t have any value; it’s just bits and pieces from the sausage-making process that they don’t care about.”

Some vendors are even using machine learning to speed up and automate the process of identifying and organizing content that arrives as unstructured data. Companies like Zegami are going a step further and doing so with unsupervised machine learning.

“In that way, we can rapidly use that combined with our interface to quickly begin to start adding metadata and logging to a taxonomy where one didn’t exist previously,” says Roger Noble, co-founder/CTO, Zegami. “It’s combining that as a rapid-data-entry tool. We combine that also with similarity search so you can find content based on how it looks visually and surface up all of the other content that matches those similar characteristics, whether it’s a face or a logo.”
