Streaming Forum: Metaliquid Innovates with AI and Metadata
New AI implementations allow for content analysis to detect everything from faces to brands and even the type of content, offering value for content owners, brands, and consumers

In a session titled "AI Video Analysis Applications in the Media & Broadcast Industry" at Streaming Forum, Metaliquid head of business development and strategy Tommaso Cesano noted that linear TV is no longer viewers' first choice: most are no longer content to sit at home and watch television as they try to balance day-to-day life with premium content consumption.

"There's been a huge growth in content," said Cesano, who noted that the company's founders have backgrounds in big data, broadcast, and machine learning. "But we're no longer just users: we are interactive users. How can broadcasters use these new models to engage?"

Metaliquid extracts descriptive metadata from content. "Media companies have a very important asset in their libraries," said Cesano, "but most know very little about their library."

Deep learning, by Metaliquid's definition, offers both high performance and real-time analysis, making it more efficient than manual metadata extraction. Neural networks are algorithms that learn from examples and extract general features from them.

Using an example video, Cesano showed face recognition and detection of sensitive content, scene settings, exterior-interior shot settings, and saliency.

From the standpoint of saliency, Cesano showed examples of detecting logos, scoreboards, and even Formula 1 cars by their distinct patterns of logos and body shape. Cesano also noted that saliency identifies an event that is taking place, determines where the viewer's focus might lie, and indicates whether that focus includes a salient brand.

Sensitive content detection can pick up not just nudity but also weapons and blood. It can determine the content type as well, distinguishing news from movie or episodic TV content, to decide whether media alerts or decency warnings would be required.
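As a rough illustration of that decision, the sketch below combines detected labels with a content type to pick an advisory. Everything in it is an assumption made for illustration: the label names, the `SceneAnalysis` structure, the `advisory_for` function, and the rule that news gets a media alert while other content gets a decency warning. Cesano did not describe Metaliquid's actual logic or API.

```python
from dataclasses import dataclass

# Hypothetical label names; the session mentioned nudity, weapons, and blood.
SENSITIVE_LABELS = {"nudity", "weapons", "blood"}


@dataclass
class SceneAnalysis:
    content_type: str  # assumed values: "news", "movie", "episodic"
    labels: set        # labels detected in the scene


def advisory_for(scene):
    """Return an advisory string, or None if nothing sensitive was detected.

    Assumed policy (not from the session): sensitive news footage gets a
    media alert; sensitive movie or episodic content gets a decency warning.
    """
    if not (SENSITIVE_LABELS & scene.labels):
        return None
    if scene.content_type == "news":
        return "media alert"
    return "decency warning"


# Example: a news scene in which weapons were detected.
news_scene = SceneAnalysis(content_type="news", labels={"weapons", "crowd"})
advisory = advisory_for(news_scene)
```

A real system would presumably work with confidence scores and per-region rules rather than a fixed label set.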

"Despite being broadcasters' most important asset," said Cesano, "video content has been a black box, with no detailed description of what's happening scene-by-scene."

RAI, the Italian broadcaster, has almost 3 million assets that have yet to be explored, despite being designated as UNESCO Heritage content that needs to be both preserved and cataloged.

Content discovery is also key, both for genres and sub-genres. Cesano used the example of romance movies, where some content is platonic and other content has "more action" that may not be appropriate for all viewers. This also applies to e-commerce options for a more personalized experience, including offering an interactive information layer.

"We could also filter by content that has two politicians speaking to one another in a particular place," said Cesano.

"As voice control becomes more prevalent, we can see talking to the TV as being beneficial," said Cesano. "If the match started ten minutes ago, and you've missed the beginning, you could easily ask your TV to give a recap of what's happened from the beginning, without having to revert to watching the entire game from the beginning."

"Thanks to AI, we can understand when particular events happen, and not show brand insertion at that particular moment," said Cesano, using as an example a car crash during a motor racing event. Cesano wasn't initially clear on whether this would be automated brand removal at the time of the accident, but it is conceivable that brand removal could be an actionable use of brand recognition for real-time metadata analysis.

Finally, Cesano addressed real-world problems that real-time metadata analysis solves.

"Thanks to content recognition, we can automate electronic program guide (EPG) correction," said Cesano, for instance when the content being broadcast differs from what the EPG says.

Another problem that could be corrected is audio-to-video synchronization, a specific solution for the broadcast and media industry.
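To make the EPG correction case concrete, here is a minimal sketch of how a mismatch between the guide and the recognized content might be flagged. It assumes the recognition system emits timestamped titles and that the EPG is a list of time slots. The `EpgEntry` and `find_epg_mismatches` names are hypothetical and do not come from Metaliquid.

```python
from dataclasses import dataclass


@dataclass
class EpgEntry:
    start: int  # slot start, in seconds since the start of the day
    end: int    # slot end (exclusive), in the same unit
    title: str  # title the guide lists for this slot


def find_epg_mismatches(epg, detections):
    """Compare recognized titles against the guide.

    detections: list of (timestamp, recognized_title) pairs, assumed to
    come from a content recognition system.
    Returns (timestamp, scheduled_title, recognized_title) for each
    detection that disagrees with the guide.
    """
    mismatches = []
    for ts, recognized in detections:
        # Find the guide slot covering this timestamp, if any.
        scheduled = next((e for e in epg if e.start <= ts < e.end), None)
        if scheduled is not None and scheduled.title != recognized:
            mismatches.append((ts, scheduled.title, recognized))
    return mismatches


# Example: the news overruns into the slot the guide lists for the match.
epg = [EpgEntry(0, 3600, "News"), EpgEntry(3600, 7200, "Match")]
detections = [(1800, "News"), (3700, "News")]
mismatches = find_epg_mismatches(epg, detections)
```

Each flagged entry could then be used to update the guide automatically or be routed to an operator for review.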

Metaliquid offers both out-of-the-box services, like those mentioned above, and specific architectures built around custom recognition requirements. The company is exhibiting at stand N40 at BVE, which is running concurrently with Streaming Forum.
