The Algorithm Series: Video Player Performance
You've finished encoding a fairly tricky piece of content, one that involved a bit more quality control than normal for a few complex scenes, and you're ready to release it for external consumption. But first you need to show it to management, so you upload the stream to your pre-release staging server and text your boss the URL. A few minutes later, you get a text reply asking why the video quality is so bad.
What does the boss mean about the video looking bad, and how do you troubleshoot the problem? Is it a problem with a particular scene, a glitch in the media server, an outdated player on the mobile device the boss is using to watch the video, or even the bandwidth on the corporate VPN?
Welcome to the intricate and complex world we call streaming media.
In the previous article in The Algorithm Series, we looked at the math behind CDNs. The upside is that CDNs deliver exactly what they're given and usually do a very good job at it. But sometimes, the acquisition side (e.g., encoding for on-demand content) introduces an anomaly that passes through the CDN and on to the end user, resulting in substandard playback.
In the scenario I just mentioned, how do algorithms for encoding, transfer, and playback intersect at the end user's player application? That's what we're looking at in this article on player performance.
Encoding and Delivering
"Encode once, deliver everywhere" is a slogan we've heard throughout the history of streaming media, and it's a goal we've achieved with varying levels of success. In the early days, it meant using the proper codec and player combination, as encoders, media servers, and end-user players were all part of the same ecosystem, such as paid solutions provided by Adobe, Microsoft, or Real.
The problem was that "everywhere" only meant the walled garden of one of those proprietary solutions. If a company used Microsoft but its clients used Real, content had to be encoded once for each streaming platform.
Things got a bit better on the encoding front with the advent of H.264 (aka Advanced Video Coding, or AVC), which was often stored in an MPEG-2 or MPEG-4 container format. But then different flavors of HTTP-based delivery came along, such as Smooth Streaming, Adobe HDS, and Apple HTTP Live Streaming (HLS). These required, at the very least, multiple encodes at selected bitrates (known as adaptive bitrate, or ABR) or multiple segmentation steps to deliver content in each of the proprietary HTTP segment sizes and manifest files.
Most of those issues have been resolved, thankfully, with a few proprietary formats forming the basis for the industry-standard MPEG-DASH approach. At the same time, we've seen Apple's HLS move to the fragmented MP4 (fMP4) approach used by DASH.
So there's nothing to worry about when encoding ABR content, because it will all be delivered based on the proper bandwidth at any given time, right? Yes and no. Here are three things to consider when ABR content is delivered to an ABR-capable player.
How much bandwidth is available?
This is one of the main questions for proper ABR player performance. And it matters not just at any given moment but also in the moments leading up to it, keeping in mind (as most stockbrokers mention in their sales pitch to prospective clients) that past performance is no guarantee of future results. This is key because so much research assumes optimal decisioning at the player when it comes to choosing which bitrate-appropriate ABR segment to request next from the manifest or MPD file.
At PV '18, the 23rd Packet Video Workshop, Brightcove's Yuriy Reznik and several colleagues presented a paper titled "Optimal Design of Encoding Profiles for ABR Streaming." While it describes ways to model network bandwidth and the probability that a given ABR stream will be chosen (more on that shortly), it's worth considering two different algorithmic approaches to address the scheduling issue.
The first deals with the introduction of a smoothing filter to estimate bandwidth, as noted in "Design of Scheduling and Rate-Adaptation Algorithms for Adaptive HTTP Streaming," an article by Stephan Hesse, written when he worked at Fraunhofer/HHI and funded in part by the European Union's Seventh Framework Programme (FP7) Open ContEnt Aware Networks (OCEAN) project (see Figure 1). Reznik and his co-authors cite it in their paper as an example of a practical way in which an "ABR streaming client estimates available bandwidth … and then decides which of the encoded streams to pull next" to utilize as much available bandwidth as possible.
"A well-known type of a smoothing filter that we have found suitable for our purposes is the exponential moving average filter," writes Hesse. "Using this filter, the current smoothed bandwidth estimate Ck is obtained as a weighted average of the current bandwidth measurement Tk and the previous smoothed estimate Ck−1," which yields the following formula:
C_k = (1 − α)T_k + αC_{k−1}
In this formula (Formula 3 in the article), α ∈ (0, 1), which means α is a specific number between, but not including, 0 and 1. So it's a decimal above 0.00 but below 1.00 that forms what Hesse says is the filter parameter or "smoothing factor."
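To make the recursion concrete, here is a minimal Python sketch of an exponential moving average filter. The function name, the sample values, and the choice to seed the filter with the first measurement are assumptions made for illustration; they do not come from Hesse's article.

```python
def smooth_bandwidth(measurements, alpha, initial_estimate=None):
    """Apply C_k = (1 - alpha) * T_k + alpha * C_{k-1} to each measurement T_k."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    estimates = []
    previous = initial_estimate
    for measured in measurements:
        if previous is None:
            # Assumption: with no prior estimate, start from the first measurement.
            current = measured
        else:
            current = (1.0 - alpha) * measured + alpha * previous
        estimates.append(current)
        previous = current
    return estimates


# Hypothetical throughput samples in Mbps, with one brief dip.
samples = [5.0, 5.0, 1.0, 5.0, 5.0]
print(smooth_bandwidth(samples, alpha=0.8))  # dip is heavily damped
print(smooth_bandwidth(samples, alpha=0.1))  # dip passes almost straight through
```

With α = 0.8, the dip to 1.0 Mbps only pulls the estimate down to about 4.2 Mbps, while with α = 0.1 the estimate falls to about 1.4 Mbps.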
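Hesse goes on to note that the expansion of this recursion yields a weighted sum of past measurements. Unrolling the recursion above gives:
C_k = Σ_{i=0}^{k−1} W_i T_{k−i} + α^k C_0
where C_0 is the initial estimate and W_i = (1 − α)α^i are effective weights applied to previous measurements T_{k−i}.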
This, in effect, assigns a weight to each measurement, and the weights are then plotted "for several possible values of parameter α" to find the most reliable way to estimate bandwidth.
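The weighting can be checked numerically. Unrolling the recursion C_k = (1 − α)T_k + αC_{k−1} by hand gives weights of the form (1 − α)α^i on the measurement taken i steps ago, plus a term α^k C_0 for the initial estimate. That closed form is derived here from the recursion rather than copied from Hesse's article, and the function names and sample values below are hypothetical.

```python
def effective_weights(alpha, count):
    """Weights (1 - alpha) * alpha**i for i = 0 .. count - 1 (i = 0 is the newest sample)."""
    return [(1.0 - alpha) * alpha ** i for i in range(count)]


def ema_recursive(measurements, alpha, initial_estimate):
    """Run the recursion one measurement at a time."""
    estimate = initial_estimate
    for measured in measurements:
        estimate = (1.0 - alpha) * measured + alpha * estimate
    return estimate


def ema_expanded(measurements, alpha, initial_estimate):
    """Compute the same estimate as an explicit weighted sum."""
    k = len(measurements)
    weights = effective_weights(alpha, k)
    newest_first = reversed(measurements)
    weighted_sum = sum(w * t for w, t in zip(weights, newest_first))
    return weighted_sum + alpha ** k * initial_estimate


samples = [4.0, 6.0, 5.0, 2.0, 5.5]  # hypothetical Mbps readings, oldest first
for alpha in (0.2, 0.5, 0.8):
    print(alpha, [round(w, 3) for w in effective_weights(alpha, 4)])
print(ema_recursive(samples, 0.8, 3.0), ema_expanded(samples, 0.8, 3.0))
```

The printed weights fall off quickly for a small α and slowly for a large one, which is the behavior the plots are meant to show.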
"The value of smoothing factor α affects the degree of reliance of bandwidth estimate on past measurements," writes Hesse. "If α approaches 0 the filter becomes all-pass and it simply ignores all past measurements."
However, if α increases, there will be less reliance on the most recent measurement and more reliance on previous measurements. Why would that be desirable? Hesse notes that the client buffer may be able to absorb some intermittency in bandwidth rather than requiring a switch to a different ABR bitrate.
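A short Python sketch shows how a larger α lets a player ride out a brief dip. The bitrate ladder, the throughput samples, and the selection rule (take the highest rung at or below the smoothed estimate) are all assumptions for illustration, not Hesse's actual rate-selection logic.

```python
def pick_bitrate(estimate_kbps, ladder_kbps):
    """Assumed rule: highest ladder rung at or below the estimate, else the lowest rung."""
    affordable = [rate for rate in ladder_kbps if rate <= estimate_kbps]
    return max(affordable) if affordable else min(ladder_kbps)


def chosen_bitrates(measurements_kbps, alpha, ladder_kbps):
    """Smooth each measurement with the moving average, then pick a rung per segment."""
    estimate = measurements_kbps[0]  # assumption: seed with the first measurement
    choices = []
    for measured in measurements_kbps:
        estimate = (1.0 - alpha) * measured + alpha * estimate
        choices.append(pick_bitrate(estimate, ladder_kbps))
    return choices


LADDER = [400, 800, 1500, 3000, 6000]   # hypothetical ABR ladder in kbps
DIP = [7000, 7000, 1000, 7000, 7000]    # one short-lived throughput drop

print(chosen_bitrates(DIP, alpha=0.9, ladder_kbps=LADDER))
print(chosen_bitrates(DIP, alpha=0.1, ladder_kbps=LADDER))
```

In this example the α = 0.9 filter stays on the 6000 kbps rung throughout, while the α = 0.1 filter drops to 1500 kbps for one segment and then switches back.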
"On the other hand," he writes, "we also want the filter to react quickly enough if transmission rate measurements indicate a permanent change of channel bandwidth. This is important to allow [the] quantization unit to switch rate such as to avoid buffer-underrun … situations."
What if we (sort of) ignore bandwidth?
The second approach to handling rebuffering, one that Hesse notes in his dispar.at blog is a potentially better method, is the use of a Lyapunov optimization technique to "minimize rebuffering and maximize video quality" through a buffer occupancy-based algorithm that's been nicknamed BOLA. This approach doesn't measure bandwidth, but instead infers bandwidth availability from how full the end user's video player buffer is at any given point.
BOLA was introduced in a 2016 paper by Kevin Spiteri (University of Massachusetts–Amherst), Rahul Urgaonkar (Amazon), and Ramesh K. Sitaraman (Akamai). They argue that the ad hoc bitrate-selection algorithms in modern video players are poorly understood, which makes it hard to use them well when deciding on the bitrate of the next HTTP-delivered segment. "[W]e formulate bitrate adaptation," they write, "as a utility maximization problem that incorporates both key components of QoE: the average bitrate of the video experienced by the user and the duration of the rebuffer events."
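As a rough illustration of a buffer-based decision, the Python sketch below is loosely modeled on the basic decision rule as I understand it from the BOLA paper. For each available bitrate it scores the logarithmic utility against the current buffer level and picks the best score per unit of segment size. The parameter names v and gamma_p, their values, and the segment sizes are hypothetical, and this is a simplified sketch rather than the authors' reference implementation.

```python
import math


def bola_choose(buffer_segments, segment_sizes, v, gamma_p):
    """Return the index of the bitrate to fetch next, or None to wait.

    buffer_segments: current buffer level, measured in segments.
    segment_sizes: size of one segment at each bitrate, in ascending order.
    v, gamma_p: tuning parameters (hypothetical values are used below).
    """
    smallest = min(segment_sizes)
    best_index, best_score = None, 0.0
    for index, size in enumerate(segment_sizes):
        utility = math.log(size / smallest)  # 0 for the lowest bitrate
        score = (v * (utility + gamma_p) - buffer_segments) / size
        if score > best_score:
            best_index, best_score = index, score
    # If no score is positive, the buffer is full enough to pause downloading.
    return best_index


SIZES = [1.0, 2.0, 4.0]  # hypothetical relative segment sizes for three bitrates
for level in (0.0, 2.0, 3.5, 6.0):
    print(level, bola_choose(level, SIZES, v=2.0, gamma_p=1.5))
```

Note that no bandwidth measurement appears anywhere in the decision. With these assumed values, an empty buffer selects the lowest bitrate, a fuller buffer selects progressively higher ones, and a very full buffer returns None.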