Sustainability, Streaming, and the Data Attribution Challenge

At the moment, "sustainability" is one of the most common buzzwords in the streaming sector. To be fair, this buzzword is more important (and more broadly engaging for wider society) than "3D," "metaverse," "NFT," or "AI." Unlike with those fads, everyone is forming a view on the power efficiency issues surrounding streaming.

The problem is that, with so many trying to establish a "sustainability advantage" for marketing purposes, there are abundant sustainability claims based on siloed, isolated logic, and case studies of "improvements" that sound nice in marketing material but are relatively meaningless once you explore the validity of the claim in a context broader than a single element of the streaming supply chain.

This adds up to many (frankly, I would venture most) companies trying to do the right thing, but inadvertently greenwashing over their real impact as a business while focussing on a single elementary improvement. For example, arbitrary promises to reduce "90% of the power demand of your streaming workflow" for what turns out to be just a small, discrete subsystem of a larger end-to-end workflow might actually mean a saving of just (for example) 0.5% of the total system energy. Worse, it may in fact cause upstream or downstream power demands to soar by many times over.
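To make the arithmetic in that example concrete, here is a minimal sketch in Python. The function names, the 0.56% subsystem share and the 2% knock-on figure are all illustrative assumptions, chosen only to reproduce the "90% becomes roughly 0.5%" example; none of them are measurements.

```python
def end_to_end_saving(subsystem_share, subsystem_reduction):
    """Fraction of total system energy saved by improving one subsystem.

    subsystem_share: the subsystem's share of end-to-end energy (0..1).
    subsystem_reduction: how much of the subsystem's own energy is cut (0..1).
    """
    return subsystem_share * subsystem_reduction


def net_saving(subsystem_share, subsystem_reduction, knock_on_increase):
    """Same as above, minus any extra energy the change causes elsewhere.

    knock_on_increase: added upstream/downstream energy, expressed as a
    fraction of the original end-to-end total. A negative result means the
    "saving" made the whole system worse.
    """
    return end_to_end_saving(subsystem_share, subsystem_reduction) - knock_on_increase


# Assumed figure: the subsystem draws 0.56% of the end-to-end energy.
headline = end_to_end_saving(0.0056, 0.90)
print(f"Headline 90% cut, end-to-end saving: {headline:.2%}")

# Assumed figure: the change adds 2% to the rest of the chain.
print(f"Net of knock-on demand: {net_saving(0.0056, 0.90, 0.02):.2%}")
```

With these assumed inputs the headline 90% shrinks to about half a percent end-to-end, and the net figure turns negative as soon as the knock-on demand exceeds that.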

Just because a new video compression scheme might save 10% of bandwidth, we must ask ourselves whether that actually means the underlying operators will scale down the available infrastructure by 10%. And if adopting it means transcoding a vast library of long-tail content that no one will ever watch in any volume, then the energy used in the transcode may far outweigh the extra infrastructure power required to distribute those occasionally viewed streams in their older "10% less efficient" format. Streaming workflows need to be considered holistically to ensure that atomic efficiency improvements in individual subsystems actually improve the end-to-end efficiency, particularly when you factor in the cost of change to downstream business operations and so on. Rolling out a new codec has huge costs, not only commercially but sometimes in churn of hardware and the provisioning of that new hardware.
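The long-tail transcoding question above can be framed as a simple break-even calculation. The sketch below is only an illustration: the function name and every figure in it (500 Wh to transcode a title, 1 Wh saved per view, 40 lifetime views) are hypothetical, and it generously assumes that a per-view distribution saving materialises at all.

```python
def break_even_views(transcode_energy_wh, saving_per_view_wh):
    """Views needed before a re-encode repays its own energy cost.

    Returns float('inf') when there is no per-view saving, which is the
    case if operators do not actually scale their infrastructure down.
    """
    if saving_per_view_wh <= 0:
        return float("inf")
    return transcode_energy_wh / saving_per_view_wh


# Assumed figures for one long-tail title.
views_needed = break_even_views(transcode_energy_wh=500, saving_per_view_wh=1)
expected_lifetime_views = 40

print(f"Break-even at {views_needed:.0f} views")
print("Worth re-encoding:", expected_lifetime_views >= views_needed)
```

Under these assumptions a title watched a few dozen times never repays the energy spent re-encoding it, and if the per-view saving is zero the break-even point is never reached.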

The science has not yet developed a full picture, and even the most robust sustainability-focussed analyses of digital media workflows typically end up with a deep understanding of one horizontal layer, yet poor granularity over most of the other vertically integrated layers.

I personally believe that this challenge is exacerbated by a culture of thinking that has been evolving with the power and climate science surrounding streaming and digital media: the technique of data attribution. In a multi-tenant/multi-service/multi-user shared infrastructure environment such as the internet, it has become common practice to attribute only a proportion of the energy required (to make the service available) to each particular service provider or user. Simplistically, if you share a cloud server with another cloud user, and only 10% of the data traffic or 10% of the CPU cycles are used by your service, then you take responsibility for 10% of the power. If you only use that 10% for an hour a day, then you will attribute only 1/24 x 10% of the server's power to your service.

If that server draws (again, for simplicity) 100W, then your service is argued to be using only ~0.4W of that server's power, and that is what your "carbon accounting" will then use to contribute to your company's sustainability story.
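The attribution arithmetic described above is short enough to write out. This is a sketch of the simplistic model as the text describes it, not of any particular carbon accounting tool; the function and parameter names are my own.

```python
def attributed_power_w(server_power_w, usage_share, hours_per_day):
    """Average power attributed to one tenant under a usage-share model."""
    return server_power_w * usage_share * (hours_per_day / 24)


# Figures from the text: a 100W server, a 10% share, used one hour a day.
power = attributed_power_w(server_power_w=100, usage_share=0.10, hours_per_day=1)
print(f"Attributed power: {power:.2f} W")
```

The result is 100 x 0.10 x 1/24, a little over 0.4W, which is the figure that ends up in the sustainability report.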

This, at first sight, seems entirely fair, doesn't it? If many service providers are using infrastructure, then logically that is the only way to divvy up the responsibility. It provides an easy method of carbon accounting and makes the picture look clear-cut for the individual company. The less data or CPU used, the greener the operation. The more you share the infrastructure, the less you need to feel responsible for the fact it is powered up and using energy all the time.

Yet it is highly misleading. This model ignores a key fact about what it takes to make the service available ("availability" being a word of paramount importance here). Either that server is dedicated to your service and can be turned on and off, in which case you are likely using the entire 100W for the short time it is on; or it is a shared resource that is always on, and for your service to be active on an ad-hoc basis that resource needs to be available 100% of the time. So for it to be available to you (regardless of how much you or others actually use the service), there is a 100W energy use for the entire time it is available.
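One way to see the gap is to compare what all the tenants of a shared, always-on server would attribute to themselves with what the server actually draws over a day. The sketch below is illustrative only: the tenant list is invented, the function names are my own, and it assumes the server draws a constant 100W whether busy or idle.

```python
def attributed_energy_wh(server_power_w, tenants):
    """Daily energy (Wh) that a usage-share model assigns across tenants.

    tenants: list of (usage_share, hours_per_day) pairs.
    """
    return sum(server_power_w * share * hours for share, hours in tenants)


def actual_energy_wh(server_power_w, hours_available=24):
    """Daily energy (Wh) drawn by a server that is simply kept switched on.

    Simplification: constant draw regardless of load.
    """
    return server_power_w * hours_available


# Assumed tenants sharing the 100W server: (usage share, hours per day).
tenants = [(0.10, 1), (0.30, 4), (0.20, 8)]

attributed = attributed_energy_wh(100, tenants)
actual = actual_energy_wh(100)
print(f"Attributed: {attributed:.0f} Wh, actual: {actual} Wh, "
      f"unaccounted: {actual - attributed:.0f} Wh")
```

In this made-up case the tenants collectively account for a small fraction of the energy the server uses simply by being available; the rest appears on nobody's books.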

Now obviously busy CPUs use more power than quiet ones, and if anything this further compromises/complicates the attribution models in a shared infrastructure. But this generally distracts from the real issue that the server is ON all the time. Data attribution tends to allow operators to move workloads to the cloud and offload their power requirements in a way that accountants love but makes almost no difference to the energy being used. "Net Zero" is achieved … by outsourcing the problem off the books.

An Alternative Energy Attribution Model

To highlight the challenge to the Greening of Streaming members, at the December meeting I shared a high-level diagram that I drew up to show all the telecoms infrastructure that has to be plugged into the electricity grid to enable a live stream to be delivered.

While I initially did this for my own interest, it has engaged so many people that I want to share it with you here:

[Figure: End to End Energy for Streaming]

As you can see, it shows all network layers and all the physical hardware that needs power, in a schematic model that can be generically abstracted for fiber, copper, cellular and wireless access.

Each "router" can be thought of as a branching point. As you flow left to right, at each router the entire right-hand side subsequent to the router may often be replicated, as the origin-to-edge distribution fans out to scale up for demand.

So, as you can see, that is a lot of infrastructure that has to be available end-to-end for a live streaming service provider to offer that stream. And that's regardless of whether you deliver one stream, or one stream per branch, or a million streams …
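The fan-out described above can be sketched as a simple tree: every branching point replicates everything to its right. The per-tier wattages and the fan-out of 4 below are invented for illustration, as is the function name; the point is only that the total depends on how far the tree fans out, not on how many streams flow through it.

```python
def always_on_power_w(tier_power_w, fan_out):
    """Total power of a distribution tree that is kept available.

    tier_power_w: power per device for each tier, origin first.
    fan_out: number of downstream copies created at each branching point.
    """
    total = 0
    devices = 1  # a single origin at the left of the diagram
    for power in tier_power_w:
        total += devices * power
        devices *= fan_out  # everything to the right is replicated
    return total


# Assumed tiers (W per device): origin, core router, edge cache, access node.
tiers = [500, 300, 100, 10]
print(always_on_power_w(tiers, fan_out=4), "W, whether 1 or 1,000,000 streams")
```

Here the total is 500 + 4 x 300 + 16 x 100 + 64 x 10 = 3940W of equipment that has to be powered for the service to be available at all.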

Today, data attribution models account for all of that infrastructure only while data is being passed along the chain, even though the chain is often available all the time.

Currently there is considerable work (IBC/Dimpact etc) focussing on the set-top box and stream decoder/CPE (the orange box on the diagram), and there is much focus on the production side (Albert/BAFTA), which is technically pre-encoder in my diagram.

Few, if any, reports so far have really looked industry-wide at what power is required in the distribution chain to make streaming services available to mass-market audiences.

While much of that infrastructure may be shared ("70%+ of all network traffic is video" etc), with each share a sense of reduced individual responsibility grows, and this lessens the individual operator's sense of "corporate social responsibility" for the problem. However, in practice nearly all of this infrastructure is "on" nearly all of the time, and as each service provider adds their demand to that infrastructure, the infrastructure has to scale its peak availability to ensure that all those services can be delivered. So, at the moment, reducing data flows rarely reduces power.

Data attribution risks creating a dust cloud of data points that look good out of context, but are meaningless if the industry isn't joined up end-to-end and taking responsibility proactively, rather than using careful accounting attributions to make it someone else's responsibility.

Indeed, it may be that if we ask the right organisation in a delivery chain to do something less power efficient in its own domain, all the downstream infrastructure providers and devices could be many, many times more power efficient with little or no change or effort. Such a move may not be the most obviously commercially beneficial, but it may in fact produce massive benefits right through the end-to-end system.

One thought experiment I have heard thrown around is the idea of rolling everything back to MPEG-2 (low-energy decodes, often DSP-supported thanks to codec maturity, broad extant technology support and more). In terms of the latest codec features and compression ratios, reverting to MPEG-2 is a horrifying idea to anyone who has been working in this sector, but it is potentially a very energy-efficient way to deliver video over IP networks.

The assumption that data efficiency always equates to power efficiency is unproven in my mind, and we must be cautious about simply using data rates (which are admittedly easy to measure for streaming content) as an indicator of power and therefore of "green" credentials.

I take issue with blind acceptance of data attribution models, and I think the industry needs to robustly test and question those measures before using them as part of our marketing and sustainability story. In the meantime, we need to take on the responsibility of looking at where we can practically engineer for reduced power usage across the board, leveraging all our expertise in running high-scale, complex distributed systems to ensure we collectively do that as power efficiently as possible. We also critically need to talk across our vertical integration points to ensure we actually reduce power consumption rather than just offload it to the next layer of the supply chain and then use data attribution to "dissolve" the responsibility.

[Also worth noting: the data attribution model tends to take the focus away from the fact that one streaming operator may achieve the same result as another but with vastly different energy impacts in doing so. Having 100 sales executives touring the planet in aeroplanes to achieve their targets may not compare favourably to an energy-smart virtual sales force achieving the same target. This joined-up accountability of energy needs to be considered as part of the whole picture when a company declares it is "green" or "net zero."]

I am sure other members of Greening of Streaming will hold their own opinions, but this is very much the discussion and focus that I find resonates when I talk to other members.

If you would like to join and weigh in on this or other discussions, or get involved in our first major project this summer (to establish power usage across the various elements in the chain for a major live streaming event), please reach out. And if you are working on best practice ideas for sustainability and power efficiency projects in the streaming sector, we would be keen to hear from you too!





