Deutsche Telekom Presents World-First 6K VR Live Stream
The jury may still be out on virtual reality (VR) as a live event experience, but that's not stopping service providers intent on proving it can offer viewers the best seat in the house.
The latest to try is German telco Deutsche Telekom, which pulled off a world-first over the weekend by live streaming VR to consumers in 6K resolution.
"VR has to move beyond technical gimmicks and beyond a showcase," explains Stephan Heininger, Deutsche Telekom's head of virtual reality. "From a business point of view, our goal is to gain market share in VR/AR. We want to move beyond proof of concept into regular monthly or weekly live VR productions for multiple sports and music events."
The telco broadcast a basketball match between Bonn and Oldenburg from the Telekom Dome in Bonn to several thousand viewers using the free OTT app Magenta VR on Android smartphones, Samsung Gear VR, Oculus Go, and Daydream headsets.
The broadcast gave users the ability to switch between views from two 360° cameras, one positioned above one of the baskets and another at the half-court line.
"Most current live streamed VR is output as an Equirectangular Projection, which typically sends a single 4K 360° picture to every device," explained Carl Furgusson, VP portfolio strategy, MediaKind. "You would need 25-30Mbps in the home to realistically receive it any kind of decent HD resolution"
The telco's aim, with the help of MediaKind, is to cut the bit rate whilst improving the picture quality.
"Part of the problem with equirectangular views is that you are encoding and sending the whole 360° image when a user is only ever looking at 12% of the total picture at any one time," says Furgusson. "This is a waste of bits."
Instead, DT's R&D team T-Labs are using a process devised by Rotterdam-based Tiledmedia called ClearVR. This software segments the 360° video into 'tiles' of around 500x500 pixels and transmits only the tiles that are actually visible in the user's direct field of view. At the same time, a lower-resolution (2K) copy of the full panorama is transmitted, ensuring there are no black holes when you turn your head.
All this has slashed bitrates to 8-12Mbps.
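The idea of sending only the tiles in view can be sketched as follows. This is a simplified illustration, not Tiledmedia's ClearVR algorithm: it assumes the panorama is cut into a regular grid in equirectangular layout (the production workflow uses a cubemap), and the function name, grid size, and field of view are all hypothetical.

```python
import math

def visible_tiles(yaw_deg, pitch_deg, hfov_deg, vfov_deg, cols, rows):
    """Return the (row, col) tiles that overlap a viewport on an equirectangular grid.

    Simplification: the viewport is treated as a rectangle in yaw/pitch space,
    so the extra horizontal coverage needed near the poles is ignored.
    """
    tile_w = 360.0 / cols   # degrees of yaw per tile
    tile_h = 180.0 / rows   # degrees of pitch per tile

    # Vertical extent, clamped to the poles; row 0 is the top of the frame.
    top = min(90.0, pitch_deg + vfov_deg / 2)
    bottom = max(-90.0, pitch_deg - vfov_deg / 2)
    row_first = int((90.0 - top) // tile_h)
    row_last = min(rows - 1, int((90.0 - bottom) // tile_h))

    # Horizontal extent; column indices wrap around at 360 degrees.
    left = yaw_deg - hfov_deg / 2
    col_first = math.floor(left / tile_w)
    col_last = math.floor((left + hfov_deg) / tile_w)

    tiles = set()
    for r in range(row_first, row_last + 1):
        for c in range(col_first, col_last + 1):
            tiles.add((r, c % cols))
    return tiles

# Hypothetical 12 x 6 grid and a 90 x 90 degree viewport looking straight ahead.
tiles = visible_tiles(yaw_deg=0, pitch_deg=0, hfov_deg=90, vfov_deg=90, cols=12, rows=6)
print(len(tiles), "of", 12 * 6, "tiles requested")
```

With these assumed numbers the client would request 16 of 72 tiles at full resolution and rely on the low-resolution panorama for everything else.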
Diving into more detail about the workflow: the courtside live VR cameras were small Z Cam S1s, each recording four 4K feeds at 30Hz, one from each of the unit's four wide-angle lenses. These individual streams are fed to live production software from Imerve (developed by former members of Nokia's Ozo VR team) running on Nvidia GPUs, which stitches them into a 6K equirectangular image.
That file is HEVC encoded and contributed over 1Gig fibre at 200Mbps from the stadium to a local ISP peering point and from there to Google Cloud. There it undergoes cubemap conversion and is segmented into tiles, encoded, and packaged in five-second bursts for sending to Akamai, the origin server and CDN.
A ClearVR SDK in the Magenta VR client decodes the file and retrieves new tiles from the cloud, adapting to local bit rate. It also buffers a number of tiles at the client.
"The original Tiled Streaming technology was designed to support adaptive streaming, with multiple layers to allow zooming and panning in ultra-high-resolution imagery," explains Frits Klok, Tilemedia founder and CEO. "We applied these principles to VR streaming, where the 'client' has the logic and the flexibility to retrieve the layer that best suits the viewport and network conditions."
In the demo I witnessed on both a tablet and an Oculus over Wi-Fi in the arena, the gap before the high-resolution tiles appear after a head turn was barely discernible; indeed, it happens within 20-40 msec, although there were frequent pauses in the live stream. The VR stream used an audio mix taken from the 2D live broadcast, produced by NEP.
The 2K base layer is transmitted at 2Mbps. This fallback layer ensures there are no black holes while new tiles are fetched, and also takes care of an incredibly short 'motion-to-photon delay' (the delay that will make you sick if it's too long).
"That delay is as low as it can possibly be, because it only depends on the local processing," says Klok. "Typically, there will be over a hundred tiles. These are independently coded and stored on CDN, where the client can find them. The client has the logic to request the tiles it needs, decode them, and then rearrange them for rendering on the device."
All of which brings glass-to-glass latency in this demo to around 30 seconds, although the team believes they can cut that by half by working with 2-second chunks of video and experimenting with protocols like SRT rather than HLS.
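To see why shorter chunks help, it can be useful to split latency into a part that scales with segment length and a part that does not. The sketch below is a back-of-envelope model under stated assumptions, not the team's measured breakdown: it assumes the player starts three segments behind the live edge (the HLS specification advises clients not to start closer than three target durations from the end of the playlist), that one further segment of delay accrues while each segment is being produced, and that the remainder is a fixed overhead chosen so that five-second segments give the reported 30 seconds.

```python
def glass_to_glass_seconds(segment_s, fixed_overhead_s,
                           segments_behind_live=3, segments_in_production=1):
    """Estimate end-to-end latency as fixed overhead plus segment-dependent delay."""
    segment_dependent = (segments_behind_live + segments_in_production) * segment_s
    return fixed_overhead_s + segment_dependent

# The fixed overhead (capture, stitching, contribution, encoding, CDN) is NOT
# from the article; it is back-solved so that 5-second segments yield 30 seconds.
FIXED_OVERHEAD_S = 30 - 4 * 5  # 10 seconds, assumed

for segment_s in (5, 2):
    total = glass_to_glass_seconds(segment_s, FIXED_OVERHEAD_S)
    print(f"{segment_s}s segments -> about {total}s glass to glass")
```

With these assumed values the model lands at about 18 seconds for two-second segments, so reaching half of 30 seconds would also need savings in the fixed part, which is where a change of transport protocol might come in.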
"It's a trade-off between building in fault tolerance throughout chain and taking the latency down," says Fergusson.
MediaKind's main role in the demonstration was to transfer the streams in and out of the cloud and manage the parallel encoding in between.
"We want to offer VR encoding as a service," says Fergusson. "The workflow demonstrated here is inherently scalable since as you go up in resolution you can scale up the massively parallel encode process."
He said that Google Cloud was not only cheaper than AWS on this occasion but that, on a practical level, the nearest Google-hosted GPUs were close by, in a data centre in Amsterdam.
As for the HMD experience itself, it was decent and shows a maturing of this type of live event workflow. The cubemap image aggregated from the multiple 4K lenses may have been 6K but the final output was HD.
"The challenge for VR live streaming is that you ideally need source resolution in the range of 8K-10K to generate a 360° image that won't pixelate," says Fergusson.
This is possible he thinks using arrays of higher end cameras from Blackmagic Design and Red, or rigs designed at Germany research institute Fraunhofer. Such cameras would also have greater dynamic range and speeds to capture pictures in the ultra-bright lighting conditions of an indoor arena. They are also bulky and wouldn't be as discrete or portable at games as Z-cam type models.
Another bottleneck is that most HMD's are HD. Another is that the stitching production software isn't capable of handling that data in real time yet.
"But we want to push the resolution up to 8K or 10K, and that will be one of our next tests," says Heininger.
Tiledmedia partnered with Intel at IBC 2018 in September to show a live 8K VR workflow.
Deutsche Telekom has even made a test of 32K in its labs, according to Heininger.
5G rather than fibre could be substituted for backhaul, and the next-gen wireless network would open up the bandwidth for transmission, but Deutsche Telekom are impatient to get a service up and running.
"We don't want to wait for 5G," says Heininger. "We believe there are applications for great quality live VR today and we believe it must be interactive (social), not passive."
Feedback from this test will help the telco adapt the experience for other sports properties it holds, including ice hockey and the third division of football league Bundesliga.
"Basketball has a small court, a big ball and a slow game—all things that work well for VR," he explained. "We want to move to hockey which has fast action, small pucks, and bright arenas and then get this into the bigger, outdoor pitches of soccer stadia. When home games at clubs like Bayern Munich are over-subscribed the club can monetize a VR position from the stand many times over for fans wanting that 'at the game' experience."
Production company Magnum Films managed the whole project and also fielded a number of other VR cameras for post producing into a 6-minute highlights video of the game to sit on the Magenta VR app.
"The biggest challenge for us doing this was that this was our first live VR so it's all a learning process," said project manager Stefan Kleinalstede. "For instance, we have analysts and a host presenter, but they need to remember that they are always in shot."
Iconic Engine, the AR/VR/MR development wing of LA visual effects facility Digital Domain, designed the Magenta VR app.
Production crew monitors the stitching into equirectangular from the camera feeds