CDN Optimization for VR Streaming

Virtual Reality (VR) 360° content is still very new, but that does not make viewer expectations any more forgiving. If anything, quality matters even more for VR content because the sense of immersion it promises can be broken by even a minor glitch in delivery.

Virtual Reality content is unique in many ways, the first being that VR video has physical implications in a way that other video does not. A low-quality VR video can cause motion sickness, for example, leaving a lasting, discouraging impression of VR experiences.

Additionally, 360° VR content is by its very nature extremely voluminous, which creates new challenges for VR streaming providers. Not only is there more video to stream in order to create a picture that surrounds the viewer, but the quality must also be very high. For example, YouTube recommends uploading 360° videos with a bitrate of 150 Mbps, which would make a five-minute video approximately 5.5 GB. For reference, the recommended bitrate for a standard video in 4K resolution is only 35 to 45 Mbps.
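
A quick back-of-the-envelope calculation shows where that figure comes from (a sketch using decimal units, where 1 GB = 10^9 bytes):

```python
# Approximate size of a five-minute 360° video uploaded at YouTube's
# recommended 150 Mbps bitrate (decimal units: 1 GB = 10^9 bytes).
bitrate_bps = 150 * 1_000_000      # 150 Mbps in bits per second
duration_s = 5 * 60                # five minutes
size_gb = bitrate_bps * duration_s / 8 / 1_000_000_000
print(size_gb)                     # 5.625
```

That is roughly the 5.5 GB cited above (about 5.2 GiB if counted in binary units), an order of magnitude beyond a typical HD stream.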

Streaming such high-quality 360° video uninterrupted and with minimal buffering requires a great deal of network bandwidth. While that level of capacity may be available on managed networks such as cable TV, delivering 360° video with acceptable quality of experience is a challenge on unmanaged networks like the internet. The open nature of the public internet means that consumers compete for bandwidth to receive content, and the path it takes to the viewer may not be consistent, even packet by packet.

Viewport Adaptive Delivery

One of the most common and efficient methods to decrease the bandwidth required by 360° content is to deliver only the content in the user’s current Field of Vision (FOV) in high quality while delivering the rest of the video in low quality. By prioritizing the content that the viewer is actually seeing at a given moment, providers can deliver high-quality experiences while preparing for a viewer’s every move without wasting bandwidth.

This is achieved by dividing each video frame into smaller pieces, called tiles. Tiles can be delivered individually based on the user’s current FOV and reassembled by the client before being passed to the video decoder. This technique of adapting the video stream based on the current viewport has been used as the basis for several different methods by different organizations.
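
The FOV-to-tile mapping can be sketched in a few lines. This is a minimal illustration, not a production implementation: the eight-column tile grid, the purely horizontal FOV, and the function name are all assumptions made for the example.

```python
# Hypothetical sketch: pick which tile columns of an equirectangular
# frame fall inside the viewer's horizontal field of vision (FOV).

def tiles_in_fov(yaw_deg, fov_deg, cols):
    """Return the tile-column indices covered by a horizontal FOV
    centred on yaw_deg, over a frame split into `cols` columns."""
    tile_width = 360 / cols
    covered = []
    for c in range(cols):
        centre = c * tile_width + tile_width / 2
        # Shortest angular distance between tile centre and gaze direction.
        delta = abs((centre - yaw_deg + 180) % 360 - 180)
        # Include the tile if any part of it overlaps the FOV.
        if delta <= fov_deg / 2 + tile_width / 2:
            covered.append(c)
    return covered

# Viewer looking straight ahead (yaw 0°) with a 90° FOV over 8 columns:
print(tiles_in_fov(0, 90, 8))  # [0, 1, 6, 7]
```

Only the tiles returned here would be fetched at high quality; the rest can be served at a low-quality fallback.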

One such method is called Tile-Based Adaptive VR Streaming. In this case, tiles are served using a custom packaging format for the video asset that provides random access to the tiles of a frame. High-quality tiles are fetched from the origin by content delivery network (CDN) edge servers using HTTP byte-range requests based on the client FOV, while the URL in the manifest for the video remains unchanged. This technique makes it easier to adapt both to network conditions and to new viewing angles as the viewer turns their head.
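
As a sketch, an edge server might translate a tile request into an HTTP byte-range request like this. The byte index mapping tile ids to offsets is hypothetical; a real packaging format would define its own layout.

```python
# Hypothetical byte index for one frame: tile id -> (offset, length).
# A real tile-aware packaging format would carry this in its own index box.
tile_index = {0: (0, 4096), 1: (4096, 4096), 2: (8192, 4096)}

def range_header(tile_id):
    """Build the HTTP Range header that fetches just one tile
    from the packaged asset at the origin."""
    offset, length = tile_index[tile_id]
    return {"Range": f"bytes={offset}-{offset + length - 1}"}

print(range_header(1))  # {'Range': 'bytes=4096-8191'}
```

Because only the Range header changes per request, the manifest URL the client sees never has to change.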

Figure 1. Splitting a 360° VR video into tiles at the origin server makes it easier to deliver to different kinds of devices at different quality levels across the CDN network.

Proximity-Aware Content Prepositioning

Another proven bandwidth optimization technique involves prefetching tiles that will be needed in the future based on client proximity and the user’s head movement. Such methods proactively load predicted content into a CDN cache server. This reduces the time it takes to replace a low-quality tile with a high-quality one in the user’s FOV by more than 50 percent, compared to not prefetching on the CDN.
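
The idea can be sketched in a few lines. The linear head-movement extrapolation, the 45°-wide tile columns, and the in-memory cache are all simplifying assumptions for illustration; production predictors are considerably more sophisticated.

```python
# Hypothetical sketch: predict the next viewing angle from recent head
# movement and warm the edge cache with the tile it will need.

def predict_yaw(samples):
    """Linearly extrapolate the next yaw (degrees) from the
    last two head-position samples."""
    velocity = samples[-1] - samples[-2]
    return (samples[-1] + velocity) % 360

cache = {}  # stand-in for the CDN edge cache

def prefetch(samples, fetch):
    """Fetch the tile at the predicted yaw into the cache,
    hitting the origin only on a cache miss."""
    yaw = predict_yaw(samples)
    tile = int(yaw // 45) % 8   # 8 tile columns of 45° each
    if tile not in cache:
        cache[tile] = fetch(tile)
    return tile

# Viewer turning right at ~10° per sample: warm the tile ahead of the gaze.
tile = prefetch([80, 90], fetch=lambda t: f"tile-{t}")
print(tile)  # 2
```

When the viewer’s head actually arrives at the predicted angle, the high-quality tile is already at the edge instead of a round trip away.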

Moreover, since tiles can be treated like any other simple binary object that needs to be delivered from the origin server to the client via the edge over HTTP or HTTPS, tile delivery can be optimized in several different ways. Hence, other tile-based approaches to VR streaming, such as the one proposed by Fraunhofer HHI, which delivers customized bitstreams for each user on the fly, can also benefit from such prefetching.

Figure 2. An edge server can pre-fetch and cache data so it is available when needed by the viewer. This is key to ensuring smooth playback.

Next-Generation Protocols

A common approach to media delivery optimization is switching to a less chatty protocol to reduce overhead. Some newer protocols that are ripe for VR delivery, such as QUIC and HTTP/2, are already supported out of the box and ready to be experimented with. A presentation at a recent ACM Multimedia conference examined how HTTP/2 server push increases throughput, especially on mobile, high-RTT networks.
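
As one concrete example, an HTTPS edge or origin running nginx can enable HTTP/2 with a single directive. This is a minimal configuration sketch, assuming nginx 1.25.1 or later; the certificate paths are placeholders.

```nginx
server {
    listen 443 ssl;
    http2 on;                                  # requires nginx >= 1.25.1
    ssl_certificate     /etc/ssl/example.crt;  # placeholder path
    ssl_certificate_key /etc/ssl/example.key;  # placeholder path
}
```

With that in place, multiplexing and header compression apply to every tile request on the connection without any change to the client’s manifest URLs.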

Figure 3. HTTP/2 will enable applications to be faster, simpler, and more robust by resolving some of the limitations of HTTP/1.1. The standard will also open many new opportunities to optimize applications and add features, all while improving performance.

Ad Insertion

360° video presents a new avenue for ad-based monetization. For example, dynamic ads can be inserted at different spatial locations in the 360° video space as the video is being played out. If tile-based encoding is used, some of the tiles can be overlaid with ads at run time.

These tile-based ads can be stored in the cloud and delivered to the client with minimal latency. For a much more dynamic experience, edge servers can pull tile-based ads from third-party ad servers at run time based on ad-targeting rules. To make the most of the opportunity presented by advertising in VR, content providers should target individuals based on their demographics rather than on the content they are viewing, which allows more relevant ads to be delivered in each session. This requires an advanced platform that balances connections to an ad decision-making server with content delivery.
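
A minimal sketch of run-time tile overlay with audience-based targeting. The tile ids, the ad store, and the audience segment are all hypothetical names invented for the example.

```python
# Hypothetical sketch: replace selected tiles of a frame with an ad
# chosen for this viewer's audience segment before the tiles are served.

frame_tiles = {i: f"content-{i}" for i in range(8)}   # one frame, 8 tiles
ad_store = {"sports-fan": "ad-sneakers"}              # segment -> creative

def insert_ads(tiles, ad_slots, audience):
    """Return a copy of the frame's tiles with the slots listed in
    ad_slots overlaid by the ad targeted at this audience segment."""
    out = dict(tiles)                 # never mutate the cached frame
    ad = ad_store.get(audience)
    if ad is not None:
        for slot in ad_slots:
            out[slot] = ad
    return out

served = insert_ads(frame_tiles, ad_slots=[3], audience="sports-fan")
print(served[3])  # ad-sneakers
```

Because only the designated slots are swapped, the rest of the frame is served unchanged from cache, keeping the per-session work at the edge small.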

Getting It Right

As VR gathers momentum, it will be increasingly important for content providers to get video delivery right. Early adopters will need to validate the experiences that VR providers are promising in order for the market to grow. That means VR providers need to make good on those promises by delivering high-quality video. Irrespective of the techniques used to transmit 360° video, mature capabilities exist that can be used to achieve a high quality of experience and usher in the new era of 360° content.

Vishal Changrani, Enterprise Architect, Global Consulting Services, Akamai Technologies (left) and Eugene Zhang, Enterprise Architect Director, Global Consulting Services, Akamai Technologies (right).
