Live IP Delivery - Part 2 - OTT Technology

In part-1 of this series, Challenges, we introduced the basic concepts of the technology behind live OTT delivery. In this article, we dig deeper to help broadcast engineers and technical managers understand the intricacies of HTTP and IP technology, so they will be able to design and support OTT systems more effectively.

HTTP (Hypertext Transfer Protocol) has emerged as the dominant application protocol on the internet. It scales effectively and is supported by ISPs and client devices. HTTP messages generally operate on top of TCP/IP (Transmission Control Protocol/Internet Protocol).

IP by itself does not provide any form of receipt validation, so the sender has no way of knowing whether the receiver has received a packet. However, for the vast majority of internet traffic, TCP is the Transport layer and IP is the Network layer. TCP provides reliable transport and maintains “state” in the connection, enabling retransmission if packets are lost or corrupted during transfer.

Software Ports

TCP initiates a virtual connection between the playback device and server and identifies the type of service required using a port number. IANA (Internet Assigned Numbers Authority) governs the assignment of port numbers to specific services. Ports are a logical construct and are used by the IP-stack software running on a server to efficiently determine the service type carried within the IP datagram.

To maintain compatibility, IANA defines port numbers for many different services. For example, HTTP uses port 80, Network Time Protocol uses port 123, and SNMP uses port 161. A server receiving IP traffic reads the destination port number in the TCP or UDP header and passes the payload to the matching service handler without having to decode the entire payload. This makes server handling of IP traffic much more efficient, as the actual HTTP message is only decoded once, in the software service handler.
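The port-based hand-off described above can be sketched in a few lines of Python. This is only an illustration: the handler functions and the `dispatch` helper are hypothetical names, and a real IP stack does this work inside the operating system rather than in application code.

```python
from typing import Callable, Dict

# Hypothetical service handlers; in practice these would be separate
# server processes listening on their own ports.
def handle_http(payload: bytes) -> str:
    return "http handler received %d bytes" % len(payload)

def handle_ntp(payload: bytes) -> str:
    return "ntp handler received %d bytes" % len(payload)

# Destination port -> handler, using the well-known ports named in the text.
SERVICE_HANDLERS: Dict[int, Callable[[bytes], str]] = {
    80: handle_http,
    123: handle_ntp,
}

def dispatch(dest_port: int, payload: bytes) -> str:
    """Route a payload by port number alone, without inspecting its contents."""
    handler = SERVICE_HANDLERS.get(dest_port)
    if handler is None:
        return "no service listening on port %d" % dest_port
    return handler(payload)

print(dispatch(80, b"GET /live/index.m3u8 HTTP/1.1\r\n\r\n"))
print(dispatch(9999, b""))
```

The payload is only interpreted once it reaches its handler, which is the efficiency gain the paragraph describes.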

In the case of an OTT delivery, the web-server streaming the video has a software service that listens for HTTP messages and responds with the data requested by the client player device. Although the web-server will keep track of the TCP status, it does not keep track of where in the stream each client player device is. Even in a live OTT delivery, each player device is asynchronous and will request a slightly different part of the stream from the web-server.
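To illustrate why the web-server does not need to track each player's position, here is a minimal sketch of a stateless request handler. The segment store, the URL pattern `/live/segment_<n>.ts`, and the function name are assumptions made for this example and are not taken from any particular streaming server.

```python
from typing import Dict, Tuple

# Hypothetical in-memory store of recent live segments, keyed by sequence number.
SEGMENTS: Dict[int, bytes] = {100: b"seg-100", 101: b"seg-101", 102: b"seg-102"}

def handle_request(path: str) -> Tuple[int, bytes]:
    """Return (HTTP status, body) for a path such as '/live/segment_101.ts'.

    Everything needed to answer is in the request itself, so the server
    keeps no per-client record of playback position.
    """
    name = path.rsplit("/", 1)[-1]              # e.g. 'segment_101.ts'
    try:
        seq = int(name.split("_")[1].split(".")[0])
    except (IndexError, ValueError):
        return 400, b""                         # malformed request
    body = SEGMENTS.get(seq)
    if body is None:
        return 404, b""                         # segment not (or no longer) available
    return 200, body

# Two players at slightly different points in the live stream:
print(handle_request("/live/segment_101.ts"))
print(handle_request("/live/segment_102.ts"))
```

Each player simply asks for the segment it needs next, which is why two viewers of the same live event can be at slightly different points in the stream.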

Diagram 1 – This diagram shows a simple TCP exchange. In part 1, the sender on the left sends a window of datagrams to the receiver on the right, and the receiver responds by sending an “ACK” message (for acknowledge). The sender then transmits the next window of datagrams. Part 2, however, shows a break in transmission. If only part of the message is received, the receiver will time out and then transmit a “NACK” (for not acknowledged) back to the sender, who responds by re-transmitting the same window of datagrams. This demonstrates that the time taken for a transfer without error (T1) is significantly less than for a transfer with an error (T2), due to the timeout and the resend of the window.
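The difference between T1 and T2 can be estimated with a deliberately simplified timing model. The function name, the parameter values, and the assumption that each lost window costs one timeout plus one full resend are illustrative only; real TCP stacks use adaptive timers and more sophisticated recovery.

```python
def transfer_time(windows: int, window_send_s: float, rtt_s: float,
                  timeout_s: float = 0.0, lost_windows: int = 0) -> float:
    """Total time for a simple 'send window, wait for ACK' exchange."""
    # Error-free case: every window is sent once and acknowledged one
    # round trip later.
    clean = windows * (window_send_s + rtt_s)
    # Each lost window adds the timeout wait plus a full resend and its
    # round trip.
    penalty = lost_windows * (timeout_s + window_send_s + rtt_s)
    return clean + penalty

# Assumed figures: 10 ms to send a window, 40 ms round trip, 200 ms timeout.
t1 = transfer_time(windows=2, window_send_s=0.010, rtt_s=0.040)
t2 = transfer_time(windows=2, window_send_s=0.010, rtt_s=0.040,
                   timeout_s=0.200, lost_windows=1)
print("T1 = %.2f s, T2 = %.2f s" % (t1, t2))
```

With these assumed figures a single lost window takes the transfer from roughly 0.10 s to 0.35 s, and the timeout value dominates the difference.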

HTTP over TCP is well suited to delivering a contiguous feed of video, because out-of-order or missing video data presents significant challenges for playback and will degrade the viewer experience. However, there is overhead associated with TCP that can lead to increased network traffic and increased latency between the streaming-server and player device.

Other protocols at the transport layer can eliminate much of the overhead, reducing traffic and latency, but may introduce problems with data integrity or may not be a suitable protocol for transiting the public internet.

HTTP is Ubiquitous

HTTP live streaming formats delivered over TCP/IP make up the vast majority of internet video streaming traffic. However, a number of innovations currently in progress may change this, and some proprietary video delivery formats are being replaced.

Other protocols do exist, such as RTMP (Real Time Messaging Protocol) and WebRTC (Web Real-Time Communication). Traditionally, RTMP was used for Flash-based viewing experiences, but its use has declined as delivery networks have increasingly looked to leverage a common infrastructure for all internet traffic. Furthermore, Flash has been deprecated in many viewing environments in recent years.

UDP (User Datagram Protocol) based delivery protocols can be leveraged to eliminate some of the latency that can plague OTT applications, but to date these have not been widely implemented. UDP presents difficult scaling challenges for most traditional broadcast media delivery. For the foreseeable future, HTTP delivery will likely represent the lion's share of live video streaming.

Natural Limit of Services

IP datagrams are presented to a streaming-server through a limited number of physical connections, usually Ethernet. Consequently, each IP packet will be processed in turn and due to the limitations of the hardware, only a certain number of HTTP messages can be processed within a fixed time. Therefore, there is a natural limit on the number of player-devices that a single streaming-server can service.
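A rough upper bound on this limit follows from simple arithmetic on link capacity. The sketch below uses assumed figures (a 10 Gb/s Ethernet connection, 5 Mb/s streams, 80% usable capacity) and ignores CPU, memory, and protocol overhead, so it should be read as an optimistic ceiling rather than a sizing tool.

```python
def max_clients(link_bps: int, stream_bps: int, usable_pct: int = 80) -> int:
    """Upper bound on simultaneous streams a single network link can carry."""
    usable_bps = link_bps * usable_pct // 100   # leave headroom for overhead
    return usable_bps // stream_bps

# Assumed figures: 10 Gb/s link, 5 Mb/s per player, 80% usable capacity.
limit = max_clients(link_bps=10_000_000_000, stream_bps=5_000_000)
print(limit)  # 1600
```

Even under these generous assumptions a single server tops out at a few thousand viewers, far short of a national live audience.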

One solution is to increase the network speed into the streaming-server and the hardware resources available such as CPU cores, memory and disk space. But at some point, this will also reach a natural limit and is simply not scalable.

Another option is to load-balance the HTTP traffic. A load-balancing device directs HTTP streams to a series of streaming-servers, all containing the same content, but giving the impression to the player that it is only communicating with one server. Although this method makes the servers scalable, it is still limited by the network capacity going into the datacenter servicing the OTT streaming service.
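As an illustration of the idea, the sketch below hands each new request to the next server in turn (round-robin). The class name and server names are hypothetical, and production load balancers typically also weigh server health and current load.

```python
import itertools
from typing import Iterable

class RoundRobinBalancer:
    """Hand each incoming request to the next streaming-server in turn."""

    def __init__(self, servers: Iterable[str]) -> None:
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)

# Hypothetical pool of identical streaming-servers behind one public address.
balancer = RoundRobinBalancer(["stream-a", "stream-b", "stream-c"])
print([balancer.pick() for _ in range(4)])
# ['stream-a', 'stream-b', 'stream-c', 'stream-a']
```

The player only ever sees the balancer's address, which is what gives the impression of a single server.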

Distributed Servers

A third option is to distribute servers containing the same media closer to the location of the player devices. This has two advantages: firstly, the model scales by adding servers wherever the audience grows, and secondly, latencies associated with TCP distribution are greatly reduced.

For on-demand streaming the distributed server model works well as movies can be sent to the distribution servers ahead of publication. But for OTT live delivery, the streaming servers act as relays and will buffer the live stream from the central playout server adding a layer of latency.

This method of distribution, which caches content and fulfills requests closer to the client, is exactly what content delivery networks (CDNs) provide, with extensive edge cache facilities designed to handle internet-scale HTTP delivery.
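The effect of an edge cache on origin load can be shown with a minimal sketch. The `EdgeCache` class and the fake origin function are hypothetical, and a real CDN cache would also handle expiry times, eviction, and concurrent requests.

```python
from typing import Callable, Dict

class EdgeCache:
    """Serve repeat requests locally and contact the origin only on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0

    def get(self, url: str) -> bytes:
        if url not in self._store:
            # Cache miss: one trip back to the origin server.
            self.origin_requests += 1
            self._store[url] = self._fetch(url)
        return self._store[url]

def fake_origin(url: str) -> bytes:
    # Stand-in for an HTTP request to the central streaming-server.
    return ("data for " + url).encode()

edge = EdgeCache(fake_origin)
for _ in range(1000):                      # 1000 players ask for the same segment
    edge.get("/live/segment_101.ts")
print(edge.origin_requests)                # 1
```

A thousand nearby players cost the origin a single request for that segment, which is the source of both the scalability and the cost savings.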

Leveraging CDNs and a content-optimized HTTP cache strategy is a fundamental component of delivering internet-scale video to OTT audiences, and is one of the biggest factors affecting streaming costs and the viewer experience.

Diagram 2 – The top diagram demonstrates the traditional “slow” method of media streaming. A centralized server will maintain HTTP/TCP/IP connections with many playback devices throughout the world causing the server and network to soon reach the limits of their capacity. The lower diagram demonstrates a CDN system where edge-servers are distributed throughout the world closer to the playback devices. There is less load on the live-playout-server as fewer devices demand data streams from it, TCP latency is lower as ISP backbone networks can be used, and more edge-servers can be added as audiences grow to achieve greater scalability.

Many modern architectures now include just-in-time packaging (JITP) on a resilient origin streaming server hosting the live content. This is a multi-CDN approach with optimized caching layers that scale on-demand as audience size grows.

Robust monitoring systems placed throughout the delivery chain quickly identify which service providers are responsible should there be a degradation of service. Multi-CDN systems usually rely on multiple service providers collaborating to provide a seamless and enhanced viewing experience for the audience.

Just in Time Optimization

With JITP support, content owners can dynamically structure the package formats that are delivered to the audience, enabling stream optimization, content personalization and protection schemes, and other responsive late-binding workflows to create a more engaging viewer experience.

Hosting the content origin can enable a more cost-efficient way of delivering the same programming through multiple network paths and provides the broadcaster with deep analytical data for content and network performance.

Implementing a multi-CDN strategy ensures that the broadcaster is not reliant on a single delivery path to reach the audience. It gives a broadcaster equipped with the right analytics the ability to dynamically move audience members onto the most cost- and performance-optimized delivery path.
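One simple way to express such a decision is to pick the cheapest CDN that still meets a quality threshold. The sketch below is an assumption about how this could be done, not a description of any vendor's switching logic. The CDN names, costs, and rebuffering figures are invented for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CdnStats:
    name: str
    cost_per_gb: float      # delivery cost, arbitrary currency units
    rebuffer_ratio: float   # fraction of playback time spent rebuffering

def choose_cdn(stats: List[CdnStats], max_rebuffer_ratio: float = 0.01) -> CdnStats:
    """Pick the cheapest CDN that meets the quality threshold.

    If none qualifies, fall back to the best-performing CDN regardless of cost.
    """
    healthy = [s for s in stats if s.rebuffer_ratio <= max_rebuffer_ratio]
    if healthy:
        return min(healthy, key=lambda s: s.cost_per_gb)
    return min(stats, key=lambda s: s.rebuffer_ratio)

# Invented measurements, as might be reported by real-time analytics.
current = [
    CdnStats("cdn-a", cost_per_gb=0.020, rebuffer_ratio=0.004),
    CdnStats("cdn-b", cost_per_gb=0.012, rebuffer_ratio=0.030),
    CdnStats("cdn-c", cost_per_gb=0.015, rebuffer_ratio=0.008),
]
print(choose_cdn(current).name)  # cdn-c
```

Here the cheapest provider is skipped because its rebuffering is above the threshold, showing why the decision depends on live performance data as well as price.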

All these strategies represent a challenge that requires deep reporting and real-time analytics. With that data, the broadcaster can measure how changes in the production and delivery chain impact audience quality of experience, and can immediately optimize delivery to create the most compelling viewing experience possible.

Adaptive Bit Rates

Viewers on the move experience differing network data rates and quality as they transfer between cells. Therefore, most modern streaming platforms deliver adaptive bitrate (ABR) packages of content. HTTP streaming creates short segments of video and sends them to the player device in bursts. If each segment is also transcoded at several different bit rates and resolutions, the player can request whichever version best suits its current conditions.

ABR provides a range of encoding and processing settings to produce packaged content optimized for playback across a wide array of device and network playback conditions. ABR packages are presented to the player using a manifest file that acts as a playlist, telling the player where it can go to retrieve the next segment.
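As an example of what a player does with a manifest, the sketch below parses a small HLS-style multivariant playlist. HLS is used here only as one common manifest format, since the article does not name a specific one. The playlist contents and the `parse_variants` helper are made up for illustration.

```python
import re
from typing import Dict, List, Optional

# A made-up HLS-style manifest listing three variants of the same content.
MANIFEST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
mid/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080
high/playlist.m3u8
"""

def parse_variants(text: str) -> List[Dict[str, object]]:
    """Return one dict per variant: bandwidth (bits/s), height (lines), uri."""
    variants: List[Dict[str, object]] = []
    pending: Optional[Dict[str, object]] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            # The lookbehind avoids matching AVERAGE-BANDWIDTH by mistake.
            bw = re.search(r"(?<![A-Z-])BANDWIDTH=(\d+)", line)
            res = re.search(r"RESOLUTION=(\d+)x(\d+)", line)
            pending = {
                "bandwidth": int(bw.group(1)) if bw else None,
                "height": int(res.group(2)) if res else None,
            }
        elif line and not line.startswith("#") and pending is not None:
            # The first non-comment line after the tag is the variant's URI.
            pending["uri"] = line
            variants.append(pending)
            pending = None
    return variants

variants = parse_variants(MANIFEST)
for v in variants:
    print(v["bandwidth"], v["height"], v["uri"])
```

Each entry tells the player where to fetch a given quality level, which is the playlist role the paragraph describes.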

Algorithms running in players detect when their buffers are running low, so the player can switch to an adjacent bitrate variant. Other approaches include the ability to offer only those variants that suit a given device or network condition. For example, if a mobile device is on a cellular network, the player may omit the highest resolution variants, as they would not improve the viewing experience and could jeopardize smooth playback.
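Both behaviors can be sketched with a small buffer-based rule. The thresholds, bitrate ladder, and function names below are assumptions chosen for illustration, and real players usually combine buffer level with throughput estimates.

```python
from typing import List

def cap_ladder(bitrates: List[int], max_bps: int) -> List[int]:
    """Drop variants above a ceiling, e.g. on a cellular connection.

    Always keeps at least the lowest variant so playback remains possible.
    """
    ladder = sorted(bitrates)
    allowed = [b for b in ladder if b <= max_bps]
    return allowed or ladder[:1]

def next_variant(bitrates: List[int], current: int, buffer_s: float,
                 low_water_s: float = 5.0, high_water_s: float = 20.0) -> int:
    """Step one rung down when the buffer is low, one rung up when it is healthy."""
    ladder = sorted(bitrates)
    i = ladder.index(current)
    if buffer_s < low_water_s and i > 0:
        return ladder[i - 1]
    if buffer_s > high_water_s and i < len(ladder) - 1:
        return ladder[i + 1]
    return current

ladder = [800_000, 2_500_000, 6_000_000]      # assumed bitrate ladder, bits/s
print(next_variant(ladder, 2_500_000, buffer_s=2.0))    # 800000
print(next_variant(ladder, 2_500_000, buffer_s=30.0))   # 6000000
print(cap_ladder(ladder, max_bps=3_000_000))            # [800000, 2500000]
```

Moving only one rung at a time keeps quality changes gradual, while the cap stops the player from ever attempting a variant the device or network is unlikely to sustain.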

HTTP plays an important role in the distribution of OTT over the internet. Even though there is currently much research into alternatives, HTTP is going to be the predominant distribution method for the foreseeable future. Multi-CDN builds on HTTP to provide scalability and help broadcasters deliver a better viewer experience. In the next article, we will look at monitoring and why it’s critical to OTT and multi-CDN systems.
