Why We Need OTT Monitoring – Part 2
In the previous article in this series, “Understanding OTT Systems”, we looked at the fundamental differences between unidirectional broadcast and OTT delivery. We investigated the complexity of OTT delivery and gained an insight into the silo culture that develops when multiple service providers are involved. In this article we analyze a typical OTT delivery channel to understand why we need monitoring.
Delivering even a single OTT unicast program to a mobile device or player requires a bi-directional data path between the broadcaster and the viewer. This is one of the reasons OTT presents many challenges for broadcasters, especially when multiple service providers are involved.
There are five distinct technical areas in the OTT delivery chain: the content provider or broadcaster, content preparation and origin services (also known as the headend), private or third-party CDNs, Mobile/Wi-Fi/Broadband access points, and the user device (laptop, mobile phone, notepad).
The broadcast is made available by the content provider, but it is the user that initiates the transmission and delivery.
OTT Format
Content leaving a broadcaster is first encoded and transcoded into multiple streams required by DASH (Dynamic Adaptive Streaming over HTTP) and then segmented and packaged. The segmentation takes a continuous stream of video and audio, and partitions it into small HTTP-based files with each segment representing a short interval of playback. The mobile or playback device assembles the segments to reconstruct the video and audio.
Packaging includes creating the manifest files that describe the streams and ensuring the output complies with the DASH standard. Other protocols such as HLS (HTTP Live Streaming) operate in a similar manner, creating their own segment and description files.
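The segmentation and packaging steps described above can be sketched in a few lines of Python. This is an illustration only: the file names, the 4-second segment length, the bitrates, and the dictionary layout are all assumptions, and the "manifest" here is a simplified stand-in rather than a real DASH or HLS description file.

```python
import math

def segment_timeline(duration_s, segment_s):
    """Split a program of duration_s seconds into fixed-length segments.

    Returns a list of (filename, start_time, length) tuples.
    The file naming scheme is hypothetical.
    """
    count = math.ceil(duration_s / segment_s)
    timeline = []
    for i in range(count):
        start = i * segment_s
        # The final segment may be shorter than the others.
        length = min(segment_s, duration_s - start)
        timeline.append((f"segment_{i:05d}.m4s", start, length))
    return timeline

def build_manifest(timeline, bitrates_kbps):
    """List every rendition and its segment files in a plain dictionary.

    A simplified stand-in for a manifest, not the DASH MPD schema.
    """
    return {
        "renditions": [
            {
                "bitrate_kbps": rate,
                "segments": [f"{rate}k/{name}" for name, _, _ in timeline],
            }
            for rate in bitrates_kbps
        ]
    }

# Assumed values: a 10-second clip, 4-second segments, two bitrates.
timeline = segment_timeline(duration_s=10.0, segment_s=4.0)
manifest = build_manifest(timeline, bitrates_kbps=[800, 2400])
```

The point of the sketch is that a continuous program becomes a set of ordinary files plus a description of those files, which is all the playback device needs to reassemble the stream.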
OTT Segments Filestreams
The concept of segmentation is important to broadcasters as it demonstrates a break away from traditional RF transmission and cable distribution infrastructures. These rely on a unidirectional, continuous data-stream sent to the user. Segmentation, as the name suggests, separates the continuous program data-stream from the broadcaster into individual files that can be distributed. These are transferred not only on an IP network, but more importantly, on an HTTP compliant network.
HTTP compliant networks, such as the internet, allow users to watch OTT programs on their playback and mobile devices.
After segmentation and packaging, the resulting files, or chunks, leave the headend by one of two routes. If only one CDN provider is needed, the headend can send the files directly to the CDN, which in turn distributes and stores them on the edge servers as required. If multiple CDNs are used, the chunks can instead be stored in an origin server within the headend.
Playback Requests Data
Again, this is where OTT differs from traditional broadcasting: it is the responsibility of the playback or mobile device to request the segments and manifest files from the origin server or CDN edge server.
Figure 1 – A playback chain consists of five distinct points of demarcation, but even these can be further divided by multiple vendors.
A playback or mobile device works on the request-and-supply model. When a viewer selects a media streaming website, the media player within the viewer’s device tries to maintain an optimum level of buffer capacity. If the buffer is too low, the playback engine runs out of data and the pictures and sound will break up. If the buffer capacity is too high, then unnecessary latency occurs.
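The buffer trade-off can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not how any particular player works: the function name, the 4-second segments, the 8-second target, and the fetch times are all invented for the example, and the initial wait for the first segment is counted as stall time.

```python
def simulate_buffer(segment_s, fetch_times_s, target_s):
    """Step through playback one segment download at a time.

    segment_s:     seconds of media carried by each segment
    fetch_times_s: how long each segment took to download (assumed values)
    target_s:      buffer level the player does not exceed
    Returns (buffer level after each download, total seconds stalled).
    """
    buffer_s = 0.0
    stall_s = 0.0
    levels = []
    for fetch_s in fetch_times_s:
        # Playback keeps consuming the buffer while the next segment downloads.
        if fetch_s > buffer_s:
            # The buffer ran dry before the segment arrived: pictures break up.
            stall_s += fetch_s - buffer_s
            buffer_s = 0.0
        else:
            buffer_s -= fetch_s
        buffer_s += segment_s
        # Above the target the player delays its next request, so the
        # level is capped rather than growing and adding latency.
        buffer_s = min(buffer_s, target_s)
        levels.append(buffer_s)
    return levels, stall_s

# Assumed network behavior: three fast fetches, two slow ones, then recovery.
levels, stall_s = simulate_buffer(
    segment_s=4.0,
    fetch_times_s=[1.0, 1.0, 1.0, 6.0, 7.0, 1.0],
    target_s=8.0,
)
```

Raising `target_s` in this model absorbs longer slow periods without stalling, at the cost of the viewer being further behind the live event.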
No HTTP Multicast
Although multicast transport protocol research is now a hot topic, there is currently no provision for multicasting in HTTP networks such as the internet. HTTP uses TCP, and TCP establishes a logical software connection between the playback or mobile device and the streaming server. There is in effect a direct connection between them, resulting in a one-to-one mapping. This restricts the ability of the internet to provide a one-to-many mapping as in traditional RF and cable transmissions.
Also, web-servers are stateless. That is, when a browser requests a web page, the request ultimately includes the IP address of the web-server and the IP address of the player or mobile device so that a connection can be established. When the web-server receives the request for data, it responds with the two IP addresses reversed. Once the data has been sent, the web-server forgets that the requesting player or mobile device exists. In other words, it does not hold any information about which files, or how many, have been sent to the browser, hence it is stateless.
Web video and audio streaming operates in the same manner. It is the responsibility of the playback or mobile device to continually request the next segmented package of files from the server nominated by the headend.
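The division of responsibility between a stateless server and the requesting device can be sketched as follows. The handler and the segment names are hypothetical, and the network is replaced by a plain function call; the only point being made is that the server's reply depends on the request alone, while the player keeps track of which segment comes next.

```python
# Hypothetical store of packaged segments held by the server.
SEGMENTS = {f"segment_{i}.m4s": f"media-{i}".encode() for i in range(3)}

def handle_request(path):
    """Stateless handler: the response depends only on the request itself.

    Nothing is recorded about who asked or what they asked for previously.
    """
    if path in SEGMENTS:
        return 200, SEGMENTS[path]
    return 404, b""

def play(first_index, count):
    """Client-side loop: the player, not the server, decides what comes next."""
    received = []
    for i in range(first_index, first_index + count):
        status, body = handle_request(f"segment_{i}.m4s")
        if status != 200:
            break  # no further segments are available yet
        received.append(body)
    return received

received = play(first_index=0, count=5)
```

Because `handle_request` holds no per-viewer history, any server holding the same files could answer any request in the sequence.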
Mobile Device Initiates Streaming
Understanding the demand-and-supply nature of playback and mobile devices is critical to understanding the reason for using CDNs.
As an example, consider a single stream distribution network without a CDN solution; a viewer wants to watch a baseball game on their mobile phone. They log onto the website through Wi-Fi via an ISP, and the playback engine in their mobile phone starts requesting segmentation files from the headend origin server. The origin server is constantly being fed with new segmentation files from the studio output, encoder, and packaging engine.
The origin server has no knowledge of the existence of the viewer or their mobile device so does not actively send them any video or audio. Without any viewers the origin server’s storage would just fill with segmentation files (after some time, they would be purged by a housekeeping program) and they wouldn’t be sent anywhere. The origin server relies on the mobile device actively requesting the next segmentation file in the program sequence to fill the mobile phone’s buffer and play back the game.
Figure 2 – A load balancer distributes segmentation requests so that load from many devices can be spread across multiple servers. As the servers are “stateless” they only need to respond to each request one at a time and do not need to hold any history of the previous requests.
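The arrangement in Figure 2 can be approximated with a round-robin scheduler. Round-robin is only one possible policy and is an assumption here, as are the server and device names; the sketch shows that stateless servers let any request go to any server.

```python
from collections import Counter
from itertools import cycle

def round_robin(requests, servers):
    """Assign each incoming request to the next server in turn."""
    next_server = cycle(servers)
    return [(request, next(next_server)) for request in requests]

# Assumed traffic: three devices each requesting the same two segments.
requests = [
    f"device{d}/segment_{s}.m4s" for s in range(2) for d in range(3)
]
assignments = round_robin(requests, ["origin-a", "origin-b", "origin-c"])

# Count how many requests each server ended up handling.
load = Counter(server for _, server in assignments)
```

No server needs to know which device it served previously, so consecutive segments for the same device can safely come from different servers.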
Even in our simple example, it’s possible that the ISP may buffer the files in a proxy server somewhere within their own network. Neither the broadcaster nor the viewer would have any knowledge of this. Such a configuration should be transparent. However, it could increase latency or be another point of failure.
Assuming the network is behaving properly, the origin server would happily service segmentation file requests from the single mobile device. However, if multiple users also started to watch the game and request segmentation files, a tipping point would soon occur where the origin server does not have the internal resources to service all the file requests. It would slow and ultimately become unresponsive. This results in users receiving a poor quality of experience; that is, they would no longer be able to see their game.
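A rough calculation shows how quickly that tipping point arrives. All of the figures below (segment length, stream bitrate, link capacity, audience size) are assumptions chosen for illustration, and the model ignores manifest requests, protocol overhead, and adaptive bitrate switching.

```python
def origin_load(viewers, segment_s, bitrate_mbps):
    """Estimate the load on a single origin server with no CDN.

    Each viewer requests one segment every segment_s seconds and pulls
    bitrate_mbps of media, so the load grows linearly with the audience.
    Returns (requests per second, egress in Gbit/s).
    """
    requests_per_s = viewers / segment_s
    egress_gbps = viewers * bitrate_mbps / 1000
    return requests_per_s, egress_gbps

def max_viewers(link_gbps, bitrate_mbps):
    """Largest audience a link of link_gbps can carry at bitrate_mbps each."""
    return int(link_gbps * 1000 // bitrate_mbps)

# Assumed scenario: 10,000 viewers, 4-second segments, 5 Mbit/s streams.
requests_per_s, egress_gbps = origin_load(
    viewers=10_000, segment_s=4, bitrate_mbps=5
)
# Assumed origin uplink of 10 Gbit/s.
capacity = max_viewers(link_gbps=10, bitrate_mbps=5)
```

Under these assumptions the audience needs five times more bandwidth than the uplink provides, before considering the server's processing or disk limits.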
CDN Alleviates Network Congestion
One solution is to increase the number of origin servers and add load-balancers to them so that many playback and mobile devices can request files simultaneously. But this makes inefficient use of the network bandwidth. If many of the users reside in one geographic location, then one data stream per user exists, resulting in inefficient use of the network and potentially causing severe congestion, especially as the data carried over the network for each mobile device will be virtually identical.
The CDN alleviates the network congestion issue. Multiple streaming servers are strategically placed in geographical areas within the vicinity of the users. These servers are called edge servers. This server distribution model decentralizes computer processing, significantly reduces network traffic as there are fewer devices requesting files from the headend origin server and reduces the load on the origin server at the headend.
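The effect of an edge server on origin load can be sketched with a simple cache. The `EdgeServer` class, the callback, and the audience of 1,000 viewers are hypothetical; real CDNs add expiry, eviction, and tiered caching that this example omits.

```python
class EdgeServer:
    """Minimal edge cache: fetch each file from the origin at most once."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._cache = {}

    def get(self, path):
        # Only a cache miss travels back across the network to the origin.
        if path not in self._cache:
            self._cache[path] = self._fetch(path)
        return self._cache[path]

origin_hits = []

def origin(path):
    """Stand-in for the headend origin server; records every request it sees."""
    origin_hits.append(path)
    return f"data for {path}".encode()

edge = EdgeServer(origin)

# Assumed audience: 1,000 nearby viewers each requesting the same 3 segments.
for viewer in range(1000):
    for seg in range(3):
        edge.get(f"segment_{seg}.m4s")
```

In this sketch 3,000 viewer requests result in only three requests reaching the origin, because the data for every viewer is identical.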
If only one CDN provider is used, then the broadcaster may dispense with their own origin server and send the segmentation files directly from the broadcaster’s headend to the edge servers within the CDN network.
International Events
If the broadcaster is distributing an international event, such as the Olympic Games, then many users in countries throughout the world will want to view the program. It’s unlikely that one CDN provider will service all countries, and each country, or group of countries, will have its own preferred CDN provider. With this model, the broadcaster will go back to providing an origin server, or group of origin servers, to meet the demand from the edge servers of each country’s CDN.
What started as a simple OTT example has suddenly become incredibly complex, with a multitude of service providers throughout the world. As the network grows, the broadcaster has less control and influence. But they are still responsible for making sure the viewer has an outstanding quality of experience. If they do not, then the viewer may switch channels, or take to social media to vent their anger, thus potentially damaging the broadcaster’s brand and revenue.
Viewers Demand Reliability
As well as understanding a highly complex network, the broadcaster now has an indeterminate number of dynamic challenges to deal with, and potentially many service providers blaming each other for problems. The geographical location of a fault heavily influences the number of users affected. The viewer doesn’t care whose fault it is; they just want to watch their program.
ISP, CDN, headend, intermediate streaming servers, and edge servers all interact with each other dynamically and sometimes unpredictably. The law of unintended consequences might imply a fault is occurring at a location unconnected with the point in the network where the problem is manifesting itself. A fault might even randomly appear to move around a network between different service providers.
Consequently, as OTT has increased in popularity, its complexity has risen exponentially. And understanding what and where to monitor is critical for a broadcaster to maintain their brand and revenues.
In the next, and final article in this series, we look at how the law of unintended consequences can seriously impact a broadcast, and where we need to place our monitoring, and why.