Future Technologies: Autoscaling Infrastructures

We continue our series considering technologies of the near future and how they might transform how we think about broadcast, with a discussion of the concepts, possibilities and constraints of autoscaling IP-based infrastructures.

Broadcast facilities have, until recently, relied on custom hardware that was dedicated to a specific job. But as more broadcasters adopt IP, the flexibility and scalability that IP infrastructures offer bring new and interesting challenges that we’ve not necessarily been aware of before.

One of the overwhelming advantages of IP infrastructures often cited by pundits is the ability of the system to scale. But what do we really mean by scaling? In an off-prem public cloud-type infrastructure we have so much resource available to us that it is unlikely that we will ever use it all, or even get close to using it. Instead, the broadcast systems are governed by their budgets. Even when employing a COTS on-prem datacenter, the broadcaster has only a finite amount of hardware available.

Scaling Limitations

When we speak of scaling, even with the marketing hype aside, we are still limited by the physical resource available to us. The difference with virtualized and cloud computing is that we can dynamically allocate the hardware as per the demands of the service, and we assume that we don’t need all the hardware all the time. In other words, we must apply some statistical analysis to the infrastructure design.
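As a rough illustration of the kind of statistical analysis meant here, the following Python sketch estimates how often demand would exceed the hardware that has been procured. It assumes, purely for illustration, that each service is independently active with the same probability, which real workflows will not satisfy exactly. The function name and all the figures are hypothetical.

```python
from math import comb


def prob_demand_exceeds(capacity: int, services: int, p_active: float) -> float:
    """Probability that more than `capacity` of `services` are active at once.

    Assumption (not from the article): each service is active independently
    with probability `p_active`, so the number active is binomially distributed.
    """
    return sum(
        comb(services, k) * p_active**k * (1 - p_active) ** (services - k)
        for k in range(capacity + 1, services + 1)
    )


if __name__ == "__main__":
    # Hypothetical figures: 20 services, each needing one server 30% of the time.
    for servers in (6, 8, 10, 12):
        risk = prob_demand_exceeds(servers, 20, 0.3)
        print(f"{servers} servers: shortfall probability {risk:.4f}")
```

Procuring for the peak (20 servers) removes the risk entirely, while procuring for the average (6 servers) leaves a substantial chance of a shortfall. The design decision is where between the two the budget allows the broadcaster to sit.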

We also take advantage of the fact that modern COTS infrastructures can now lend themselves to a multitude of functions and tasks that were not possible even five years ago, both because of the limitation of the technology and the monetization methods vendors were still coming to terms with.

Monetization for vendors has been difficult to resolve, partly because of the technical aspects of moving a synchronous system to an asynchronous IP infrastructure. Traditionally, vendors developed hardware systems that were relatively easy to monetize: an amount of money was charged for each box, which in turn delivered a specific number of features. Some vendors experimented with software licensing, but in the main, broadcasters paid on a per-box model. This has all changed with IP, as software licensing is key to its success. In fact, it epitomizes the meaning of flexibility, especially now that vendors have started to provide functionality on a cost per use per hour basis.

Charging Models

So, in looking at how to scale systems we must balance three factors: the business needs, the available hardware resource, and budgets. At first sight nothing has changed, as this is exactly what broadcasters have been doing with traditional designs. But the key difference with scalability in the new IP COTS infrastructures is that we have a consistent layer of hardware that can be divided between multiple functions and changed on-the-fly through advanced software licensing methods. Vendors can charge by the second, hour, day or year, and they don’t even have to be limited by time. There are business models that charge on the amount of data processed, thus adding a new dynamic to costing systems.
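To make the comparison between charging models concrete, here is a minimal Python sketch that prices the same workload three ways: per hour of use, per terabyte processed, and as a flat annual license. The rates, the model names and the `cheapest_model` helper are all hypothetical and do not reflect any particular vendor's pricing.

```python
def cheapest_model(
    hours_used: float,
    terabytes: float,
    rate_per_hour: float,
    rate_per_tb: float,
    annual_fee: float,
) -> tuple[str, dict[str, float]]:
    """Return the cheapest charging model and the cost under each model.

    All rates are illustrative assumptions supplied by the caller.
    """
    costs = {
        "per_hour": hours_used * rate_per_hour,  # time-based charging
        "per_data": terabytes * rate_per_tb,     # volume-based charging
        "annual": annual_fee,                    # flat license, usage-independent
    }
    return min(costs, key=costs.get), costs


if __name__ == "__main__":
    # Hypothetical popup channel: 100 hours on air, 50 TB processed in the year.
    best, costs = cheapest_model(100, 50, 2.0, 5.0, 1000.0)
    print(best, costs)
```

A lightly used service tends to favor the usage-based models, while a service that runs around the clock may be cheaper under the flat fee. That is why the broadcaster's own accounting needs to track usage as well as spend.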

In essence, when designing autoscaling infrastructures, broadcasters must not only contend with designing reliable workflows, but must also understand how vendors are charging for use of their products and services, and build their own systems to accommodate this. Broadcasters can certainly pay for software services on a per year or in perpetuity basis, but the not-so-hidden costs of software support fees are often significant. There are certainly times when this type of model works well for the broadcaster, especially when the workflow is relatively static, but the whole point of scaling and autoscaling infrastructures is that they respond to the changing demands of the workflow, which in turn should have a positive influence on the costs incurred.

Expanding Infrastructure

Although it may look like we’ve got caught in a trap because broadcasters still need to understand how much resource they need to procure for their COTS datacenters, a major difference from traditional SDI type workflows is that hardware can be easily and predictably added to an existing on-prem infrastructure. It’s much easier to add multiple servers to a rack than to add broadcast specific equipment with all its associated coaxial SDI and multicore AES cabling. The general thinking is that in a COTS environment, the rack should have sufficient power, cooling, and networking for expansion. These are easy words to say, but they need careful thought when designing the datacenter. We still need to think about making the system future-proof, except that in the COTS world the future isn’t ten years away, as it is with broadcast infrastructures, so we can relax our future-proof constraints considerably and amortize the cost of the equipment over a much shorter time.

As well as building reliable workflows in IP and building accounting systems that interface to the vendors’ costing platforms, the broadcaster must also be able to manage their own IP infrastructure. We may speak with an almost rose-tinted-glasses euphoria about the scalability of COTS hardware systems, but we need to understand deeply how the hardware is being allocated, the available compute capacity, and the network capacity available to transport the media signals. It’s all well and good saying “I have four spare servers that I can run my transcoders on”, but broadcasters must understand how the media signals are getting to and from these devices. Just because there is a network cable going to a server doesn’t mean there is sufficient network capacity to transport the signal to and from the device.
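The point about spare servers can be expressed as a simple admission check. The Python sketch below only approves a new workload on a server if the flows already on its link leave enough headroom. The 80% utilization ceiling, the function names and the flow rates are assumptions chosen for illustration, not recommended values.

```python
from typing import Sequence


def link_headroom_gbps(
    link_capacity_gbps: float,
    existing_flows_gbps: Sequence[float],
    utilization_ceiling: float = 0.8,  # assumed safety margin, not a standard
) -> float:
    """Capacity still available on a link once existing flows are accounted for."""
    usable = link_capacity_gbps * utilization_ceiling
    return usable - sum(existing_flows_gbps)


def can_place(
    new_flows_gbps: Sequence[float],
    link_capacity_gbps: float,
    existing_flows_gbps: Sequence[float],
    utilization_ceiling: float = 0.8,
) -> bool:
    """True if the new flows fit within the remaining headroom of the link."""
    headroom = link_headroom_gbps(
        link_capacity_gbps, existing_flows_gbps, utilization_ceiling
    )
    return sum(new_flows_gbps) <= headroom


if __name__ == "__main__":
    # Hypothetical server on a 10 Gbit/s link already carrying two 3 Gbit/s flows.
    print(can_place([1.5], 10.0, [3.0, 3.0]))  # a single 1.5 Gbit/s input
    print(can_place([3.0], 10.0, [3.0, 3.0]))  # a third 3 Gbit/s flow
```

A real check would need to consider every hop between source and destination rather than just the server's own link, which is why a view of the whole network matters.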

Measuring Uncertainty

The next development for the broadcaster then is to understand not only how their COTS hardware is physically laid out, but how the network operates. COTS IP networks are asynchronous systems that leverage the statistical nature of demand to provide the efficiencies they deliver. This is the principal reason we talk about dynamic workflows. A COTS infrastructure servicing a static workflow will operate effectively and reliably, but whether it’s efficient is a moot point, because we must define what we mean by efficient, and there is no single answer to this. It depends on the specific workflow in question. However, a dynamic workflow where popup channels are a regular occurrence, or a studio that needs different types of graphics creation depending on the production, are perfect examples of how broadcasters can leverage the efficiency and power of asynchronous COTS IP infrastructures. But we must be careful, as COTS IP infrastructures also have limits, and these have a nasty habit of slapping us in the face when we’re least expecting it. This is due to the statistical nature of how asynchronous systems operate.

One method of overcoming this is to have a deeply integrated management, logging and monitoring system. In asynchronous infrastructures, we cannot assume that a network cable that has a link capacity of 10Gbit/s will deliver this capacity all the time to a specific media flow. For example, if a link in another part of the network fails then its flows may be routed through the link in question to maintain resilience, which reduces the capacity left for the original flows. The same is true for network switches, servers, storage devices, and any other items that make up the COTS IP infrastructure. With these systems the world is no longer static and instead follows the laws of probability with much greater enthusiasm. We shouldn’t be worried about uncertainty, as even static systems comply with the laws of statistics and the central limit theorem; it’s just that their intrinsic averages and variances are much more constrained. A 1.485Gb/s SDI link does not have a datarate of exactly 1.485Gb/s. It has a small variance of plus or minus a few bits per second, but we don’t notice this deviation as the receiver is synchronized to the transmitter and so they move in unison. The same is not true for asynchronous systems. One-hundred percent certainty does not exist. Just look at the Gaussian distribution function, where the tails never reach the axis: they approach it asymptotically as we move further from the mean in either direction, but the curve never meets it.
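The Gaussian argument can be turned into a number. The Python sketch below estimates the probability that the offered load on a link exceeds its capacity, under the simplifying assumption that the aggregate load is normally distributed. Real IP traffic is often burstier than this, so the result is best read as an optimistic guide. The function name and the figures are hypothetical.

```python
from math import erfc, sqrt


def prob_load_exceeds(
    capacity_gbps: float, mean_gbps: float, std_gbps: float
) -> float:
    """Upper-tail probability P(load > capacity) for a Gaussian load model.

    Assumption (not from the article): the aggregate load on the link is
    normally distributed with the given mean and standard deviation.
    """
    z = (capacity_gbps - mean_gbps) / std_gbps
    return 0.5 * erfc(z / sqrt(2))


if __name__ == "__main__":
    # Hypothetical 10 Gbit/s link: normal operation versus after a failover
    # elsewhere in the network has rerouted extra flows onto it.
    normal = prob_load_exceeds(10.0, mean_gbps=6.0, std_gbps=1.0)
    rerouted = prob_load_exceeds(10.0, mean_gbps=8.0, std_gbps=1.0)
    print(f"normal: {normal:.2e}  after reroute: {rerouted:.2e}")
```

However much headroom is provisioned, the returned probability becomes very small but never reaches zero, which is the tail behavior described above. Rerouted flows raise the mean and move the link markedly closer to its limit.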

Mesh Networks

The spine-leaf type architecture is highly predictable and is analogous to the traditional SDI type infrastructures. But this type of topology is not particularly scalable, as we need to understand all the potential workflows that the broadcaster might need in the future. Again, this works and has some scalability, but the jewel in the crown of truly scalable systems is the mesh network. With these, we can keep adding resource as needed by just bolting it into the system where it’s needed. A sort of edge-type system on steroids. In contrast, one fundamental issue with the spine-leaf model is that the spine switches are limiting factors in terms of capacity. When we reach the limit of the spine switch, a new one must be procured, just like we did with centralized SDI routers. Yes, you can expand them, but only within a specified limit.
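The spine limit can be sketched with some simple arithmetic. The Python example below assumes a two-tier fabric in which every leaf has one uplink to every spine, so the number of leaves cannot exceed the port count of a spine switch, and the ratio of downlink to uplink bandwidth on each leaf shows how heavily the spine is being relied upon. The port counts and speeds are hypothetical.

```python
def max_leaves(spine_ports: int) -> int:
    """Leaf limit when each leaf uses exactly one port on every spine (assumed)."""
    return spine_ports


def oversubscription(
    downlink_ports: int,
    downlink_gbps: float,
    uplink_ports: int,
    uplink_gbps: float,
) -> float:
    """Ratio of server-facing to spine-facing bandwidth on a single leaf.

    A value above 1.0 means the leaf can accept more traffic from its
    servers than it can forward to the spine.
    """
    return (downlink_ports * downlink_gbps) / (uplink_ports * uplink_gbps)


if __name__ == "__main__":
    # Hypothetical leaf: 48 x 10 Gbit/s to servers, 4 x 100 Gbit/s to the spine.
    print(max_leaves(32))
    print(oversubscription(48, 10.0, 4, 100.0))
```

Once every spine port is in use, adding another leaf means replacing or adding spine switches, which is the expansion ceiling described above.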

Before we press the auto-scale button on our infrastructure we must understand what is happening in the whole infrastructure, not just in specific links, as is generally the case with traditional SDI infrastructures. This is especially true if we move to mesh networks, which are the way of the future for building truly scalable systems. And then we’ll have to start measuring uncertainty in the infrastructure to be certain of success.
