Future Technologies: The Future Is Distributed

We continue our series considering technologies of the near future and how they might transform how we think about broadcast, looking this time at how distributed processing, achieved by combining mesh network topologies with microservices, may bring significant improvements in scalability, flexibility and resilience.

Monolithic software architectures have one fundamental flaw: they’re very difficult, if not impossible, to scale reliably. Distributed processing, which is growing in prominence, addresses this and greatly improves scalability, flexibility and resilience.

Although we may well make grand sweeping generalizations like “it’s difficult to scale monolithic designs”, it’s worth reviewing why this is the case. As an example, consider a video standards converter that converts between the US’s 29.97fps and Europe’s 25fps; this can be achieved using a commercial off-the-shelf (COTS) server. In the monolithic sense, the standards converter software would run on that single server. Using an API, the standards converter could be executed, and the output file would appear in the necessary folder.
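
To make the monolithic model concrete, the sketch below shows how such a converter might be driven over a hypothetical REST API; the endpoint, field names and file paths are assumptions for illustration rather than any real product interface.

```python
# A minimal sketch of driving a monolithic standards converter over a
# hypothetical REST API. The endpoint, field names and file paths are
# illustrative assumptions, not a real product interface.
import requests

job = {
    "input": "/media/ingest/line2/clip_2997.mxf",   # 29.97fps source
    "output_dir": "/media/converted/",              # destination folder
    "target_rate": "25",                            # convert to 25fps
}

# Submit the conversion job to the single server running the software.
resp = requests.post("http://converter-01.example/api/v1/jobs", json=job, timeout=10)
resp.raise_for_status()
print("Job accepted:", resp.json().get("job_id"))
```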

Monolithic Limitations

This is all well and good when processing a small number of files, but we soon reach the limits of the COTS resource, and it can no longer process the input files within a reasonable amount of time. The obvious solution is to increase the server’s own resource by adding more memory, disk space and even processor cores. But no matter how much resource we add to the COTS server, at some point we will exhaust its capacity. We may even reach the limits of the operating system.

Using the monolithic, single-server architecture, we also increase the number of single points of failure, as the standards converter runs on one server and is therefore susceptible to hardware failure. One solution is to provide main and backup workflows that mirror each other to deliver the necessary resilience, which has been the traditional mode of operation for broadcasters. This methodology has stood the test of time and has many merits, but there is now a much better way of operating, and that is through mesh networks, which lead to distributed processing.

Mesh Network Flexibility

A mesh network has no central connection point and each node within the network must connect to at least one other node. In networking terms, a node can be thought of as either a server with multiple NICs, or a network switch. Using multiple links between the nodes improves resilience massively as a single point of failure has only a limited localized effect on the whole network.
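
As an illustration of why a mesh tolerates localized failure, the sketch below models a small mesh as an adjacency map and checks that removing any single node still leaves the remaining nodes connected; the node names and links are assumptions for illustration.

```python
# A minimal sketch of a mesh topology as an adjacency map, used to check
# that losing any single node still leaves the remaining nodes connected.
from collections import deque

mesh = {
    "ingest-1": {"ingest-2", "edit-1"},
    "ingest-2": {"ingest-1", "edit-2", "playout"},
    "edit-1":   {"ingest-1", "edit-2"},
    "edit-2":   {"edit-1", "ingest-2", "playout"},
    "playout":  {"ingest-2", "edit-2"},
}

def still_connected(graph, failed):
    """Breadth-first search over the mesh with one node removed."""
    nodes = [n for n in graph if n != failed]
    seen, queue = {nodes[0]}, deque([nodes[0]])
    while queue:
        for peer in graph[queue.popleft()] - {failed}:
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return len(seen) == len(nodes)

for node in mesh:
    print(node, "fails ->", "rest still connected" if still_connected(mesh, node) else "network partitioned")
```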

If we use multiple COTS servers to host instances of the standards converter software, then we have at our disposal a network of standards converters instead of just one. Yes, we could just increase the number of servers in the pool of the monolithic architecture, but without detailed network planning we run the risk of network bottlenecks and congestion, which not only reduces resilience, but has the potential to greatly increase latency.
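
A simple way to picture the pool is a dispatcher that sends each job to the least-loaded converter instance; the host names and load counts below are assumptions for illustration.

```python
# A minimal sketch of spreading conversion jobs across a pool of converter
# instances instead of one monolithic server. Host names and the notion of
# "active jobs" are illustrative assumptions.
pool = {"conv-ingest-1": 2, "conv-ingest-2": 0, "conv-edit-1": 1}  # host -> active jobs

def dispatch(job_name):
    host = min(pool, key=pool.get)      # pick the least-loaded instance
    pool[host] += 1
    print(f"{job_name} -> {host}")
    return host

for clip in ["clip_a.mxf", "clip_b.mxf", "clip_c.mxf"]:
    dispatch(clip)
```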

Using a mesh-type network will also allow broadcasters to build very low latency and highly resilient localized network pools. If the standards converter is being used in the ingest area and editing suites, then it makes sense to keep the servers as close to the physical workflows as possible. This reduces the need for centralized and very costly network switches that connect all the devices together. In the ingest area, any incoming feeds may need to be standards converted as soon as they enter the building, so it makes sense to keep the servers close to the incoming lines. The edit suites may also need access to the standards converters which may well be on the other side of the building. Keeping the servers close to the workflow reduces the need to have high capacity and low latency links to a centralized switch.

Dynamic Bandwidth Allocation

Mesh networks do not have to operate with the same capacity on every link, and with adequate software control, the multiple links can be aggregated together so that short-term high-capacity links can be established when needed.
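
The sketch below illustrates one way such software control might aggregate links: enabling them in capacity order until a short-term demand is met. The link names and capacities are assumptions for illustration.

```python
# A minimal sketch of aggregating several mesh links into a temporary
# high-capacity path when a burst of traffic needs it. Link names and
# capacities (in Gb/s) are illustrative assumptions.
links = {"ingest->edit (a)": 10, "ingest->edit (b)": 10, "ingest->core->edit": 25}

def aggregate(required_gbps):
    """Enable links in capacity order until the short-term demand is met."""
    enabled, total = [], 0
    for name, capacity in sorted(links.items(), key=lambda kv: -kv[1]):
        if total >= required_gbps:
            break
        enabled.append(name)
        total += capacity
    return enabled, total

print(aggregate(30))   # e.g. (['ingest->core->edit', 'ingest->edit (a)'], 35)
```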

Although this method of operation is a significant improvement on monolithic architectures, it could be argued that all we’ve really done is duplicate servers running specific software across different areas of the building and connect them using a highly resilient and configurable mesh network. In this scenario, the intelligence of the design is in the mesh network, but we can go one step further with the COTS server technology, and that is to use microservices, or containers, as they are often called.

Containers can be thought of as subsystems running on COTS servers that have the option, but not the obligation, to connect to other containers operating on the same server. They differ from virtual machines in that a VM hosts its own operating system, whereas all the containers running on a server share the same operating system but within localized environments. A standards converter application could run in one container on a server within its own localized environment, and a color corrector might run on the same server but in a separate container. The container’s image can be copied to multiple servers and, with floating license keys, enabled only when and where it is needed.
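
As a rough illustration of how two independent applications might be started as containers on the same server, the sketch below shells out to the Docker CLI; the image names and environment variables are assumptions for illustration, not real vendor products.

```python
# A minimal sketch of starting two independent processing containers on the
# same COTS server via the Docker CLI. Image names and environment variables
# are illustrative assumptions.
import subprocess

def start_container(name, image, env=None):
    cmd = ["docker", "run", "--detach", "--name", name]
    for key, value in (env or {}).items():
        cmd += ["--env", f"{key}={value}"]
    cmd.append(image)
    subprocess.run(cmd, check=True)

# Standards converter and color corrector share the host OS kernel but
# each runs in its own isolated environment.
start_container("standards-converter-1", "vendor/standards-converter:1.4", {"TARGET_RATE": "25"})
start_container("color-corrector-1", "vendor/color-corrector:2.0")
```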

Functionality Where Needed

The idea of containers, combined with intelligent mesh networks and floating licensing, provides unparalleled resilience and flexibility for broadcasters. A standards converter container can be deployed to multiple servers across the network and, with the floating license, enabled only when and where it is needed.
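
A floating license pool can be pictured as a fixed number of seats that containers must check out before they are allowed to start; the seat count and host names below are assumptions for illustration.

```python
# A minimal sketch of a floating license pool: container instances are only
# activated where a license token can be checked out. The pool size and
# host names are illustrative assumptions.
class LicensePool:
    def __init__(self, feature, seats):
        self.feature, self.seats, self.in_use = feature, seats, set()

    def checkout(self, host):
        if len(self.in_use) < self.seats:
            self.in_use.add(host)
            return True                 # license granted, container may start
        return False                    # pool exhausted, leave container dormant

    def checkin(self, host):
        self.in_use.discard(host)

converter_licenses = LicensePool("standards-converter", seats=2)
for host in ["ingest-srv-1", "edit-srv-3", "mcr-srv-2"]:
    print(host, "enabled" if converter_licenses.checkout(host) else "dormant")
```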

These methodologies further add to the concept of software defined infrastructures, where we treat the whole hardware processing and signal routing estate as a definable matrix of resource. Software applications can be placed at multiple parts of the facility, with integrated control systems connecting and orchestrating the workflows. Using software licensing, whole workflows and pipelines can be configured on-the-fly with as much resilience as the broadcaster needs for each workflow.
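
One way to picture a software defined infrastructure is to describe the pipeline itself as data that the control system places onto available nodes; the stage names and node assignments below are assumptions for illustration.

```python
# A minimal sketch of describing a workflow pipeline as data so a control
# system can place each stage on an available node and reconfigure it
# on-the-fly. Stage names and node assignments are illustrative assumptions.
pipeline = [
    {"stage": "ingest",             "node": "ingest-srv-1"},
    {"stage": "standards-convert",  "node": "ingest-srv-2"},
    {"stage": "color-correct",      "node": "edit-srv-1"},
    {"stage": "deliver-to-playout", "node": "playout-srv-1"},
]

def deploy(pipeline):
    """Walk the pipeline and report where each stage would be orchestrated."""
    for stage in pipeline:
        print(f"placing '{stage['stage']}' on {stage['node']}")

deploy(pipeline)
```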

Resilience also becomes flexible as we no longer need to think of it in binary terms. Different levels of resilience, reflecting the amount of risk the broadcaster is willing to accept, can be applied to different parts of the workflow. As increased resilience is generally proportional to cost, broadcasters have the option of increasing or decreasing resilience as required. For example, a playout system will need the highest resilience due to the live nature of the high-value media that passes through it, but an offline edit suite won’t need as much resilience as it isn’t in the live production chain.
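
Expressed in configuration terms, resilience becomes a per-workflow setting rather than a single binary choice; the redundancy models and relative costs below are assumptions for illustration.

```python
# A minimal sketch of expressing resilience as a per-workflow setting.
# Redundancy models and relative costs are illustrative assumptions.
resilience = {
    "playout":      {"model": "2N",  "relative_cost": 2.0},   # full mirror
    "ingest":       {"model": "N+1", "relative_cost": 1.3},   # one spare node
    "offline-edit": {"model": "N",   "relative_cost": 1.0},   # no redundancy
}

for workflow, policy in resilience.items():
    print(f"{workflow}: {policy['model']} redundancy, ~{policy['relative_cost']}x cost")
```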

Connecting Together

Although this method may seem like the panacea for broadcast infrastructures, there are some quirks that make it very complicated. For example, how does a broadcaster manage the licensing operation? How does the network know where to route flows and aggregate links? And how do the individual apps running in the containers all communicate with each other in a broadcast infrastructure? We need to tell standards-converter-1 to process the feed on incoming line 2 and deliver it to file location 3.
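
The kind of message that would need to flow between the orchestration layer and the containerized apps might look something like the sketch below; the message schema is an assumption for illustration.

```python
# A minimal sketch of the kind of message an orchestration layer might send
# so that the apps in the containers can be addressed and connected.
# The message schema is an illustrative assumption.
import json

job_request = {
    "target_service": "standards-converter-1",
    "source": {"type": "incoming-line", "id": 2},
    "destination": {"type": "file-location", "id": 3},
    "parameters": {"output_rate": "25"},
}

# In practice this would be published to a message broker or service API;
# here we simply serialize it to show the shape of the request.
print(json.dumps(job_request, indent=2))
```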

Several solutions are emerging that solve these challenges with varying degrees of success. Some are proprietary, some are vendor agnostic. NVIDIA’s Holoscan is one example of an open system, as NVIDIA are providing an open framework that can be added to as the broadcaster increases the required resource. Vendors can add their own applications to the ecosystem, and as NVIDIA have a track record of working in critical systems such as medical and finance, the broadcaster has the best of both worlds: a highly resilient system without vendor lock-in.

Monolithic designs with their main and backup workflows have been at the core of the operation of most broadcasters throughout the world. However, to truly leverage the power of IP and its related asynchronous infrastructures, we must take a closer look at distributed systems that not only improve resilience and flexibility but are also laying the foundations for scalable broadcast infrastructures that can be expanded as the broadcaster sees fit.
