Future Technologies: Timing Asynchronous Infrastructures

We continue our series considering technologies of the near future and how they might transform how we think about broadcast, with a technical discussion of why future IP infrastructures may well take a more fluid approach to timing planes.

It’s impossible to deny that television is synchronous: the time-invariant sampling of the video and audio means the viewers’ devices must be synchronized to the broadcaster’s acquisition equipment. But that doesn’t mean the bit in the middle must be synchronous.

For television to maintain its illusion of motion, and for the audio to be glitch and distortion free, the playback devices must play the video and audio back at exactly the same rate at which the camera and microphone acquired the vision and sound. If they don’t, the pictures will stutter and the audio will break up and distort, destroying the immersive experience. If we were to start television again, it’s unlikely there would be any need for frame, field, and line syncs, but with vision, we certainly need to know where the first pixel in each frame resides. From this reference pixel, we can determine where all the other pixels are located in the matrix video display. But we must still synchronize the playback devices’ frame rate to that of the broadcaster.

Frame Rates Can Just Be Similar

There are times when the acquisition and playback frame rates are not the same, such as with an outside broadcast that has its SPG free running with respect to the studio. Here, we traditionally use a frame synchronizer. We assume that the frame rate at the OB is similar to the studio’s because the frame synchronizer operates by occasionally repeating a frame when its input buffer underruns and dropping a frame when it overruns. If the rates are massively different, the OB pictures would show major discontinuities when seen in the studio.
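The repeat-and-drop behavior of a frame synchronizer can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how any particular product works: the function name `simulate_frame_sync`, the two-frame `capacity`, and the tick-by-tick model are all hypothetical choices made for illustration.

```python
from fractions import Fraction


def simulate_frame_sync(input_fps, output_fps, seconds, capacity=2):
    """Count repeated and dropped frames for a free-running input.

    Simplified model (an assumption for illustration): on every output
    tick the synchronizer first stores any frames that arrived since the
    previous tick, then plays one frame out. An empty buffer forces a
    repeat of the previous frame; a full buffer forces a drop.
    """
    input_fps = Fraction(input_fps)
    output_fps = Fraction(output_fps)
    fill = 0              # frames currently waiting in the buffer
    repeated = 0          # output ticks with nothing new to show
    dropped = 0           # input frames discarded on overrun
    arrived_before = 0    # input frames counted up to the previous tick

    n_ticks = int(seconds * output_fps)
    for tick in range(1, n_ticks + 1):
        t = Fraction(tick) / output_fps       # exact time of this output tick
        arrived = int(t * input_fps)          # input frames arrived by time t
        fill += arrived - arrived_before
        arrived_before = arrived

        if fill > capacity:                   # overrun: discard the excess
            dropped += fill - capacity
            fill = capacity

        if fill > 0:                          # normal case: play one frame
            fill -= 1
        else:                                 # underrun: repeat last frame
            repeated += 1
    return repeated, dropped


print(simulate_frame_sync(24, 25, 10))  # (10, 0): input too slow, frames repeat
print(simulate_frame_sync(25, 24, 10))  # (0, 9): input too fast, frames drop
print(simulate_frame_sync(25, 25, 10))  # (0, 0): matched rates, no corrections
```

The deliberately large mismatches above make the effect visible in ten seconds. With the small offsets typical of two free-running generators the corrections become rare, which is why similar rates are good enough.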

Even today, we still need to be concerned with timing, and with the current design of television there’s no real way of getting around this. But we only really need to be concerned with timing when we look at the pictures and listen to the sound, that is, when we observe the data. Other than that, we can process the video and audio asynchronously and be rid of these archaic timing constraints.

That said, this doesn’t help us with processing video in asynchronous infrastructures such as on- or off-prem datacenters and virtualized infrastructures such as public cloud providers, because there is one caveat with asynchronous processing that can’t be ignored. To make sure our pictures and sound can be presented to the viewer in a synchronous fashion, we must use buffers in the processing system. And the more asynchronous components there are in the system, the more buffers are needed.
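To see why every extra buffer matters, the worst-case delay can be estimated by adding up the buffer depths along the chain. The following is a back-of-envelope sketch only: the function name and the per-stage depths are hypothetical values chosen for illustration, and it assumes every buffer is measured in whole frames and happens to be full.

```python
def worst_case_latency_ms(buffer_depths_frames, frame_rate_hz):
    """Latency added if every buffer in the chain is full.

    buffer_depths_frames: depth of each stage's buffer, in frames.
    frame_rate_hz: frame rate of the video being processed.
    """
    frame_period_ms = 1000.0 / frame_rate_hz
    return sum(buffer_depths_frames) * frame_period_ms


# Hypothetical depths for four asynchronous stages (illustrative only).
stage_depths = [2, 3, 1, 4]
print(worst_case_latency_ms(stage_depths, 50))  # 200.0
```

Ten frames of buffering at 50 frames per second is already a fifth of a second, before any processing time is counted.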

Latency Is Inevitable

Buffering is a standard method in asynchronous processes to make sure we don’t lose any data. Think of an Ethernet NIC receiving data from the network into a server: what happens next? The CPU is an asynchronous sub-system within the server for two reasons: it uses an independent oscillator, and the operating system is not locked in any way to the Ethernet stream, which would be impossible anyhow because the network itself is asynchronous. Consequently, the NIC writes the received data into a circular buffer and either sends an interrupt to the host OS to tell it there is data to collect, or the host OS polls the buffer to see if there is any data to collect. The buffer acts as a temporary store so that no data is lost. Without it, the system would have to rely on the CPU and host OS reacting instantaneously to each Ethernet frame as it is received, a virtually impossible task given the many other functions the CPU and host OS are performing. As an aside, many real-time IP video processing servers employ a method called kernel bypass, where the NIC writes the received data directly into memory that the application can read, using a hardware mechanism called DMA (Direct Memory Access), without passing through the operating system’s network stack. Even so, there are still buffers involved.
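The circular (ring) buffer that sits between the NIC and the host can be sketched as follows. This is a minimal single-threaded illustration, not a model of any real driver: the `RingBuffer` class, its method names, and the choice to discard new packets on overrun are assumptions made for the example.

```python
class RingBuffer:
    """Fixed-size circular buffer with one writer and one reader."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._capacity = capacity
        self._head = 0      # next slot the producer (NIC) writes to
        self._tail = 0      # next slot the consumer (host) reads from
        self._count = 0     # packets currently stored
        self.overruns = 0   # packets lost because the buffer was full

    def write(self, packet):
        """Producer side: store a packet, or count an overrun if full."""
        if self._count == self._capacity:
            self.overruns += 1
            return False
        self._slots[self._head] = packet
        self._head = (self._head + 1) % self._capacity  # wrap around
        self._count += 1
        return True

    def read(self):
        """Consumer side: return the oldest packet, or None if empty."""
        if self._count == 0:
            return None
        packet = self._slots[self._tail]
        self._slots[self._tail] = None
        self._tail = (self._tail + 1) % self._capacity  # wrap around
        self._count -= 1
        return packet

    def __len__(self):
        return self._count


# A burst of five packets arrives before the host gets round to polling.
ring = RingBuffer(capacity=4)
for sequence_number in range(5):
    ring.write(sequence_number)

print(ring.overruns)                     # 1: the fifth packet did not fit
print([ring.read() for _ in range(4)])   # [0, 1, 2, 3]
```

The deeper the buffer, the longer the host can be busy elsewhere without losing data, but the longer a packet may wait before it is collected. That trade is the source of the latency discussed here.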

Within the precincts of an on-prem datacenter, broadcasters have much more control of the hardware than with a public cloud infrastructure. SMPTE’s ST2110 requires a dedicated timing plane using Precision Time Protocol (IEEE 1588), and when this is correctly installed with the appropriate network configuration, it can provide sub-microsecond timing accuracy. This is all well and good and provides excellent solutions for broadcasters working in studios and OBs who are particularly concerned about latency, but it does require the installation of specific PTP-aware NICs and network switches. Although it’s possible to install these in the on-prem datacenter, it’s not a viable option for public cloud solutions.
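The need for PTP-aware hardware follows from how PTP estimates clock offset. In its delay request-response exchange, four timestamps are combined on the assumption that the network delay is the same in both directions. The sketch below shows that arithmetic only; the function name and the example timestamps are hypothetical, and a real implementation involves far more than this calculation.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate follower clock offset and one-way path delay.

    t1: master clock time when the Sync message is sent
    t2: follower clock time when the Sync message is received
    t3: follower clock time when the Delay_Req message is sent
    t4: master clock time when the Delay_Req message is received
    All values share one unit (nanoseconds in the examples below).
    Assumes the forward and reverse path delays are equal.
    """
    master_to_follower = t2 - t1
    follower_to_master = t4 - t3
    offset = (master_to_follower - follower_to_master) / 2
    delay = (master_to_follower + follower_to_master) / 2
    return offset, delay


# Follower clock is 500 ns ahead; the path takes 1000 ns in each direction.
print(ptp_offset_and_delay(0, 1500, 2000, 2500))  # (500.0, 1000.0)

# Same 500 ns offset, but queuing makes the path 1200 ns out and 800 ns back.
print(ptp_offset_and_delay(0, 1700, 2000, 2300))  # (700.0, 1000.0)
```

In the second case the estimate is wrong by 200 ns, half the difference between the two directions. Variable queuing in ordinary switches creates exactly this kind of asymmetry, which is why accurate PTP depends on network equipment that accounts for its own delay.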

Dropping Archaic Timing Constraints

We now have two options: do we make all our systems ST2110 compatible and effectively on-prem, or do we remove the need to impose the PTP constraint on the system altogether? The practical answer is that we do both.

In a studio where tight timing constraints are required, PTP with ST2110 is usually the go-to method. But again, there are caveats. Some vendors suggest that nanosecond timing isn’t needed anymore because we no longer need to be concerned with back EMFs on the electromagnetic coils driving the electron beams of camera and monitor tubes, so they have relaxed this constraint with varying degrees of success.

In COTS type infrastructure, in which for the sake of this argument we include the public cloud, removing the timing constraint that demands PTP is an absolute necessity. Although some cloud service providers are making noises about providing precision timing in their systems, the real question is why bother? It’s virtually impossible to make a COTS type server synchronous, so why would you even try? More predictable computing systems do exist, in the form of applications running on Real Time Operating Systems with tightly bounded response times, but employing such devices would undermine one of our motivations for moving to IP, namely that we can use COTS components.

Asynchronous Is Not New

Can we really remove the timing constraints that television is built on and process all the media signals asynchronously? The simple answer is yes, and several vendors are doing this. It’s technology in its infancy, so we have to be patient with the rate of success of these products. But the huge caveat here, and the elephant in the room, is that in processing asynchronously we must accept latency as a fact of life. Latency has always existed, even in synchronous systems; it was simply very low. Now, we’re experiencing latencies of tens of seconds for OTT delivery, and this may well be alright for this type of application.

The point here is that for IP and COTS to really work for broadcasters, we must stop thinking that latency is binary and bad. Our latency needs vary with our applications. A thirty-second delay would be acceptable in an OTT movie service but unusable in a studio. We don’t need to use the same infrastructure in the studio as we do in OTT delivery.

This is something relatively new for broadcasting, as we’ve traditionally imposed the same timing constraints throughout the whole broadcast chain, from glass to glass. We no longer need to do this, as the needs of the viewer have also changed with viewing on mobile devices, and IP with its associated COTS infrastructure is critical in this delivery.

In effect, we need to take a more realistic view of how we treat timing in broadcasting workflows. As long as the pictures and sound are delivered at the correct frame rate and are synchronous to the generating devices within the broadcast workflow, then we can take a more fluid approach to timing planes within IP COTS infrastructures.
