Understanding IP Networks - Forward Error Correction
In the last article, we looked at monitoring packet delay in real time. In this article, we continue the theme of looking at a network from a broadcast engineer’s point of view, so that engineers can better communicate with the IT department, and look at FEC (Forward Error Correction).
IT networks are designed assuming there will be packet loss and delay during transmission. Transmission Control Protocol (TCP), and protocols built on it such as File Transfer Protocol (FTP), resolve these problems by using methods to guarantee delivery of packets.
TCP Guarantees Delivery
TCP uses a windowing system to send a group of packets to the receiver. A data validity check is performed at the receiver, and either an ACK (acknowledge) or NAK (not-acknowledge) is sent back to the sender. If an ACK is received, the next group of packets in the sequence is sent. If a NAK is received, or the sender times out, the original group of packets is resent.
Using this method, TCP guarantees that the sender knows whether or not the file was received by the destination computer. The disadvantage of this system is that it is relatively slow compared to the maximum line speed available. The data transfer rate is also limited by the processing speed of the receiving computer, slowing data rates even further. Increasing window sizes will improve data transfer rates at the expense of delay.
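The window trade-off can be put into rough numbers. The sketch below assumes a deliberately simplified model in which the sender transmits one full window and then waits one round trip for the acknowledgement, so throughput cannot exceed the window size divided by the round-trip time. The function name and the figures are illustrative assumptions, not taken from any particular TCP implementation.

```python
def max_window_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when one window is sent per round trip.

    Simplified model (an assumption): ignores slow start, congestion
    control and retransmissions.
    """
    if rtt_seconds <= 0:
        raise ValueError("round-trip time must be positive")
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    # Illustrative values only: a 64 KiB window over a 20 ms round trip.
    for window in (64 * 1024, 256 * 1024):
        bps = max_window_throughput_bps(window, 0.020)
        print(f"window {window} bytes: about {bps / 1e6:.1f} Mbit/s")
```

Under this model a larger window raises the ceiling on throughput, but the receiver must hold and validate more data before anything is acknowledged, which is where the extra delay comes from.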
Fire and Forget
Because delay is introduced by the windowing system, TCP is a balancing act between delay, data throughput and data validity. Broadcast engineers need a network where the delay, throughput and data validity are predictable and reliable. TCP simply cannot provide this.
UDP (User Datagram Protocol) is the most basic of our internet protocols and sits within an IP wrapper. UDP enhances the IP datagram by adding port numbers to the IP addresses, giving multiple applications access to the same IP address, or computer. UDP is a fire-and-forget protocol: the sender cannot be sure the datagram was received intact. However, UDP does provide two important benefits for us: it operates at near line speed, and there is no protocol delay.
Two of our requirements are satisfied by UDP: delay and throughput. Data validity is a problem as there is no ACK, NAK or timeout within the protocol. SDI also has no ACK, NAK or timeout; however, SDI assumes the network is robust and suffers little data corruption.
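Fire-and-forget is easy to see in code. The following minimal sketch uses Python's standard `socket` module; the function names, payload and loopback address are assumptions chosen so the example is self-contained. Note that the sender returns as soon as the datagram has been handed to the network stack and never learns whether it arrived.

```python
import socket


def send_datagram(payload: bytes, host: str, port: int) -> int:
    """Send one UDP datagram and return at once; no ACK is awaited.

    The return value is only the number of bytes handed to the OS.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


def loopback_demo(payload: bytes = b"video-packet") -> bytes:
    """Send to a receiver on the same machine (hypothetical test setup)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # do not wait forever if it is lost
        port = receiver.getsockname()[1]
        send_datagram(payload, "127.0.0.1", port)
        data, _sender_address = receiver.recvfrom(2048)
        return data


if __name__ == "__main__":
    print(loopback_demo())
```

The port number passed to `sendto` is what lets several applications share one IP address, and the absence of any reply handling in `send_datagram` is the reason there is no protocol delay.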
Packet Loss Catastrophe
UDP and IP packets have a checksum built into their headers. Switches and routers monitor these and will discard any packets with a checksum error. Quite often a video decoder will never receive a corrupt packet to inspect, as the switch will have already discarded it.
Lost packets can be catastrophic for compressed systems. Large GOPs (Groups of Pictures) rely heavily on sequences of video frames being intact; if one part of a frame is corrupt, it can distort the picture for many frames before and after it. Instead of seeing a sparkle in SDI distributions, we see picture breakup and blocking over many frames, even up to a second in extreme cases.
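To illustrate how a header checksum exposes corruption, here is a minimal sketch of the 16-bit ones' complement checksum described in RFC 1071, the arithmetic used by IP and UDP. It is simplified: a real UDP checksum also covers a pseudo-header, and the `is_intact` helper is a hypothetical name introduced for this example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over the given bytes (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def is_intact(data: bytes, stored_checksum: int) -> bool:
    """Hypothetical receiver-side check: recompute and compare."""
    return internet_checksum(data) == stored_checksum


if __name__ == "__main__":
    packet = bytes.fromhex("0001f203f4f5f6f7")
    checksum = internet_checksum(packet)
    corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]  # flip one bit
    print(is_intact(packet, checksum), is_intact(corrupted, checksum))
```

A device that finds a mismatch simply drops the packet, so from the decoder's point of view corruption and loss look the same: the packet never arrives.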
Hamming Codes
SMPTE have been rolling out their 2022 standards since 2007. Realizing early on that IP was going to be a formidable player, they have been looking very closely at how we can make our synchronous distribution systems work with IT’s asynchronous networks.
SMPTE 2022-1 “Forward Error Correction for Real-Time Video/Audio Transport Over IP Networks” tackles the data validity issue so we can reliably distribute video and audio over IP.
FEC has existed in telecommunications for many years and is based on the original work of the mathematician Richard Hamming, who invented the first error-correcting code in 1950, known as the Hamming code. XOR gates feature heavily in implementations, as they form a simple and reliable method for computing parity within data words and for locating and fixing errors.
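As an illustration of how XOR parity can both locate and fix an error, the sketch below implements the textbook Hamming(7,4) code, which protects four data bits with three parity bits. The function names and bit ordering are choices made for this example; this is not the scheme used by SMPTE 2022-1, which works on whole packets.

```python
def hamming74_encode(data_bits: list) -> list:
    """Encode 4 data bits into a 7-bit codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword: list) -> tuple:
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    sent = hamming74_encode([1, 0, 1, 1])
    received = list(sent)
    received[4] ^= 1  # simulate a single-bit error in transit
    print(hamming74_decode(received))
```

The three parity checks together spell out the position of the faulty bit in binary, which is why a handful of XOR operations is enough to repair a single error without asking the sender for anything.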
Reduce Signal to Noise Ratio
The term forward error correction describes where in the system errors are detected and corrected: at the receiver. The sender has no knowledge of whether an error occurred during transmission. Instead, it assumes some errors will occur and inserts FEC information into the stream, allowing the receiving equipment to detect errors. If the errors are not too complex and are within the correction power of the system, that is, the region where the input errors are greater than the output errors, they can be fixed.
Diagram showing the matrix layout of FEC detection and correction. If only one dimension was used, then only a single packet loss in the window would be detected. Using two dimensions, that is rows and columns, many more packet losses can be detected and corrections made. For example, if P2 and P3 fail, the loss can be detected and corrected by FEC packets ROW1, COL2 and COL3.
Like most error correction systems, FEC works by adding extra data bits to the information being protected. This looks like we are reducing the data throughput of the system, as we are creating redundant bits that do not form part of the video or audio streams. However, adding error correction allows a system to tolerate a lower signal-to-noise ratio. In networks this results in higher line speeds, as we can send more bits down a cable, in effect increasing the bandwidth and data throughput.
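The row and column idea can be sketched with plain XOR parity. The example below is a simplified model, not the SMPTE 2022-1 packet format: it assumes equal-length packets, one XOR parity packet per row and per column, and that the FEC packets themselves arrive intact. All function names are hypothetical, and the numbering of P2 and P3 as the second and third packets of the first row is an assumption about the diagram.

```python
from functools import reduce


def xor_packets(packets):
    """Byte-wise XOR of equal-length packets."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*packets))


def make_fec(matrix):
    """Return (row FEC packets, column FEC packets) for a packet matrix."""
    row_fec = [xor_packets(row) for row in matrix]
    col_fec = [xor_packets(col) for col in zip(*matrix)]
    return row_fec, col_fec


def recover(matrix, row_fec, col_fec):
    """Fill in lost packets (None) in place; return True if all were recovered."""
    rows, cols = len(matrix), len(matrix[0])
    progress = True
    while progress:  # repeat, since one repair can make another possible
        progress = False
        for r in range(rows):
            missing = [c for c in range(cols) if matrix[r][c] is None]
            if len(missing) == 1:  # a row can repair exactly one loss
                present = [p for p in matrix[r] if p is not None]
                matrix[r][missing[0]] = xor_packets(present + [row_fec[r]])
                progress = True
        for c in range(cols):
            missing = [r for r in range(rows) if matrix[r][c] is None]
            if len(missing) == 1:  # a column can repair exactly one loss
                present = [matrix[r][c] for r in range(rows)
                           if matrix[r][c] is not None]
                matrix[missing[0]][c] = xor_packets(present + [col_fec[c]])
                progress = True
    return all(p is not None for row in matrix for p in row)


if __name__ == "__main__":
    # 3 rows x 4 columns of dummy 8-byte packets, P1..P12 in row order.
    original = [[bytes([r * 4 + c + 1] * 8) for c in range(4)] for r in range(3)]
    row_fec, col_fec = make_fec(original)
    received = [list(row) for row in original]
    received[0][1] = None  # lose P2
    received[0][2] = None  # lose P3
    print(recover(received, row_fec, col_fec), received == original)
```

With two losses in the same row, the row parity alone cannot help, but each lost packet is the only gap in its own column, so the column parity restores both. Four losses at the corners of a rectangle defeat this simple scheme, because every affected row and column then has two gaps.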
High Signal Throughput
SMPTE 2022-1 FEC provides row and column correction at the packet level to form a two-dimensional matrix. The specification allows engineers to vary the size of the matrix to provide differing levels of detection and correction. Changing the number of rows and columns trades error correction against data overhead and the processing needed in the decoder. Engineers will need to fine tune their FEC parameters to determine the best compromise for their network.
The specification limits the matrix to 1 to 20 columns and 4 to 20 rows, and the product of rows and columns must not exceed 100, resulting in a relatively small matrix. This is fine for correcting short bursts but will cause problems if significant congestion occurs on the network.
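A quick calculation helps when tuning the matrix. The sketch below assumes one FEC packet per column and, in two-dimensional mode, one per row, each the same size as a media packet; the function name is hypothetical. It enforces only the row and column ranges quoted above.

```python
def fec_overhead(columns: int, rows: int, two_dimensional: bool = True) -> float:
    """FEC packets sent per media packet for a columns x rows matrix.

    Assumption: one FEC packet per column, plus one per row when
    two-dimensional protection is enabled.
    """
    if not 1 <= columns <= 20:
        raise ValueError("columns must be between 1 and 20")
    if not 4 <= rows <= 20:
        raise ValueError("rows must be between 4 and 20")
    media_packets = columns * rows
    fec_packets = columns + (rows if two_dimensional else 0)
    return fec_packets / media_packets


if __name__ == "__main__":
    for columns, rows in ((5, 5), (10, 8)):
        both = fec_overhead(columns, rows)
        column_only = fec_overhead(columns, rows, two_dimensional=False)
        print(f"{columns} x {rows}: {both:.1%} (2D), {column_only:.1%} (column only)")
```

Under these assumptions a 5 x 5 matrix adds 40% extra traffic in two-dimensional mode, while a 10 x 8 matrix adds 22.5%. The smaller matrix spends more bandwidth on parity per media packet, while the larger one spreads each parity packet over more media packets and so must buffer more before it can repair a loss.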
In summary, FEC allows some tolerance for data loss and corruption without having to provide a reverse tally connection to signal data validity, thus maintaining high data throughput and integrity.