Broadcast For IT - Part 16 - Video Compression
To deliver efficient media solutions, IT engineers must be able to communicate effectively with broadcast engineers. In this series of articles, we present the most important topics in broadcasting that IT engineers must understand. Here, we look at compression: why we need it and how we use it.
Television signals continually consume copious amounts of data. Typically, an interlaced HD transmission distributed over an SDI network consumes data at a rate of 1.485Gbits/s. A progressive HD signal uses 2.97Gbits/s, and 8K signals use more than 100Gbits/s.
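As a rough sanity check, the 1.485Gbits/s figure can be derived from the HD-SDI raster itself. The minimal sketch below assumes a 1080-line raster of 2200 total samples per line and 1125 total lines, with 10-bit 4:2:2 sampling (one luma word plus one multiplexed chroma word per sample).

```python
# A minimal sketch deriving the nominal HD-SDI bit rate from the
# 1080i raster geometry: 2200 total samples per line, 1125 total
# lines, 10-bit 4:2:2 sampling (luma word + chroma word per sample).
samples_per_line = 2200
lines_per_frame = 1125
frames_per_second = 30        # 29.97 (30/1.001) for NTSC-derived rates
bits_per_sample = 10 * 2      # 10-bit luma word + 10-bit chroma word

bit_rate = samples_per_line * lines_per_frame * frames_per_second * bits_per_sample
print(f"{bit_rate / 1e9:.3f} Gbit/s")  # 1.485 Gbit/s
```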
Baseband is the term used to describe a signal that leaves the camera uncompressed. This is a slightly misleading description, as all television signals are compressed by the camera at the point of capture. That is, we take an infinitely varying scene and sample it at 29.97 or 25 frames per second, and then split the light into red, green and blue channels ready for digitization and quantization.
Each of these steps is technically a form of compression. However, for historical reasons, a signal that has gone through this process without further bit rate reduction is referred to as “baseband”.
Constant Bit Rate is Evenly Gapped
Furthermore, video signals are inherently periodic. The frame rate provides regular data at 29.97 or 25 Hertz. This is evened out by SDI to provide a constant bit rate (CBR) stream of data that is evenly gapped throughout each second.
Broadcast engineers prefer to work with baseband signals as they suffer the least degradation and the fewest concatenation errors when processed.
Compression is the process of taking a baseband signal and further reducing the bit rate without noticeable degradation of vision or sound. More advanced systems change the distribution to variable bit rate (VBR) to create bursty data distributions and improve efficiency over packet transport networks.
Compression is a Three-Step Process
Video signals use three methods to achieve compression: removing surplus synchronizing information, intra-frame (spatial) reduction, and temporal (inter-frame) reduction.
In modern workflows, a video signal may be compressed and decompressed many times throughout the broadcast chain. Depending on the compression used, this process can be highly destructive, leading to the concept of concatenation error. Video compression tends to be lossy, so repeatedly compressing and decompressing a signal leads to excessive noise and distortion.
In highly optimized systems, concatenation may not be immediately obvious, but it will be seen when one too many compression and decompression cycles are completed, leading to the cliff-edge effect.
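Generation loss is easy to demonstrate: re-encode the same image through a lossy codec repeatedly and watch the error accumulate. The sketch below uses JPEG via the Pillow library purely as a stand-in for a broadcast codec; the synthetic test image and quality setting are arbitrary assumptions.

```python
# A sketch of concatenation (generation) loss: repeatedly re-encode an
# image through a lossy codec and measure how far it drifts from the
# original. JPEG stands in for any lossy broadcast compression here.
import io
import numpy as np
from PIL import Image

def psnr(a, b):
    """Peak signal-to-noise ratio between two 8-bit images, in dB."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return 10 * np.log10(255 ** 2 / mse)

# An arbitrary synthetic test frame: a smooth ramp with added noise.
rng = np.random.default_rng(0)
ramp = np.linspace(0, 255, 256 * 256).reshape(256, 256)
original = (ramp + rng.normal(0, 10, ramp.shape)).clip(0, 255).astype(np.uint8)

frame = original
for generation in range(1, 6):
    buf = io.BytesIO()
    Image.fromarray(frame).save(buf, format="JPEG", quality=75)  # lossy encode
    buf.seek(0)
    frame = np.asarray(Image.open(buf))                          # decode again
    print(f"generation {generation}: {psnr(original, frame):.2f} dB vs original")
```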
Removing Syncs Reduces Bit Rates by 20%
Due to the historic operation of broadcast systems, a significant part of the signal is devoted to line, frame, and field synchronizing pulses. Typically, this synchronizing and blanking information uses nearly twenty percent of the bandwidth of an SDI signal. This is a holdover from the NTSC and PAL systems, where long pulses were needed to limit the back-EMF induced in the electromagnetic scan coils of a cathode ray tube television.
True digital systems can dispense with these pulses and replace them with unique code values. A saving of nearly twenty percent of the video data rate can easily be achieved using this method.
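The scale of that saving can be estimated from the raster geometry alone, since only the active picture area carries image data. A minimal sketch, assuming a 1920 x 1080 active picture inside the 2200 x 1125 total HD raster:

```python
# A sketch estimating how much of an HD-SDI signal is blanking and
# synchronizing overhead rather than active picture, assuming a
# 1920 x 1080 active raster inside the 2200 x 1125 total raster.
active_samples, active_lines = 1920, 1080
total_samples, total_lines = 2200, 1125

active_fraction = (active_samples * active_lines) / (total_samples * total_lines)
print(f"active picture: {active_fraction:.1%}")      # ~83.8%
print(f"blanking/syncs: {1 - active_fraction:.1%}")  # ~16.2%, in the region
                                                     # of the figure quoted above
```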
Once the sync information has been removed, a compression system will further reduce the bit rate using intra-frame reduction. This method starts by dividing each frame into 8 x 8 blocks of pixels and then performing a specialized relative of the Fourier transform on them called the DCT (Discrete Cosine Transform).
Discrete Cosine Transforms
The DCT provides coefficients that represent discrete spatial frequencies within each 8 x 8 block. Because natural images contain large areas of closely related values, most of the energy concentrates in a few low-frequency coefficients. Quantizing the coefficients to fit within pre-defined values starts the compression process, and applying coarser quantization and normalization parameters leads to greater compression at the cost of picture quality.
Diagram 2 – In intra-frame compression, DCT functions are applied to 8 x 8 pixel blocks to convert from the spatial domain to the frequency domain.
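The transform and quantization stages can be sketched in a few lines using SciPy's DCT routines. The smooth test block and the uniform quantization step of 16 below are arbitrary illustrations; real codecs use standardized quantization matrices.

```python
# A sketch of the intra-frame transform stage: a 2D DCT on one 8 x 8
# block, followed by crude uniform quantization. Requires NumPy/SciPy.
import numpy as np
from scipy.fft import dctn, idctn

x = np.arange(8)
block = np.add.outer(x, x) * 8.0          # a smooth diagonal gradient block

coeffs = dctn(block - 128, norm="ortho")  # spatial -> frequency domain
quantized = np.round(coeffs / 16) * 16    # coarse quantization zeroes most
                                          # high-frequency coefficients
reconstructed = idctn(quantized, norm="ortho") + 128

print("non-zero coefficients:", np.count_nonzero(quantized), "of 64")
print(f"max pixel error: {np.abs(block - reconstructed).max():.1f}")
```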
After intra-frame compression is complete, the compressor moves on to temporal, or motion compensated, compression. This method looks for recurring patterns between subsequent frames and, instead of sending similar pixel values for consecutive frames, will send just one value for several frames.
Motion compensated processing (MCP) is extremely complex and can analyze up to 60 frames simultaneously to determine common motion between frames, sending vector representations instead of absolute pixel values. This is best seen when a ball is thrown into the air.
Vector Representation is More Efficient
Motion compensation will analyze each frame in turn and pick out the ball. Instead of sending the pixel values of the ball, the MCP will provide a vector representation of the ball-object. In the extreme, and with the right picture content, this provides a great deal of bit-rate reduction.
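A minimal sketch of the underlying block-matching idea is shown below, using an exhaustive search over a small window. Real encoders use far faster hierarchical and predictive searches; the frame size, search range, and toy "ball" content are assumptions for illustration.

```python
# A sketch of exhaustive block matching for one 8x8 block. For each
# candidate offset within the search window, measure the sum of absolute
# differences (SAD) against the previous frame; the best offset is the
# motion vector for that block.
import numpy as np

def motion_vector(prev_frame, curr_frame, top, left, search=8, block=8):
    """Return (dy, dx) minimizing SAD for the block at (top, left)."""
    target = curr_frame[top:top+block, left:left+block].astype(int)
    best_sad, best_vec = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > prev_frame.shape[0] \
                    or x + block > prev_frame.shape[1]:
                continue  # candidate block falls outside the frame
            candidate = prev_frame[y:y+block, x:x+block].astype(int)
            sad = np.abs(target - candidate).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_vec = sad, (dy, dx)
    return best_vec

# A toy "ball" that moves 3 pixels to the right between frames.
prev = np.zeros((64, 64), dtype=np.uint8)
prev[20:28, 20:28] = 255
curr = np.zeros_like(prev)
curr[20:28, 23:31] = 255
print(motion_vector(prev, curr, top=20, left=23))  # (0, -3): came from 3 px left
```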
MCP only generates data when there is movement in a scene, which further adds to the burstiness of a stream, creating VBR data and enhancing distribution over packet networks.
MPEG - Moving Picture Experts Group
Broadcasters have been using digital compression since the early 1990s. MPEG (Moving Picture Experts Group) provided the first commercial systems, and MPEG-2 was widely adopted; it is still used today in satellite and terrestrial transmissions, but advances in MPEG-4 and HEVC (High Efficiency Video Coding) are starting to make an impact.
MPEG-2 has two fundamental forms of operation: intra-frame and inter-frame. Intra-frame is frame-only compression using DCTs on the 8 x 8 pixel blocks. This is often referred to as “I-Frame” compression. It is often used in archiving and editing, as this type of compression reduces the possibility of the concatenation errors found with multiple edits.
High Compression Efficiency
Inter-frame is the process of motion compensation. Although this can provide some fantastic compression ratios, it is also highly destructive of the original video signal. Concatenation errors soon appear in the form of picture stutter and break-up if the available bit rate is heavily restricted, especially with scenes containing fast, dynamic movement.
MPEG-2 inter-frame compression is usually reserved for transmission to home viewers, as the picture will only need to be reconstructed once.
Diagram 3 – Inter-frame compression achieves lower bit rates using “B” (bi-directional) and “P” (predicted) frames to apply motion compensation and greatly reduce the data between I-Frames. This comes at the expense of concatenation errors during multiple compression and decompression cycles. Such errors are less evident in intra-frame-only compression, but at the expense of higher bit rates.
Group of Pictures (GOP) is the term used to describe the construction of a motion compensated MPEG-2 stream. Each group starts with an I-Frame, which acts as an anchor frame to provide a reference for the compressed inter-frames that follow. Without it, we would just see partially formed images building up over many seconds. Terms such as 12-GOP and 30-GOP describe the length of each group, that is, the number of frames from one I-Frame to the next.
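The display-order frame types of a GOP can be sketched as below. The IBBP pattern shown (a P-frame every third frame, with B-frames between) is a common MPEG-2 choice, not the only legal one.

```python
# A sketch generating the display-order frame types of an MPEG-2 style
# GOP, assuming the common IBBP pattern; real encoders may choose
# other structures.
def gop_pattern(gop_length=12, p_spacing=3):
    frames = []
    for i in range(gop_length):
        if i == 0:
            frames.append("I")   # anchor frame: self-contained
        elif i % p_spacing == 0:
            frames.append("P")   # predicted from the previous I or P frame
        else:
            frames.append("B")   # bi-directionally predicted
    return "".join(frames)

print(gop_pattern(12))  # IBBPBBPBBPBB
```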
Null-Padding is a Cheat
Although GOP streams natively create VBR data, they can be configured to create CBR streams. In this instance, a CBR stream is just a VBR stream with null-padding data added to it. In effect, this creates wasteful data, and selecting the correct mode is the responsibility of the engineer configuring the system.
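The padding mechanism itself is simple. In an MPEG transport stream, null packets are 188-byte packets carrying the reserved PID 0x1FFF, which decoders simply discard. A minimal sketch of topping up one interval's packets to a fixed CBR budget (the packet counts are arbitrary assumptions):

```python
# A sketch of padding a VBR transport-stream burst up to a constant
# per-interval packet budget using null packets (PID 0x1FFF).
TS_PACKET = 188

def make_null_packet():
    pkt = bytearray(TS_PACKET)
    pkt[0] = 0x47                  # sync byte
    pkt[1], pkt[2] = 0x1F, 0xFF    # 13-bit PID 0x1FFF = null packet
    pkt[3] = 0x10                  # payload only, continuity counter 0
    return bytes(pkt)

def pad_to_cbr(packets, packets_per_interval):
    """Top up one interval's worth of packets to the CBR budget."""
    shortfall = packets_per_interval - len(packets)
    return packets + [make_null_packet()] * max(0, shortfall)

burst = [b"\x47" + bytes(187)] * 900   # a quiet interval: 900 real packets
cbr = pad_to_cbr(burst, packets_per_interval=1128)
print(len(cbr), "packets")             # always 1128 packets per interval
```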
A system that converts from one compressed format to another, for example 525i29.97-12GOP at 25Mbits/sec to 525i29.97-25GOP at 5Mbits/sec, is referred to as “transcoding”. Modern transcoders tend to be software applications as they process either files or live streams provided on IP/Ethernet links.
Codecs or Transcoders?
Converting from SDI to a compressed format, and then back to SDI, is performed by a “codec”. Codecs tend to be hardware devices, as specialist electronics are needed to convert the SDI signal into data streams that can be processed in software. SDI-PCI cards are now found in computer servers, allowing x86-type computer architectures to be used as codecs. However, the specification of the server is critical due to the high-bandwidth data channels needed to move data between the SDI card, CPU, memory, and disk storage.
Video compression is a highly specialized discipline and many of the controls are inter-dependent. Modern transcoders provide many outputs of differing bit rates to meet the requirements of delivery to multiple devices such as cell phones and tablets, as well as traditional televisions. Migrating to IP is accelerating the need to understand and deliver highly optimized video compression systems.