Understanding Compression Technology. Part 1.

Over the decades from our industry’s move from analog to digital based technology, codecs (encoders and decoders) have played a key role in the capture, editing, and distribution of video media. The MPEG-2 codec has supported all three tasks. Camera oriented codecs have moved from MPEG-2 to MPEG-4 (H.264). Distribution has followed two paths. Television program distribution continues to employ MPEG-2. Internet distribution immediately went to progressive video encoded using H.264 because the codec is about 2X more efficient than MPEG-2. The newest distribution format is H.265 or HEVC (high efficiency video codec).

While the details of compression change over sequential implementations, there are seven aspects of MPEG-2 and H.264/H.265 technology that are fundamental to compression: Macroblocks, DCT, Quantization, Lossless Compression, Motion Estimation, Predicted Frames, and Difference frames. Part 1 of this article will cover macroblocks, DCT, and quantization. Lossless compression and motion estimation and will be covered in Part 2. Part 3 will detail the critical role of predicted frames and difference frames.

MPEG-2, H.264 & H265 can be compressed (encoded) two ways. The simplest, is intra-frame encoding. Here each video frame is compressed without any dependence on any other frame. (Both the DV and HEVC/AVC-Intra codecs use intra-frame compression.) The more typical form of compression is inter-frame encoding because it is far more efficient. Inter-frame encoding employs an “I-frame” plus “P-frames” and usually “B-frames.” An I-frame is compressed using the same process as is use by intra-frame encoding.

Video frames may need to be chroma down-sampled to 4:2:0 prior to encoding. Low-pass filtering then reduces noise within each video frame. Both types of filtering are mild forms of lossy compression.

Encoding a video frame begins with the macroblock in the upper-left corner. MPEG-2 encoding starts with a video frame partitioned into non-overlapping 16x16 pixel blocks called macroblocks” Assuming 4:2:0 color-sampling, the amount of Red chroma (Cr) data is one-quarter the amount of luminance (Y) data—thus one 8x8 block of Cr data is pulled from a macroblock. Likewise, one 8x8 block of Blue chroma (Cb) data are pulled from a macroblock. Each luminance macroblock is subdivided into four 8x8 blocks thereby yielding a total of six 8x8 blocks. (H.264 employs an adaptive scheme. Small blocks are employed when there is extensive detail while large blocks are used when there are few details—for example an area of blue sky).

A Discrete Cosine Transform (DCT) is then applied to the each of the six 8x8 data blocks. A DCT is 2D Fast Fourier Transform (FFT) that inputs an 8x8 pixel-block and transforms it from the spatial domain to the frequency domain. In other words, pixel information is converted to a mathematical (cosine coefficients) as represented by the grayscale in the figure below.

Image 1: Pixel information is converted into a mathematical representation of cosine coefficients as represented by the grayscale in the figure below.

The 64-coefficients come from an 8x8 block of pixels where the upper-left cell represents the DC (zero-frequency) value. From left-to-right and top-to-bottom these values represent an increase in signal frequency, which we understand to represent an increase in image detail. To this point, encoding has been mathematically lossless.

The next encoding stage, quantization, performs lossy compression. During quantization the coefficient matrix from a DCT is multiplied by a pre-defined “quantization matrix.” (A quantization matrix determines the amount of compression to be applied.) The matrix favors coefficients toward the upper-left of the coefficient matrix. Those coefficients that are out of favor typically become zero and thus generate very little data.

Image 2: The next encoding stage, illustrated above, performs a lossy compression. During quantization the coefficient matrix from a DCT is multiplied by a pre-defined “quantization matrix.

Lossless data reduction, Variable Length Coding (VLC) and Run Length Coding (RLC), follow quantization. Variable length coding identifies the most frequent patterns in the quantized coefficients. They are then represented by codes defined by only a few bits. Less frequent patterns, conversely, are represented by codes that require more bits.

Next, the set of numeric values resulting from VLC are run length encoded. RLC generates a unique code that represents a repeating pattern found in the output from the VLC encoding stage. For example, a “run” of 31 zero values—as seen in the figure above—becomes a unique code to which a repeat count is appended.

All the data from the RLC stage are then stored. Remember that the DCT, quantization, VLC, and RLC must be completed for all six 8x8 pixel-blocks before a new video-block can be processed.

You might also like...

HDR & WCG For Broadcast: Part 3 - Achieving Simultaneous HDR-SDR Workflows

Welcome to Part 3 of ‘HDR & WCG For Broadcast’ - a major 10 article exploration of the science and practical applications of all aspects of High Dynamic Range and Wide Color Gamut for broadcast production. Part 3 discusses the creative challenges of HDR…

IP Security For Broadcasters: Part 4 - MACsec Explained

IPsec and VPN provide much improved security over untrusted networks such as the internet. However, security may need to improve within a local area network, and to achieve this we have MACsec in our arsenal of security solutions.

Standards: Part 23 - Media Types Vs MIME Types

Media Types describe the container and content format when delivering media over a network. Historically they were described as MIME Types.

Six Considerations For Transitioning To Cloud Based Video Distribution

There are many reasons why companies are transitioning from legacy video distribution workflows to ones hosted entirely in the public cloud, but it’s not a simple process and takes an enormous amount of planning. Many potential pitfalls can be a…

IP Security For Broadcasters: Part 3 - IPsec Explained

One of the great advantages of the internet is that it relies on open standards that promote routing of IP packets between multiple networks. But this provides many challenges when considering security. The good news is that we have solutions…