Video Compression is About to Get Faster, Higher, and Stronger
As both the amount of content and the number of ways to deliver it increase, compression will remain a fact of life.
IP is taking the broadcast world by storm, and bandwidth considerations, coupled with a drive for ever-higher picture resolution, are taking center stage. Being able to push an ever-higher number of streams down a 10Gbps (or even 1Gbps) fiber line means that fewer lines need to be leased and that remote locations with a decent internet connection can partake in what is most important for the broadcast sector: contribution, remote production via IP, and offline video editing/production.
Compression is The Way to Go
IP links have a nasty habit of offering fixed bandwidths, with 10Gbps and 1Gbps being the most widespread offerings at the moment. Knowing that an uncompressed HD-SDI signal has a bandwidth of 1.57Gbps when encapsulated as SMPTE ST2022-6, dividing 10Gbps by this figure reveals that only six HD-SDI camera signals (streams) can be transported simultaneously. With 4K (or even 8K) and HDR on the horizon, the number of concurrent streams dwindles.
A compression ratio of 4:1 can turn a 48Gbps ultra-HD 300fps High Frame-Rate video signal—which may come sooner than expected by most—into a 12Gbps stream, making its use on future 100Gbps networks far more reasonable.
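The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the helper names are made up for this example, and the calculation ignores any protocol overhead beyond the 1.57Gbps figure quoted above.

```python
import math

HD_SDI_GBPS = 1.57  # uncompressed HD-SDI as SMPTE ST2022-6, figure from the text


def max_streams(link_gbps: float, stream_gbps: float) -> int:
    """Whole streams that fit in a link, ignoring any extra overhead."""
    return math.floor(link_gbps / stream_gbps)


def compressed_gbps(source_gbps: float, ratio: float) -> float:
    """Data rate after compression at ratio:1."""
    return source_gbps / ratio


print(max_streams(10, HD_SDI_GBPS))   # 6 uncompressed HD-SDI streams on 10Gbps
print(compressed_gbps(48, 4))         # 12.0 Gbps for the 48Gbps example at 4:1
print(max_streams(10, compressed_gbps(HD_SDI_GBPS, 4)))  # 25 HD-SDI streams at 4:1
```

Under these simplified assumptions, a 4:1 ratio lifts the capacity of a 10Gbps link from six HD-SDI streams to 25.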
The combination of meeting bandwidth constraints while preserving image quality is called “mezzanine compression.” A mezzanine compression protocol must allow for multiple encoding and decoding passes as the video goes through the pipeline, as well as for additional processing and editing, without loss. Mezzanine compression ratios typically range from 2:1 to 10:1.
In the light of past experiences with formats like MP3, compression is still widely considered a technology that, although very effective at squeezing data sizes, comes at the cost of an irreversible loss of quality. This is why most image compression standards available today or announced for the near future are either presented as being “visually lossless” or else allow operators to select a lossless encoding/decoding approach.
Nothing should therefore stop even the most exacting broadcaster from at least trying compression on their contribution and remote production video content. Currently, there are several approaches to choose from: JPEG2000 (a.k.a. J2K), VC-2, intoPIX Tico and—drum roll, please—soon also JPEG XS and HT-J2K.
Compression’s Triple Constraint
As mentioned, when offered a sensible approach with no obvious visible effect on quality, most actors on the A/V scene are willing to embrace video compression for reasons of bandwidth and storage space.
The main obstacle to a more widespread adoption of J2K in the broadcast sector has so far been the perceived computational complexity of its underlying block coding algorithm. More and more users are, however, coming around to the idea that its benefits by far outweigh any ostensible drawbacks.
It is important to understand that three factors are at play when a video signal is compressed and unpacked by a codec: Quality, Latency and Bandwidth. These interact in a way that constitutes a triple constraint:
Figure 1 illustrates the no-free-lunch restrictions imposed by compression technology. There will always be tradeoffs between quality, latency and bandwidth.
There is always a trade-off—boosting one factor affects at least one of the other two:
- If you want top quality, the codec needs to perform a lot of intricate calculations. This slows the computational process down—so the latency increases.
- If you focus on quality, you probably choose an algorithm that is not overly aggressive. And so the resulting data stream (or file size) is still relatively large and eats up more bandwidth.
- To squeeze 12 HD-SDI streams into a 10Gbps link, bearing in mind that you still need some “room” for audio and control streams, a stronger compression ratio is required, which is bound to increase latency because of the more intense processing. In some cases, this also affects picture quality. And we haven’t even touched on what happens during a series of encoding/decoding passes (see below).
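As a rough illustration of the last point, the minimum compression ratio for a given number of streams can be estimated as follows. The function name and the 10% share set aside for audio and control traffic are assumptions made for this sketch, not figures from any standard or vendor.

```python
HD_SDI_GBPS = 1.57  # uncompressed HD-SDI as SMPTE ST2022-6, figure from the text


def required_ratio(n_streams: int, stream_gbps: float, link_gbps: float,
                   reserved_fraction: float = 0.10) -> float:
    """Minimum compression ratio (x:1) needed to fit n_streams into a link.

    reserved_fraction is the share of the link kept free for audio and
    control streams; the 10% default is an assumed, illustrative value.
    """
    if not 0.0 <= reserved_fraction < 1.0:
        raise ValueError("reserved_fraction must be in [0, 1)")
    usable_gbps = link_gbps * (1.0 - reserved_fraction)
    return n_streams * stream_gbps / usable_gbps


ratio = required_ratio(12, HD_SDI_GBPS, 10)
print(f"{ratio:.2f}:1")  # 2.09:1
```

With this assumed headroom, twelve HD-SDI streams need a little over 2:1, which sits at the gentle end of the mezzanine range mentioned earlier.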
Which Compression Codec Should I Choose?
Different vendors will tell you that they have found a way of breaking the triple-constraint barrier, or else that compression ratios (i.e. the ratio of the data rate of the uncompressed video to that of the compressed video) in excess of 4:1 have a noticeable effect on a stream’s quality and on the encoding/decoding latency. They may have a point: JPEG2000 codecs typically have an end-to-end latency of 80ms when working at a 10:1 compression ratio. The Tico codec can deliver an 8:1 ratio with a (selectable) latency of 3 to 11 lines at the decoder and 6 to 18 lines at the encoder. A VC-2 codec, for its part, has a latency of 2ms when used at a compression ratio of 4:1.
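Latencies quoted in lines and latencies quoted in milliseconds are hard to compare directly. The sketch below converts one into the other under stated assumptions: a 1080p60 signal with 1125 total lines per frame (active picture plus blanking), and encoder and decoder line counts that simply add up. Both the function name and the additive model are illustrative, and real codec latency depends on the implementation.

```python
def lines_to_ms(lines: float, fps: float = 60.0, total_lines: int = 1125) -> float:
    """Convert a latency given in video lines to milliseconds.

    Assumes one line lasts 1 / (fps * total_lines) seconds; the defaults
    describe 1080p60 with 1125 total lines per frame (an assumption here).
    """
    line_time_s = 1.0 / (fps * total_lines)
    return lines * line_time_s * 1000.0


# Tico figures from the text: 6 to 18 lines (encoder), 3 to 11 lines (decoder)
best = lines_to_ms(6 + 3)
worst = lines_to_ms(18 + 11)
print(f"{best:.2f} ms to {worst:.2f} ms")  # 0.13 ms to 0.43 ms
```

On these assumptions, a line-based latency of a few dozen lines stays well below one millisecond.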
Unlike MPEG-2 encoding, JPEG2000 induces a one-off compression loss during the first pass, which is good news, because there will be no additional degradation. Plus, the Discrete Wavelet Transform (DWT) principle it employs can be reversed, meaning that—as long as the filters used do their job without truncation—the original picture quality can be almost entirely restored at the receiving end (decoder).
The most popular video codecs currently available are:
- JPEG2000 (J2K): By far the most widely used wavelet-based codec. The standardized filename extension is .jp2 for ISO/IEC 15444-1 compliant files and .jpx for the extended part-2 specifications, published as ISO/IEC 15444-2. Offers a compression ratio of up to 10:1.
- VC-2: A variant of Dirac, an older codec designed by the BBC some years ago, but much simpler and with a much lower latency. Offers a compression ratio of ~4:1.
- Tico (by intoPIX): Lightweight mezzanine codec (SMPTE RDD35), which is supported by the Tico Alliance. intoPIX is also the proponent and co-developer of the JPEG XS technology (along with Fraunhofer IIS), see below.
All of the above technologies have their advantages and drawbacks, but they are certainly broadcast-grade. Given the rapidly rising bandwidth requirements with the advent of the 4K and 8K resolutions, however, throughput is bound to become an issue in the near future. This is precisely why two new approaches are scheduled for roll-out in 2019:
- JPEG XS: New low-complexity codec for professional video production (“XS” stands for extra speed and extra small). Enables interoperability and allows for easy and cost-effective integration into an IP-based infrastructure. Offers visually lossless quality with compression ratios up to 6:1 and supports resolutions up to 8K for frame rates between 24 and 120fps. Note that despite the JPEG moniker, this codec is neither part of nor compatible with the more common JPEG 2000 codec.
- HT-J2K: A block coding algorithm that can be used in place of the existing algorithm specified in ISO/IEC 15444-1 (JPEG 2000 Part 1). “HT” stands for high throughput. The objective is a ten-fold increase in throughput, while allowing mathematically lossless transcoding to/from existing code streams and minimizing changes to code stream syntax and features. Over 30x 1080p60 video streams per GPU, 32 lines end-to-end latency, all the benefits of J2K. Will become Part 15 of the JPEG 2000 family of specifications. HT-J2K is an open-standard, royalty-free solution.
The table below lists a number of criteria to consider for the selection of a compression codec for your outfit or institution. Be aware that this paper is aimed at the broadcast sector and therefore chiefly considers contribution, remote production as well as offline video editing and production applications—not distribution, for which other codecs have become de-facto standards to ensure compatibility with end users.
While all approaches under consideration are perfectly fit for use in a demanding professional environment, there is no clear winner. The eventual choice will most likely be between JPEG XS and HT-J2K. And there, users will have to decide whether JPEG XS’s slightly less power-hungry approach is worth the (likely) royalty-based usage model and the certainty of being locked into a proprietary solution whose longevity ultimately depends on the developers’ business success and continuous efforts.
One thing is for sure: the advent of “XS” and “HT” does not put the trade-off rule to rest. It does, however, shift the constraints to a different level. Accepting HT-J2K’s slight reduction of coding efficiency at a time when its latency stands at a mere 2ms, while computing power has long ceased to be at a premium, looks like a price worth paying: backward compatibility, an open-standard technology and a royalty-free model are simply hard to top.
Erling Hedkvist, SVP, Lawo