Complexity Barrier Drives Simpler Video Compression

Video encoding is running up against a complexity barrier that is raising costs and reducing scope for further improvements in quality.

It comes at a time of rising expectations for video quality, which are straining the traditional pattern of codec evolution in which encoding efficiency roughly doubles every 10 years. Partly as a result, the codec world has fragmented, with competing efforts to accelerate that rate of progress, but these mostly rely on ever greater computational complexity. According to leading codec experts such as David Ronca, director of video encoding at Facebook, the complexity required is now growing faster than Moore’s Law, which holds roughly that the density of components on integrated circuits doubles every two years.

Ronca, who was head of video encoding at Netflix throughout its great expansion from 2007 until moving to Facebook in mid-2019, is among those advocating lower-complexity approaches to encoding. One such approach combines a more basic version of an existing codec with a software enhancement layer that brings quality back up to the required level. Called Low Complexity Enhancement Video Coding (LCEVC), it has gained traction over the last year because it offers an escape from the runaway train of escalating complexity while remaining compatible with existing codecs, as well as with new ones under development.

LCEVC has also been shown in tests to be more resistant to the varying network conditions that can blight streaming services with artefacts such as blockiness, breaks and ghost images. This enables more consistent quality even when overall parameters such as resolution, frame rate and color depth stay the same. Indeed, LCEVC was designed precisely to correct artefacts arising from the base encoding by applying additional detail through video transformations in small units of just 2x2 or 4x4 pixels.
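To make that concrete, the sketch below illustrates the general two-layer idea in Python: a frame is downscaled and handed to the base codec, the decoded base is upsampled again, and the detail lost along the way is carried as residuals split into small blocks. This is only an illustration of the concept, not the LCEVC specification; the function names and the simple averaging and nearest-neighbour filters are assumptions chosen for brevity.

```python
import numpy as np

def downscale_2x(frame: np.ndarray) -> np.ndarray:
    """Halve resolution by averaging 2x2 neighbourhoods (illustrative filter)."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale_2x(frame: np.ndarray) -> np.ndarray:
    """Nearest-neighbour upsample back to the original resolution."""
    return frame.repeat(2, axis=0).repeat(2, axis=1)

def split_blocks(residual: np.ndarray, n: int = 4):
    """Tile the residual plane into n x n blocks (2x2 or 4x4 in LCEVC)."""
    h, w = residual.shape
    return [residual[r:r + n, c:c + n]
            for r in range(0, h, n) for c in range(0, w, n)]

# Toy 8x8 luma plane standing in for a frame.
frame = np.arange(64, dtype=np.float64).reshape(8, 8)

base = downscale_2x(frame)              # what the base codec would encode
reconstructed = upscale_2x(base)        # what the decoder gets after upsampling
residual = frame - reconstructed        # detail the enhancement layer must carry
blocks = split_blocks(residual, n=4)    # coded independently in small units

print(f"{len(blocks)} residual blocks of 4x4 samples")
```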

Then, from the performance perspective, a key point is that LCEVC was designed for parallel processing to reconstruct the target resolution, and it can do this on standard components, encoding video faster without help from specialized hardware. In this way it improves the detail and sharpness of any base video codec, such as AVC, HEVC, AV1, EVC or, for that matter, the forthcoming VVC (Versatile Video Coding), being standardized as MPEG-I Part 3.
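Because each of those small residual blocks can be reconstructed independently of its neighbours, the work maps naturally onto the many cores and SIMD lanes of ordinary CPUs and GPUs. The snippet below uses Python's standard thread pool purely to illustrate that independence; it is not how a production LCEVC encoder or decoder is written.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def reconstruct_block(pair):
    """Add a decoded residual block back onto its upsampled base block."""
    base_block, residual_block = pair
    return base_block + residual_block

# Toy data: 16 pairs of 4x4 base/residual blocks.
rng = np.random.default_rng(0)
pairs = [(rng.random((4, 4)), rng.random((4, 4))) for _ in range(16)]

# Each block is independent of the others, so they can all be processed in parallel.
with ThreadPoolExecutor() as pool:
    enhanced = list(pool.map(reconstruct_block, pairs))

print(f"Reconstructed {len(enhanced)} blocks in parallel")
```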

LCEVC itself has been enshrined as MPEG-5 Part 2, but it has proprietary roots with London-based encoding technology developer V-Nova and its Perseus codec, launched in April 2015. V-Nova began with the ambition of challenging existing codecs such as H.264 and HEVC with a radically different approach. But the company soon realized few broadcasters were willing to adopt a non-mainstream codec for distribution of primary services, even if they might do so for video contribution, where Perseus did gain some customers such as Sky Italia. On the distribution front, then, Perseus evolved into a tool to supercharge other mainstream codecs, as a complement rather than an alternative. In this guise it became LCEVC.

David Ronca, director of video encoding at Facebook

MPEG had already identified design goals for this lower-complexity encoding, which essentially are to enhance an existing codec so that it performs as well as the next generation of that technology, but without any increase in computational complexity. This may sound impossible, but it has been more or less achieved, partly by taking advantage of the capability modern processors have for parallel computation.

MPEG opted for the Perseus technology after being convinced it would boost the compression efficiency of leading codecs. V-Nova has presented data indicating that using LCEVC in conjunction with the venerable AVC codec reduced the bit rate required for a given video quality by around 45% compared with AVC on its own. The saving for HEVC, representing the current generation, was 34%, and for the forthcoming VVC 17%, perhaps reflecting the efficiency gains the latter already represents.
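Read as back-of-the-envelope arithmetic, those percentages translate into bit rates as follows; the 8 Mbps baseline below is an arbitrary example value, not a figure from V-Nova's data.

```python
baseline_mbps = 8.0  # arbitrary example bit rate for the base codec alone
reported_savings = {"AVC": 0.45, "HEVC": 0.34, "VVC": 0.17}

for codec, saving in reported_savings.items():
    enhanced = baseline_mbps * (1 - saving)
    print(f"{codec} + LCEVC: roughly {enhanced:.1f} Mbps for the same quality "
          f"as {codec} alone at {baseline_mbps:.1f} Mbps")
```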

When it comes to computational complexity, however, the savings achieved by LCEVC are far more impressive, and in this case they are greater for the more advanced codecs. That, after all, was the primary design goal, for as Ronca has pointed out, “the longer-term answer to video encoding cannot be to simply add more CPU capacity. This is an unsustainable model, both financially and environmentally. Codec research must emphasize both compression efficiency and computational efficiency.”

So in the tests reported by V-Nova, LCEVC in conjunction with AVC reduced encoding time by a factor of 2.4, to about 41% of the time required by AVC alone. For HEVC the reduction factor was 2.7 and for VVC 4.8. This will make a major impact on the degree of compression and quality that can be achieved with mobile-based codecs used for generating video in the field. In fact, according to MPEG, the main targeted use cases are those requiring live encoding and decoding together with maximum device compatibility and high-quality video. At the same time, the technology must be compatible with existing ecosystems without requiring upgrades or changes to existing hardware components.
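The two ways of quoting the encoding-time result are the same ratio expressed differently: a 2.4-times reduction means the combined encode takes about 1/2.4, or roughly 41%, of the original time. The quick conversion below applies the same reading to the other reported factors.

```python
for codec, factor in [("AVC", 2.4), ("HEVC", 2.7), ("VVC", 4.8)]:
    share = 100 / factor  # percentage of the base codec's encoding time
    print(f"{codec} + LCEVC encode time: roughly {share:.1f}% of {codec} alone")
```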

Specific applications include live TV, multimedia streaming of sports, eSports and news under constrained OTT bandwidth, live social-network mobile video, and live UHD broadcast at viable digital terrestrial bandwidth, along with the ability to upgrade from SD to HD, or HD to UHD, without needing to replace set-top boxes.

Of course, LCEVC is not a panacea, and it will face challenges, notably the one that has afflicted video compression from the beginning: the cost of royalties to the relevant IP (Intellectual Property) holders. It also postpones rather than eliminates the need for a radically different approach to video compression to meet future requirements, driven perhaps by extended reality and holographic projection, and almost certainly involving machine learning to match human perception better.
