HEVC—Decoded

High Efficiency Video Coding (HEVC) is the latest compression standard, and certainly an efficient one, targeted at today’s high pixel count images.

High Efficiency Video Coding (HEVC) is the latest, and possibly the last, in a series of ISO compression standards that began with MPEG-1 and advanced through MPEG-2 and MPEG-4. There was going to be an MPEG-3, but in the end it was incorporated into MPEG-4. In each step the coding efficiency got better and there was an increase in the size of the largest picture that could be handled, in terms of pixel count. The two tend to go together because increasing the pixel count of the source image increases the data rate into a codec, requiring improvements in coding efficiency, also called coding gain, to keep the transmitted or recorded bit rate within practical or economic limits.
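
To put rough numbers on that relationship, here is a minimal sketch of the uncompressed bit rate entering a codec. The function name `raw_bit_rate`, the 50Hz frame rate and the choice of 10-bit 4:2:2 sampling are assumptions made for illustration, not figures taken from the standard.

```python
def raw_bit_rate(width, height, fps, bits_per_sample=10, samples_per_pixel=2.0):
    """Uncompressed video bit rate in bits per second.

    samples_per_pixel is an illustrative assumption:
    3.0 for 4:4:4, 2.0 for 4:2:2, 1.5 for 4:2:0 sampling.
    """
    return width * height * fps * bits_per_sample * samples_per_pixel


# Hypothetical comparison at 50 frames per second.
formats = {
    "HD 1920x1080": (1920, 1080),
    "UHD 3840x2160": (3840, 2160),
    "8K 7680x4320": (7680, 4320),
}

for name, (w, h) in formats.items():
    rate = raw_bit_rate(w, h, fps=50)
    print(f"{name}: {rate / 1e9:.2f} Gbit/s")
```

Under these assumptions the source rate is roughly 2, 8 and 33 Gbit/s respectively. Each doubling of the picture width and height quadruples the data the codec has to deal with, which is why coding gain has to rise along with picture size.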

In that respect HEVC is no different: it supports picture sizes up to 8K pixels across and is claimed to double coding efficiency compared to MPEG-4. From a television standpoint, NTSC was the coding technique that made colour television possible, MPEG-2 made digital television possible, MPEG-4 made HDTV possible and now HEVC is the enabling technology for UHDTV, although it is not limited to that. A large number of colour spaces are supported, including that of Rec.2020, as well as 10-bit word length, allowing increased dynamic range to be transmitted.

HEVC can also be applied to earlier, smaller picture sizes, such as HD. Where an existing storage or transmission infrastructure already determines the bit rate, HEVC could be used to improve the picture quality instead. Where an existing codec is used with mild compression in a workstation-based production system, switching to HEVC at the same bit rate would give an immediate reduction in concatenation problems.

One welcome change is that HEVC does not include special coding modes to support interlace. Now that interlace has finally been laid to rest, and not before time, it is not necessary. One wonders how long it will be before 59.94Hz gets the Forest Lawn treatment.

Compression technology differences

Video compression is the ultimate paradox where the totally subjective human visual impression meets the immovable bedrock of information theory. It is as well to know which of those areas is being discussed to avoid coming unstuck. One of the fundamentals of information theory, which goes all the way back to Claude Shannon, is that for source data of given characteristics, the coding gain can only increase if the complexity of the codec increases, if the delay increases, or both. In codecs, the delay can be adjusted in use by changing parameters such as the Group size, but complexity increases disproportionately to coding gain.

The HEVC encoder is an evolution of MPEG techniques, but has added much complexity. (Image courtesy Altera.)

HEVC inevitably includes an increase in complexity. For a given image size, it increases processing complexity by a factor of about 20. Allowing for the increased picture size supported, this means HEVC coding needs a lot of processing speed, whether in software or hardware. For that reason it is designed to allow parallel processing to be used, such that different engines can work independently on different parts of the picture in order to spread the load. The advance of microelectronics according to Moore’s law continues for the present, and this allows increased complexity and greater quantities of memory without proportionate cost increases. For portable or hand-held devices, the complexity of HEVC must be realised within battery life constraints, helped by the limited screen size.

Nevertheless, with HEVC we are definitely in the diminishing returns region of video coding. If one realistically tries to estimate what complexity increase would be needed to give a tangible improvement over HEVC, there is a genuine possibility that it might break Moore’s Law, even if a need for larger pictures can be established.

Another interesting and fundamental characteristic of image coding is that images only contain so much information. This means that going up to 4K or 8K image sizes changes the characteristics of the input data so that we are not comparing like with like. It’s a complex subject involving buzz words like entropy that I will return to in a future piece, but basically as pixel count goes up on real TV pictures, compression gets easier. Unlike Vegas, a town built for people who don’t understand statistics, HEVC was designed by people who do, so you get lucky most of the time.

In order to improve the performance of a codec, there is no silver bullet that suddenly makes things a lot better. When it is considered how much intellect, creativity and manpower has been focussed on image coding over the last few decades, it is inconceivable that an obvious trick has been missed. Instead a significant increase in coding gain can only be obtained by compounding a number of small improvements and refinements, which, of course, add complexity.

Intra-frame versus Inter-frame coding

Moving picture coding relies on two major mechanisms. Intra-frame, or spatial, coding works on individual frames in isolation and treats them as still pictures, looking for redundancy within the frame, such as areas that are all the same brightness, or repeating textures, using transform coding. Inter-frame, or temporal, coding works on the differences between frames at different places on the time axis, because much of the material in one frame isn’t a whole lot different from the one before or after, especially after motion compensation is used.

Example Inter-frame prediction process. In this case, there has been an illumination change between the block in the reference frame and the block which is being encoded: this difference will be the prediction error for this block.

HEVC incorporates both coding methods and has incorporated refinements to both. Clearly inter-coding cannot be used alone because if every frame depended on the one before, we could not change channel and there would be no recovery from an error. Thus the transmitted or recorded message must include Intra-coded or I-pictures where decoding can begin, or begin again after a channel change, an error or a random-access entry has been made.

Inter-coding works by prediction in the coder and decoder alike of the present frame from previously decoded frames. The prediction is then compared with the real frame to determine the prediction error, also called the residual. If the residual is transmitted fully to the decoder, it completely cancels the prediction error. It follows that coding efficiency can be increased by improving the quality of the prediction because that makes the residual smaller. Motion compensation is used to cancel out movements between frames so that they become more similar to one another. In addition to allowing the prediction unit size to change, HEVC improves the range of the motion vectors to allow for larger image size, as well as improving the quality of the predicted image using better interpolation when the image shift is not an integer number of pixels, which is usually the case.
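
The prediction, residual and reconstruction steps can be sketched as follows. This is a toy illustration, not the HEVC algorithm: it assumes a whole-pixel exhaustive search over a tiny window, whereas a real coder uses sub-pixel interpolation and much smarter searching. All the names here (`get_block`, `sad`, `motion_search`, `encode_block`, `decode_block`) and the synthetic test frames are hypothetical.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame held as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equal-sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def motion_search(ref, cur_block, top, left, size, search=2):
    """Exhaustive whole-pixel search; returns the best (dy, dx) vector."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if 0 <= t <= height - size and 0 <= l <= width - size:
                cost = sad(get_block(ref, t, l, size), cur_block)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2]


def encode_block(ref, cur, top, left, size):
    """Coder side: find a vector, predict, and form the residual."""
    cur_block = get_block(cur, top, left, size)
    dy, dx = motion_search(ref, cur_block, top, left, size)
    pred = get_block(ref, top + dy, left + dx, size)
    residual = [[c - p for c, p in zip(rc, rp)]
                for rc, rp in zip(cur_block, pred)]
    return (dy, dx), residual


def decode_block(ref, vector, residual, top, left, size):
    """Decoder side: make the same prediction, then add the residual."""
    dy, dx = vector
    pred = get_block(ref, top + dy, left + dx, size)
    return [[p + r for p, r in zip(rp, rr)]
            for rp, rr in zip(pred, residual)]


# Synthetic frames: the current frame is the reference moved one pixel
# to the right and brightened by one level.
ref = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] + 1 for x in range(8)] for y in range(8)]

vector, residual = encode_block(ref, cur, top=2, left=3, size=4)
rebuilt = decode_block(ref, vector, residual, top=2, left=3, size=4)
print(vector, residual[0], rebuilt == get_block(cur, 2, 3, 4))
```

In this example the search finds the one-pixel shift, so the only thing left in the residual is the small brightness change. Because the decoder makes the identical prediction from the same reference, adding the full residual restores the block exactly; a better prediction simply means there is less residual to send.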

Coder differences

Conceptually an HEVC coder doesn’t look that much different to an MPEG or earlier coder. The motion compensation is still there and it produces residual or prediction error data as before. By switching off the prediction, the prediction error becomes the input picture and you get an I picture. The residual or I data are still transform coded and the coefficients are quantized to raise the noise floor at spatial frequencies where it is invisible. It’s just that at each step where there used to be a choice of this or that in earlier codecs, there is now a whole bunch of possibilities needing a whole lot of processing to figure out which one is the best.
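
The transform-and-quantize step can be illustrated in one dimension. This sketch uses a textbook floating-point DCT rather than the integer transforms a real coder would use, and the step sizes and sample values are invented for the example; they are not HEVC quantization tables.

```python
import math


def dct(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(
            s * math.cos(math.pi * (i + 0.5) * k / n)
            for i, s in enumerate(samples)))
    return out


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def quantize(coeffs, steps):
    """Divide each coefficient by its step size and round to an integer."""
    return [round(c / q) for c, q in zip(coeffs, steps)]


def dequantize(levels, steps):
    """Decoder side: scale the integer levels back up."""
    return [level * q for level, q in zip(levels, steps)]


# Hypothetical row of pixel values and step sizes that get coarser
# as spatial frequency rises.
row = [10, 11, 12, 12, 13, 13, 14, 15]
steps = [4, 4, 6, 8, 12, 16, 24, 32]

levels = quantize(dct(row), steps)
decoded = idct(dequantize(levels, steps))
print(levels)
print([round(v, 1) for v in decoded])
```

With a smooth input like this one, the higher frequency coefficients quantize to zero and cost almost nothing to send. The price is that the decoded row is only an approximation of the original, which is the raised noise floor referred to above.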

Note the clarity of the right half of the image, which was encoded using HEVC coding running at 13Mbps. The left, and softer, side of the image was encoded with H.264 technology and required almost twice the bit-rate.

Possibly the greatest departure in HEVC is that the area of the picture that is used for spatial coding is variable, as is the area of the picture that can be shifted by a single motion vector, and the two areas are quite independent of one another. In MPEG-2 for example, a macroblock was the area steerable by a vector and was fixed in size at 16 pixels square and contained four coding blocks. Different combinations of motion and detail conflict with that fixed structure and HEVC allows the conflict to be resolved. Thus HEVC has coding units which vary in size from 64 x 64 pixels down to 8 x 8 pixels, and transform blocks that vary from 32 x 32 pixels down to 4 x 4 pixels. Coding units can be broken down into prediction units, the area steered by a vector, independently of the transform breakdown.
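
A variable block structure of this kind lends itself to a recursive, quadtree-style split. The sketch below divides a 64 x 64 area into coding units no smaller than 8 x 8. The decision rule is an assumption made purely for illustration: it splits wherever the pixel variance exceeds an arbitrary threshold, whereas a real encoder would weigh bit cost against distortion. The names `variance` and `split_coding_units` are hypothetical.

```python
def variance(frame, top, left, size):
    """Pixel variance of a size x size area of the frame."""
    values = [frame[y][x]
              for y in range(top, top + size)
              for x in range(left, left + size)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def split_coding_units(frame, top=0, left=0, size=64,
                       min_size=8, threshold=25.0):
    """Return a list of (top, left, size) coding units.

    An area is kept whole if it is already at the minimum size or is
    flat enough; otherwise it is split into four quadrants, each of
    which is examined in the same way.
    """
    if size <= min_size or variance(frame, top, left, size) <= threshold:
        return [(top, left, size)]
    half = size // 2
    units = []
    for dy in (0, half):
        for dx in (0, half):
            units += split_coding_units(frame, top + dy, left + dx,
                                        half, min_size, threshold)
    return units


# Synthetic picture: flat grey, with a detailed 8 x 8 checkerboard
# patch in the top left corner.
frame = [[100] * 64 for _ in range(64)]
for y in range(8):
    for x in range(8):
        frame[y][x] = 255 if (x + y) % 2 else 0

units = split_coding_units(frame)
for unit in units:
    print(unit)
```

For this test picture the flat areas stay as large units and only the corner containing detail is divided all the way down to 8 x 8, giving ten units in place of the sixteen fixed macroblocks-worth of structure an older codec would have imposed on the same area.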

Like earlier codecs, HEVC is not a one-size-fits-all system. It is broken down into Profiles, Levels and Tiers. Profiles have the same meaning as before, in that the tools needed for different purposes are specified. There are many different levels, each of which specifies a different maximum pixel count and frame rate. For each level the maximum output bit rate is determined by the tier.

More John Watkinson technical tutorials can be found on The Broadcast Bridge:

Understanding the basics of IP Networking, Part 1

Understanding the basics of IP Networking, Part 2
