Standards: Part 14 - About High Efficiency Video Coding (HEVC)

Here we look at the HEVC codec which is based on earlier work by MPEG on AVC and prior coding technologies. New techniques are employed to reduce the coded output size even further.

This article is part of our growing series on Broadcast Standards.
The first 26 articles are now available in Broadcast Standards – The Book.

Video applications now demand much higher resolutions than H.264 was designed for, so a new codec was eventually needed. Early trials revealed that H.264 could encode 8K video better than expected. The higher resolution allowed many more redundant macro-blocks to be eliminated.

AVC was never optimized for 8K content and it was recognized that new ideas could halve the resulting output file size and improve performance. HEVC was designed to achieve compression ratios of 1000:1 and supports resolutions up to 16K.

Relevant ISO Standards

The HEVC standards are covered by MPEG-H which is published as ISO 23008. Only 5 parts of that standard are directly relevant to HEVC. The rest describe MPEG Media Transport (MMT) and 3D Audio. ISO 23002 part 7 is also relevant.

Standard	Edition	Description
ISO 23008-2	2023	High efficiency video coding (HEVC).
ISO 23008-5	2017	Reference software for HEVC.
ISO 23008-8	2018	Conformance specification for HEVC.
ISO 23008-14	2018	Guidance on conversion and coding of High Dynamic Range (HDR) imaging.
ISO 23008-15	2018	Signaling and display adaption for High Dynamic Range (HDR) imaging.
ISO 23002-7	2024	Versatile supplemental enhancement information messages for coded video bitstreams.

About ISO 23008 Part 2

The MPEG-H Part 2 standard is just as large and complex as the preceding AVC standard. The core of HEVC has extensions for Scalable, Multiview and Stereoscopic 3D viewing support.

The document structure is similar to AVC and the same section numbering scheme has been retained. Having studied the AVC standard, much of this content will be familiar. There are important and subtle differences to take note of throughout.

The Network Abstraction Layer packets (NAL units) described in Section 7 have a slightly different format. The payload carries Profile, Tier and Level parameters which are more logically defined than they were for AVC. These are described in the annexes as before.
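
To illustrate the different format, here is a minimal sketch of unpacking an HEVC NAL unit header. It assumes the two-byte layout commonly documented for HEVC (1 forbidden bit, 6 bits of unit type, 6 bits of layer ID, 3 bits of temporal ID plus one), compared with the one-byte header in AVC. The function and class names are hypothetical and the layout should be checked against Section 7 of the standard before relying on it.

```python
from typing import NamedTuple


class HevcNalHeader(NamedTuple):
    forbidden_zero_bit: int   # must be 0 in a valid stream
    nal_unit_type: int        # 6 bits, e.g. 32 = VPS, 33 = SPS, 34 = PPS
    nuh_layer_id: int         # 6 bits, 0 for the base layer
    temporal_id: int          # nuh_temporal_id_plus1 minus 1


def parse_hevc_nal_header(data: bytes) -> HevcNalHeader:
    """Unpack the two-byte HEVC NAL unit header (start code already removed)."""
    if len(data) < 2:
        raise ValueError("an HEVC NAL unit header is two bytes long")
    first, second = data[0], data[1]
    temporal_id_plus1 = second & 0x07
    if temporal_id_plus1 == 0:
        raise ValueError("nuh_temporal_id_plus1 must not be zero")
    return HevcNalHeader(
        forbidden_zero_bit=first >> 7,
        nal_unit_type=(first >> 1) & 0x3F,
        nuh_layer_id=((first & 0x01) << 5) | (second >> 3),
        temporal_id=temporal_id_plus1 - 1,
    )


# The bytes 0x42 0x01 are the usual header of a sequence parameter set,
# which is one of the units that carries Profile, Tier and Level values.
header = parse_hevc_nal_header(bytes([0x42, 0x01]))
print(header.nal_unit_type)  # 33
```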

The latest edition was published in 2023. A new amendment (in progress) adds more profile specifications and some new SEI messages.

HEVC vs AVC Performance Factors

HEVC differs in performance and processing workload compared with AVC encoders:

Frame rates - HEVC extends the maximum to 300 fps. This is significantly higher than the nominal 60 fps supported by AVC. Note that level 6.2 in AVC is rated at 120 fps as an exception.
Image quality - Performance evaluations verify that the image quality of HEVC is superior to AVC at any given bitrate. Noise levels, color spaces and dynamic range are all improved.
Image sizes - The operating range is optimal when dealing with 8K video. The latest revisions have increased this to 16K at extremely high bitrates.
Size of coded output - HEVC was targeted to halve the bitrate of AVC coded output. This is comfortably achieved with SD content and exceeded with higher definitions.
Encoding speed - HEVC needs to do significantly more work than AVC to reduce the output size.
Decoding performance - HEVC delivers content that is easier to decode. The increased coding workload is offset by improvements in the receiving client-player performance.
CPU utilization - Increasing the coding workload requires more CPU capacity. HEVC can be parallelized more efficiently and makes better use of multiple CPU cores to spread the workload.
Interlacing - There is no support for interlaced video in HEVC. Interlaced content can be managed by coding individual fields as separate images and passing SEI messages to the player.

HEVC Advancements

These are some of the improvements that HEVC offers when compared with AVC coding:

Video coding layer - Better de-blocking and edge preserving filters.
Color spaces - More alternatives supported.
Sample sizes - Allow greater ranges of colors.
Coding Tree Units - Can be sub-divided more flexibly than macro-blocks.
Lossless - HEVC supports truly lossless coding which improves on the region constrained partial lossless coding in AVC.
Still picture support - This facilitates HEIC image compression. This is useful in situations where the video is displaying a stationary image.
Screen content - Coding support for text and graphics images. This may provide better compression for traditional 2D animation as well.
SEI - Supplemental Enhancement Information supports the delivery of additional metadata signals.

Improved Video Coding Layer

The Video Coding Layer works conventionally by splitting the first picture into blocks. Intra-prediction is used to eliminate redundant identical or similar blocks within that first picture. Subsequent pictures are assimilated, and Inter-prediction looks for redundancy by comparing new pixel rectangles with the previously processed blocks.

Note the similarity in terminology. Intra and Inter are different in scope. Intra works within a single frame. Inter works across several frames.

The loop filters are applied after the block prediction phase is completed. There are two loop filters in HEVC:

DBF - The DeBlocking Filter reduces artefacts at the block boundaries. This is simpler than the equivalent filter in AVC. It is designed to be easily parallelized across multiple processors as a result.
SAO - The Sample Adaptive Offset reduces sample distortions. This helps process hard edges and reduces ringing artefacts. It also helps reduce contouring artefacts.

Coding Tree Units (CTU)

AVC divides pictures into macro-blocks. These are based on 16 x 16 pixels but can be sub-divided into smaller (4 x 4) rectangles.

HEVC uses Coding Tree Units up to 64 x 64 pixels in size to improve redundancy detection. Internally, they implement a Quadtree data structure to support variable block sizes. A Quadtree is a nested tree structure in which every node is either a leaf or has exactly four child nodes. Each split divides a square block into four equal quadrants. The Quadtree can be recursively sub-divided and nested until the block size reduces to 4 x 4 pixels. Each branch can support a further level if that helps resolve fine detail. Large areas of similar pixels do not require deeply nested trees.

Here is an illustration showing progressively deeper Quadtree nesting. The third level is only omitted to reduce the number of examples.

The source image may not be an exact integer multiple of the Coding Tree Unit (64 x 64) pixel size. Because the origin of the image is in the top left corner, the extreme right and bottom edges of the image may not result in completely filled blocks. The encoder can work around this by partially filling a square block. The decoder can crop the additional empty pixels when the content is unpacked and rendered into the display rectangle.
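
The number of Coding Tree Units and the amount of padding can be worked out with ceiling division. This is a minimal sketch that assumes a 64 x 64 unit size and follows the simplified padding description above. The function name is hypothetical.

```python
from typing import Tuple


def ctu_grid(width: int, height: int, ctu_size: int = 64) -> Tuple[int, int, int, int]:
    """Return (columns, rows, extra_columns_of_pixels, extra_rows_of_pixels)."""
    columns = -(-width // ctu_size)    # ceiling division
    rows = -(-height // ctu_size)
    pad_right = columns * ctu_size - width
    pad_bottom = rows * ctu_size - height
    return columns, rows, pad_right, pad_bottom


print(ctu_grid(1920, 1080))  # (30, 17, 0, 8)
print(ctu_grid(3840, 2160))  # (60, 34, 0, 16)
```

A 1920 x 1080 picture is exactly 30 units wide, but 1080 is not a multiple of 64. The bottom row of units is only partially filled and the decoder crops the 8 surplus pixel rows.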

HEVC can group multiple CTU blocks into tiles and slices if this helps to find redundant duplicated parts of the image.

Slices can be used in various Intra and Inter modes or clone other slices from the same or other images and compute the residual deltas from them.

Tiles can be decoded independently of the rest of the picture.
I-slices behave like intra frames in AVC.
P-slices behave like predictive frames.
B-slices behave like bi-predictive frames.

Intra prediction is improved by having 33 directional (angular) modes, alongside planar and DC modes. This is significantly better than the 8 directional modes implemented in AVC. Better support for the DC coefficient in the Discrete Cosine Transform (DCT) also helps reduce the output size.

More Diverse Color Spaces

Many alternative color spaces are supported in HEVC. These are optimized for various content sources and other standards:

NTSC video
PAL video
Generic film stock
Rec 601
Rec 709
Rec 2020
Rec 2100
SMPTE 170M
SMPTE 240M
sRGB
sYCC
xvYCC
XYZ
RGB
YCbCr
YCoCg
Externally defined color spaces

Pixel Sample Sizes

HEVC supports more pixel sample sizes. The 8-bit sampling is similar to AVC, but HEVC adds 10, 12, 14 and 16-bit alternatives depending on the profile selected. This will yield more vibrant color rendition and facilitates HDR support. Monochrome images are supported natively rather than desaturating color pictures to simulate them. This improves bitrate and coding efficiency because the redundant chroma samples are never coded.
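
The effect of sample size and chroma format on the uncompressed picture can be shown with some simple arithmetic. This sketch assumes the usual average sample counts per pixel for each chroma format. The names are hypothetical and the figures describe raw input, not coded output.

```python
# Average number of samples per pixel for each chroma sampling format.
SAMPLES_PER_PIXEL = {"4:0:0": 1.0, "4:2:0": 1.5, "4:2:2": 2.0, "4:4:4": 3.0}


def code_values(bit_depth: int) -> int:
    """Number of distinct values one sample can take."""
    return 2 ** bit_depth


def raw_bits_per_pixel(chroma_format: str, bit_depth: int) -> float:
    """Uncompressed bits needed per pixel before any coding is applied."""
    return SAMPLES_PER_PIXEL[chroma_format] * bit_depth


for bit_depth in (8, 10, 12, 14, 16):
    print(bit_depth,
          code_values(bit_depth),
          raw_bits_per_pixel("4:2:0", bit_depth),
          raw_bits_per_pixel("4:0:0", bit_depth))
```

Moving from 8 to 10 bits raises the number of values per sample from 256 to 1024. A native monochrome (4:0:0) picture carries one third fewer raw samples than the same picture in 4:2:0, because no chroma planes are present.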

Supplemental Enhancement Information (SEI)

Metadata is passed from the encoder to the receiving client-player in these SEI messages. The player uses them to apply post-processing to the decoded images.

These are a few of the SEI messages that apply to HEVC:

Remapping color spaces from one to another.
Hints for defining transfer functions to convert from SDR (Standard Dynamic Range) to HDR (High Dynamic Range) renditions.
Support for Hybrid Log Gamma (HLG) for royalty-free HDR applications.
Describe the color primaries, white-point and maximum-minimum luminance to define the mastering display color volume.
Time-code values relating to the content for archival purposes.
Details of the ambient lighting environment where the video was authored.
Support for 3D displays.

ISO standard 23002 part 7 (2024) describes Supplemental Enhancement Information (SEI) messages and Video Usability Information (VUI) parameters. It is cross-referenced from video coding standards to avoid repetition therein. This standard is particularly relevant to the Versatile Video Coding (VVC) video format.

Profiles

Profile selection determines how the encoder operates and selects a sub-set of the available coding tools. This affects the coding efficiency and size of the output bitstream. Profile and level signaling is less complex than H.264.

HEVC profiles are grouped in a similar way to the AVC profiles but can be gathered into different categories. The core profiles are extended in different ways according to the support you need:

Category	Description
Core	The foundation set of HEVC profiles.
Rext	Format Range Extension profiles. These add different bit depths, monochrome versions and Intra coding formats.
High	High throughput coding formats. Intended for situations where a very high bitrate is needed for professional content processing.
SCC	Screen Content Coding Extensions to support imaging of text and graphics content.
SHVC	Scalable Video Coding enhancements to the core HEVC coding specification described in Annex H of the standard.
MV-HEVC	Stereoscopic and Multiview support described in Annex G of the standard.
3D-HEVC	3D imaging support described in Annex I of the standard.

The HEVC profile names are more descriptive than AVC profiles. Color sampling and bit depths are both clearly indicated.

Color sample sizes range from 8 to 16 bits depending on the profile. Some profile names include an integer value to indicate the number of color-sampling bits. Assume that the sample size is 8 bits unless it is specified otherwise.

Chroma sampling formats are also implied by the profile name. Assume the default 4:2:0 format unless the profile describes a monochrome picture in which case use 4:0:0. Profiles operating in 4:2:2 and 4:4:4 mode will have the appropriate description embedded in their name. Profiles using 4:4:4 chroma sampling can also support delivery of 4:0:0, 4:2:0 and 4:2:2 content.

Category	Edition	Profile Name
Core	1	Main
Core	1	Main 10
Core	1	Main Still Picture
High	2	High Throughput 4:4:4 16 Intra
Rext	2	Main 10 Intra
Rext	2	Main 12
Rext	2	Main 12 Intra
Rext	2	Main 4:2:2 10
Rext	2	Main 4:2:2 10 Intra
Rext	2	Main 4:2:2 12
Rext	2	Main 4:2:2 12 Intra
Rext	2	Main 4:4:4
Rext	2	Main 4:4:4 10
Rext	2	Main 4:4:4 10 Intra
Rext	2	Main 4:4:4 12
Rext	2	Main 4:4:4 12 Intra
Rext	2	Main 4:4:4 16 Intra
Rext	2	Main 4:4:4 16 Still Picture
Rext	2	Main 4:4:4 Intra
Rext	2	Main 4:4:4 Still Picture
Rext	2	Main Intra
Rext	2	Monochrome
Rext	2	Monochrome 12
Rext	2	Monochrome 12 Intra
Rext	2	Monochrome 16
Rext	2	Monochrome 16 Intra
MV-HEVC	2	Multiview Main
SHVC	2	Scalable Main
SHVC	2	Scalable Main 10
3D-HEVC	3	3D Main
High	4	High Throughput 4:4:4
High	4	High Throughput 4:4:4 10
High	4	High Throughput 4:4:4 14
SHVC	4	Scalable Main 4:4:4
SHVC	4	Scalable Monochrome
SHVC	4	Scalable Monochrome 12
SHVC	4	Scalable Monochrome 16
High/SCC	4	Screen-Extended High Throughput 4:4:4
High/SCC	4	Screen-Extended High Throughput 4:4:4 10
High/SCC	4	Screen-Extended High Throughput 4:4:4 14
SCC	4	Screen-Extended Main
SCC	4	Screen-Extended Main 10
SCC	4	Screen-Extended Main 4:4:4
SCC	4	Screen-Extended Main 4:4:4 10
Core	5	Main 10 Still Picture
Rext	5	Monochrome 10
MV-HEVC	9-Amd	Multiview Main 10
MV-HEVC	9-Amd	Multiview Monochrome
MV-HEVC	9-Amd	Multiview Monochrome 10
MV-HEVC	9-Amd	Multiview Monochrome 12

Note that profiles can simultaneously belong to the High Throughput and Screen Content Coding Extension categories.
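
The naming rules described above can be applied mechanically. This is a minimal sketch that reads the sample size and chroma format out of a profile name using only those rules (8 bits and 4:2:0 unless the name says otherwise). The function name is hypothetical, and the result is a reading of the name rather than a substitute for the profile definitions in the standard.

```python
from typing import Tuple


def describe_profile(name: str) -> Tuple[int, str]:
    """Return (bit_depth, chroma_format) implied by an HEVC profile name."""
    tokens = name.split()
    # A standalone integer in the name is the sample size; default to 8 bits.
    bit_depth = next((int(token) for token in tokens if token.isdigit()), 8)
    if "Monochrome" in tokens:
        chroma_format = "4:0:0"
    else:
        # An explicit 4:2:2 or 4:4:4 wins; otherwise assume 4:2:0.
        chroma_format = next(
            (token for token in tokens if token in ("4:2:2", "4:4:4")), "4:2:0")
    return bit_depth, chroma_format


for profile in ("Main", "Main 10", "Main 4:2:2 12 Intra",
                "Scalable Monochrome 16", "Screen-Extended Main 4:4:4 10"):
    print(profile, describe_profile(profile))
```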

Tiers

HEVC introduces two Tiers of operation which are related to the Profiles and the corresponding Levels:

Main tier - Designed for most applications, this tier is available across all levels but constrains the maximum bitrate to a much lower value than the High tier. Optimized for SD picture sizes and smaller. It is suitable for most consumer applications.
High tier - Cannot be used for levels 1 to 3. This is designed for HD and higher picture sizes with levels 4 to 7 and where the application demands better performance.

Levels

Levels describe the picture sizes in the receiving client-player and are similar to AVC levels. An additional level (7) increases the scope to support 16K images. The level signaling is much simpler in HEVC than it was in AVC.

These are the main level groups and picture sizes:

Level Grouping	Description
1	Small pictures for older mobile devices.
2	Quarter SD frame size or low frame-rate SD.
3	SD and some 1280 HD formats.
4	2K.
5	4K.
6	8K.
7	16K.

The encoder trades off frame-rates and picture sizes to remain within a given bitrate defined by the level. This affects the decoding speed and picture buffering limits. Maximum values for picture size and frame rates for the different levels are listed here as examples. Consult the standard for more detail:

Level	Picture Size	FPS
1	176 × 144	15.0
2	352 × 288	30.0
2.1	640 × 360	30.0
3	960 × 540	30.0
3.1	1280 × 720	33.7
4	2048 × 1080	30.0
4.1	2048 × 1080	60.0
5	4096 × 2160	30.0
5.1	4096 × 2160	60.0
5.2	4096 × 2160	120.0
6	8192 × 4320	30.0
6.1	8192 × 4320	60.0
6.2	8192 × 4320	120.0
6.3	12288 × 6480	60.0
7	16384 × 8640	34.0
7.1	16384 × 8640	60.4
7.2	16384 × 8640	120.8

Levels 6.3 to 7.2 were defined in a later edition of the standard. They introduce the 12K and 16K sizes for use with future display technologies beyond the current 8K formats.
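
The trade-off between picture size and frame rate can be sketched by treating each row of the example table as a pixel budget per picture and per second. This is an approximation built only from the figures listed above. The standard itself defines the limits in terms of luma picture size and luma sample rate with further constraints, so the function below (a hypothetical name) is a rough guide rather than a conformance check.

```python
from typing import List, Optional, Tuple

# (level, max width, max height, frames per second) copied from the table above.
LEVEL_LIMITS: List[Tuple[str, int, int, float]] = [
    ("1", 176, 144, 15.0), ("2", 352, 288, 30.0), ("2.1", 640, 360, 30.0),
    ("3", 960, 540, 30.0), ("3.1", 1280, 720, 33.7),
    ("4", 2048, 1080, 30.0), ("4.1", 2048, 1080, 60.0),
    ("5", 4096, 2160, 30.0), ("5.1", 4096, 2160, 60.0), ("5.2", 4096, 2160, 120.0),
    ("6", 8192, 4320, 30.0), ("6.1", 8192, 4320, 60.0), ("6.2", 8192, 4320, 120.0),
    ("6.3", 12288, 6480, 60.0),
    ("7", 16384, 8640, 34.0), ("7.1", 16384, 8640, 60.4), ("7.2", 16384, 8640, 120.8),
]


def minimum_level(width: int, height: int, fps: float) -> Optional[str]:
    """Return the lowest level whose pixel budgets cover the requested format."""
    pixels = width * height
    for level, max_width, max_height, max_fps in LEVEL_LIMITS:
        max_pixels = max_width * max_height
        fits_picture = pixels <= max_pixels
        fits_rate = pixels * fps <= max_pixels * max_fps
        if fits_picture and fits_rate:
            return level
    return None  # beyond the range of the table


print(minimum_level(1920, 1080, 30))   # 4
print(minimum_level(1920, 1080, 60))   # 4.1
print(minimum_level(1280, 720, 60))    # 4
print(minimum_level(3840, 2160, 60))   # 5.1
```

Note how 1280 × 720 at 60 fps needs level 4 even though the picture fits within level 3.1. The smaller picture leaves room in the level 4 budget for the higher frame rate.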

Container Files

Most of the containers that were compatible with AVC can be used for storing HEVC content, with a few exceptions. They are summarized here:

Container Type	File ext	AVC	HEVC
Material Exchange Format	mxf	Yes	Yes
MPEG Program Stream	mpg, mpeg	Yes	Yes
MPEG Transport Stream	ts	Yes	Yes
Third Generation Partnership Project (3GPP)	3gp	Yes	Yes
Matroška file format	mkv	Yes	Yes
MPEG-4 Part 14	mp4	Yes	Yes
QuickTime File Format (QTFF)	mov	Yes	Yes
Advanced Systems Format (ASF)	asf	Yes	Yes
Audio Video Interleave	avi	Yes	Yes
MPEG-2 Transport Stream used on Blu-ray discs	m2ts	Yes	No
Enhanced Video Object files for HD DVD discs	evo	Yes	No
Flash MP4 video file based on ISOBMFF	f4v	Yes	No

Market Penetration

AVC is a widely used format and is very popular now. HEVC is a format for the future. As technology platforms advance, HEVC might achieve the same market penetration eventually. It must offer sufficient advantages to offset the additional cost of coding systems and patent licensing fees to succeed.

The latest versions of all the major web browsers support HEVC playback in the HTML5 <video> tag. This facilitates the deployment of web-based video players.

The MV-HEVC multiview support has been adopted by the Apple Vision Pro headset. That significantly enhances the reputation of HEVC.

Patent Issues

Patent licensing did significant damage to the prospects of MPEG-4 interactive content 20 years ago. History is repeating itself with HEVC which is struggling to gain traction, mainly due to the costs of patent licenses.

AVC has a single patent licensing pool but there are four with HEVC and some patent holders are going it alone as well. Patent licensing fees for HEVC are significantly higher than they were for AVC.

In the fullness of time, all patents will naturally expire. MPEG-2 is essentially patent free other than in Malaysia. New codec designs can use that technology to create royalty free alternatives. MPEG-4 Part 2 is also now patent free. AVC is expected to be patent free by 2030. Since HEVC is partly based on earlier codecs, it is likely that some of those expiring patents are relevant. It remains to be seen whether that affects the patent licensing fees.

Large tech companies, where the bulk of any potential patent revenues might have come from, established the Alliance for Open Media which has developed a royalty free alternative to HEVC. The AV1 codec is open-source and freely available to anyone to use. It is significant that Apple has built AV1 decoding tools into its proprietary M3 Apple Silicon CPU chip designs.

Conclusion

Caveats regarding profile and level compatibility apply to HEVC as they did to AVC. The encoder and client-player must both support the same configuration.

HEVC has gained considerable credibility by being adopted by Apple for use in the Vision Pro VR headset. This can only be good for the future of HEVC.

We may still need yet another cycle of codec innovation. HEVC is good for 8K and is scoped to support 16K at the upper end of its designed performance range. Research work and prototypes are already in hand for 32K video systems. We thought that 8K might be too extreme for consumers to have at home, but displays are now affordable. Will 16K and 32K be feasible? If they are, we need even better compression technologies. There are no displays of that size yet, but there are several prototype camera designs in the pipeline.
