Compression: Part 12 - The Evolution Of Video Compression
Having considered all of the vital elements of moving image coding, this final part looks at how those elements were combined throughout coding history.
Compression has a long history because the need has always been there. The need is simple: storage and transmission traditionally had limited data rates, and large amounts of data had cost implications.
In trying to decide where to start, the history of MPEG is probably the most relevant, because MPEG did more than design a codec set in stone by a standard. Instead, it developed a standardized framework in which codecs could improve, by standardizing the protocol by which encoder and decoder communicate rather than detailing how the compression is performed.
That has turned out to be a highly successful approach, leading to the widespread adoption of video compression, which in turn became an enabling technology for the delivery of pictures via discs, digital broadcasting and networks.
Arguably the first practicable video codec standard was H.261, which appeared in 1988 and formed the basis of MPEG-1, which appeared in 1991. Video compression is thus about 35 years old and has gone from something that would just about do to something that is taken for granted.
In that time, the nature of the available electronic and storage technology and the nature of markets have changed out of all recognition, and the various MPEG standards that evolved out of necessity have reflected that.
Thirty-five years ago the worlds of computers, audio, television and cinema were quite separate. Cinema used film, audio was distributed on dedicated media such as the Compact Disc, and broadcasting was based on analog formats offering 500 to 600 lines, which were not thought to be enough, with a few digital islands such as time base correctors, effects units and graphics generators. The serial digital interface (SDI) would not appear until 1989.
In comparison, today things are unrecognizable. Once all types of information could be digitized, the result was data, and the problem became one of storing, transmitting and processing data. Dedicated media disappeared. Today there are no new audio or video media; instead, software, audio, still pictures and movies are stored with equal ease on flash memories and hard drives whose capacity just keeps increasing.
At the same time microelectronics continues to advance and algorithms of increasing complexity can be handled without penalties of cost or power consumption.
Today, broadcasting and compression alike are mature technologies where, I believe, all of the important developments have been realized and further progress is subject to diminishing returns. Complexity rises disproportionately with compression factor if quality is not to be lost. At the same time the increased availability of bandwidth means the need for compression is not increasing.
Another important factor is that today the traditional broadcast delivery from transmitters is facing strong competition from data delivered via networks, again using compression.
Cinema no longer uses film. Based on silver, film was becoming costly, and being a physical medium it could be pirated. The adoption of digital techniques solved both problems: data could be encrypted to prevent piracy, and the same data could easily be sent to more than one cinema.
Whilst it was universally agreed that 500 to 600 lines of resolution were not enough, much of the problem was due to the universal use of interlace, and early codecs had to work with interlaced formats even though such an arrangement was sub-optimal. Interlace clung on desperately for no good technical reason, just as 24Hz clings on today in cinema even though modern digital cinemas can show higher frame rates.
Today interlace is dead, and good riddance. Instead of formats that have insufficient lines, we now have formats that have too many. Whilst it could confidently be said of standard definition that most of the redundancy was temporal and the coding gain came from motion compensation, as pixel counts have gone up that may no longer be true.
The reason is that the low frame rates of television require the camera to have a long exposure time to de-focus motion, or else the result is judder. This de-focusing limits information capacity and takes place irrespective of pixel count. As a result, high-pixel-count formats such as 4K and 8K do not actually contain any more information than smaller formats. Effectively they are oversampled, with the result that they contain more spatial redundancy than smaller formats, and modern codecs exploit that.
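As a rough illustration of that oversampling argument, the NumPy sketch below (using a synthetic, hypothetical test frame) halves a smooth "4K" image in each dimension and restores it, showing how little information the extra pixels actually carried:

```python
# A minimal sketch (NumPy only) of why an oversampled image compresses well:
# a frame with no fine detail can be halved in each dimension and restored
# with very little error. The test image here is synthetic and hypothetical.
import numpy as np

h, w = 2160, 3840                       # a "4K" frame, luma only
y, x = np.mgrid[0:h, 0:w]
frame = 128 + 60 * np.sin(x / 97.0) * np.cos(y / 131.0)   # smooth content

down = frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))  # 2x2 average
up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)         # simple restore

rmse = np.sqrt(np.mean((frame - up) ** 2))
print(f"RMSE after 4:1 pixel reduction and restoration: {rmse:.2f} grey levels")
# For motion-blurred (oversampled) content the error is tiny: most of the
# extra pixels carry spatial redundancy rather than new information.
```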
That spatial redundancy led to the adoption of techniques such as spatial prediction in later codecs such as H.264. It may also have had something to do with the use of intra-only coding in digital cinema, although the avoidance of royalties on inter-coding may have been relevant too. Data sent to cinemas is stored locally and does not need to be sent in real time, so high compression factors are not needed.
MPEG-1 introduced most of the key elements of video coding, including motion compensation and bidirectional coding, although it did not support interlace and the only color format was 4:2:0. Whilst the Compact Disc was launched as an audio format, it was essentially a data medium and was soon adapted to create the highly successful CD-ROM. With the help of MPEG-1, Video CD was created, allowing video and audio within the bit rate of a regular CD, about 1.5 megabits per second. To achieve that low bit rate with acceptable artefacts, the pictures were downsampled horizontally and vertically and the players up-converted them to a TV standard.
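To see the scale of the problem MPEG-1 faced, a back-of-envelope sketch (using nominal SIF figures; exact rates varied with the TV standard) gives the compression factor Video CD required:

```python
# A rough, back-of-envelope sketch of the compression factor Video CD needed.
# Figures are nominal: 352x288 SIF at 25 fps, 4:2:0 sampling (12 bits/pixel).
width, height, fps = 352, 288, 25
bits_per_pixel = 12                      # 8 bits luma + subsampled chroma (4:2:0)

raw_rate = width * height * fps * bits_per_pixel          # uncompressed bit/s
vcd_video_rate = 1.15e6                                   # ~1.15 Mbit/s for video

print(f"Raw SIF rate: {raw_rate / 1e6:.1f} Mbit/s")
print(f"Channel rate: {vcd_video_rate / 1e6:.2f} Mbit/s")
print(f"Compression:  {raw_rate / vcd_video_rate:.0f}:1")
# ~30 Mbit/s into ~1.15 Mbit/s, roughly 26:1, which is why the pictures
# were downsampled before coding rather than sent at full broadcast size.
```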
MPEG-2, arriving in 1995/6, sought to increase the applicability of the basic MPEG-1 coding ideas by supporting interlace and different color formats, and by introducing the concept of profiles and levels. High compression factors are really only useful for final delivery of material that is fully produced. Compression is less useful for production purposes as it introduces generation loss as well as making editing more difficult. Production codecs need to use higher bit rates and short groups of pictures.
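As a sketch of why short groups matter for editing, the code below generates the frame-type pattern of a notional MPEG-2-style GOP; the patterns and lengths are illustrative only, since real encoders vary them:

```python
# A minimal sketch of why long GOPs complicate editing. A clean cut is only
# possible where decoding can restart, i.e. at an I frame. Lengths here are
# illustrative, not prescriptive.
def gop_pattern(gop_length: int, b_frames: int = 2) -> str:
    """Return display-order frame types for one GOP, e.g. 'IBBPBBPBBPBB'."""
    types = []
    for i in range(gop_length):
        if i == 0:
            types.append("I")                     # decoding restart point
        elif i % (b_frames + 1) == 0:
            types.append("P")                     # forward prediction
        else:
            types.append("B")                     # bidirectional prediction
    return "".join(types)

print("Delivery (long GOP):", gop_pattern(12))    # I frame every 12 frames
print("Production (short): ", gop_pattern(2, 1))  # I-frame heavy, easy cuts
# With an I frame only every 12 frames, clean cut points are nearly half a
# second apart at 25 fps; production codecs shorten the GOP or go intra-only.
```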
The work that would have led to MPEG-3 was completed before MPEG-2 was announced and was incorporated into MPEG-2, so there is no MPEG-3.
In MPEG-2, a Profile corresponds to the complexity involved, which translates into hardware cost and signal delay, whereas a Level corresponds to the maximum pixel count supported. Not all combinations were available.
For the broadcaster, the most meaningful Levels were the Main Level, which supported standard definition, and the High Level, which supported HDTV. Both were available in the Main Profile. As a result, MPEG-2 became very popular and was used in DVD, a higher-density version of the CD, and in digital video broadcasting. DVD was available from the mid-1990s.
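The sketch below expresses the MPEG-2 Levels as a small Python lookup; the figures are the commonly quoted Main Profile constraints and should be treated as indicative rather than a complete statement of the standard:

```python
# An indicative sketch of MPEG-2 Levels as upper bounds on picture size and
# bit rate (commonly quoted Main Profile limits; treat as illustrative).
MPEG2_LEVELS = {
    # level:     (max width, max height, max fps, max Mbit/s at Main Profile)
    "Low":       (352,  288,  30,  4),
    "Main":      (720,  576,  30, 15),   # standard definition
    "High-1440": (1440, 1152, 60, 60),
    "High":      (1920, 1152, 60, 80),   # HDTV
}

def fits(level: str, width: int, height: int) -> bool:
    """Check whether a picture size is within a Level's constraints."""
    max_w, max_h, _, _ = MPEG2_LEVELS[level]
    return width <= max_w and height <= max_h

print(fits("Main", 720, 576))    # True  - SD sits at Main Level
print(fits("Main", 1920, 1080))  # False - HD needs High Level
```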
MPEG-2 also supported scalable formats, where a base channel delivered pictures of a certain quality and an optional subsidiary channel could be used by a suitable decoder to increase the quality. Although clever, these found little use.
HDTV only became practicable following development of the flat screen display, which eclipsed the CRT and allowed screens of impressive size and pixel count. However, at the same time the ubiquity of the cellular telephone and the incorporation of good quality displays meant that increasingly people would watch “television” on their phone, despite the small picture. HDTV is clearly not needed for a cell phone display.
Around 2006 the Blu-ray disc became available. Use of a shorter-wavelength laser had further increased the storage density, and the demands on the codec went beyond what MPEG-2 could support.
Enter MPEG-4, which was a whole new world, because not only did it allow video to be compressed, but it also supported rendered or virtual images. The most relevant aspect of MPEG-4 for TV and discs was Part 10, which came to be known as H.264 or Advanced Video Coding (AVC). AVC supports picture sizes up to 8K, yet where comparison was possible it offered the same quality as MPEG-2 at about half the bit rate, on account of its greater complexity. AVC is widely used in Blu-ray and HDTV.
AVC improves motion compensation by allowing the use of smaller picture blocks, so that the outlines of moving objects can be described with less error. Spatial coding is improved by the adoption of spatial prediction.
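A minimal sketch of what spatial prediction means in practice: in the H.264 style, a 4x4 block is predicted from already-decoded neighbouring pixels and only the residual need be coded. Three of the nine 4x4 modes are shown; this is illustrative, not a conformant implementation:

```python
# A minimal sketch of spatial (intra) prediction in the H.264 style: a 4x4
# block is predicted from already-decoded neighbouring pixels, and only the
# prediction error need be coded. Three of the nine 4x4 modes are shown.
import numpy as np

def intra_predict_4x4(top: np.ndarray, left: np.ndarray, mode: str) -> np.ndarray:
    """Predict a 4x4 block from the row above (top) and column to the left."""
    if mode == "vertical":            # mode 0: copy the row above downwards
        return np.tile(top, (4, 1))
    if mode == "horizontal":          # mode 1: copy the left column across
        return np.tile(left.reshape(4, 1), (1, 4))
    if mode == "dc":                  # mode 2: flat block at the mean level
        return np.full((4, 4), (top.mean() + left.mean()) / 2)
    raise ValueError(mode)

top = np.array([100, 102, 104, 106], dtype=float)
left = np.array([100, 110, 120, 130], dtype=float)
print(intra_predict_4x4(top, left, "vertical"))
# The encoder tries each mode, codes the residual of the best one, and
# signals the chosen mode; smooth, oversampled pictures predict very well.
```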
In 2013 along came H.265, or HEVC (High Efficiency Video Coding), which is basically a refinement of H.264. It makes the encoder more complex, but not the decoder. Such asymmetry is ideal for broadcasting, where one encoder serves millions of decoders.
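The sketch below illustrates that asymmetry in the abstract: the encoder performs a search whose cost grows with the number of candidate predictions, while the decoder merely applies the mode it is told. All names and the cost metric here are illustrative:

```python
# A sketch of encoder/decoder asymmetry: the encoder searches many candidate
# predictions (costly), while the decoder simply applies the one that was
# signalled (cheap). Names and the cost metric are illustrative.
import numpy as np

def sad(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.abs(a - b).sum())      # sum of absolute differences

def encode_choose_mode(block, candidates):
    """Encoder side: exhaustive search over all candidate predictions."""
    costs = {mode: sad(block, pred) for mode, pred in candidates.items()}
    return min(costs, key=costs.get)       # work grows with candidate count

def decode_apply_mode(mode, candidates):
    """Decoder side: one table lookup, no search at all."""
    return candidates[mode]

block = np.full((4, 4), 118.0)
candidates = {"flat_110": np.full((4, 4), 110.0),
              "flat_120": np.full((4, 4), 120.0)}
mode = encode_choose_mode(block, candidates)          # the expensive part
print(mode, "->", decode_apply_mode(mode, candidates)[0, 0])
```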
As I see it, that's about where we stand today. The technical challenges have all been met and there is little left to do. The story of error correction is much the same: short and spectacular. Today the challenges we face are not technological, but don't get me going on that one…