Understanding Compression Technology: Predicted Frames and Difference Frames. Part 3.
![Image courtesy of XIPH.ORG](/cache/uploads/content_images/motion-vector-image_789_351_70_s.jpg)
Image courtesy of XIPH.ORG
In Part 2 of this series on Compression Technology we learned how Motion Vectors are generated when motion estimation is employed as the first step of creating P-frames and B-frames. In Part 3 we’ll learn how these motion vectors are used to generate Predicted Frames.
Let’s review the nature of P- and B-frames by first looking at forward dependencies. Two types of frames serve as references for other frames: an I-frame can support a future P-frame and/or B-frame, and a P-frame can support a future P-frame and/or B-frame. Put another way, a P-frame or B-frame can depend on a previous I-frame or a previous P-frame. Arrows that point leftward in the Closed GOP diagram below show these dependencies.
![](/cache/uploads/content_images/I-P-B-Frames-diagram_454_109_70.jpg)
Dependencies among I-, P-, and B-Frames (Apple).
Video frames that will become P- and B-frames are partitioned into macroblocks in the same way as an I-frame. Starting with the first macroblock in the Present Image (current video frame), a search is made to determine where its content can be found in the Adjacent Image (next video frame). When the contents of a macroblock have not moved, the macroblock’s motion vector is set to zero.
When a match is not found at X=0 and Y=0, the comparison macroblock is moved an increasing distance from its origin until a match is found, or the search range is exhausted with no match. After the first macroblock has been searched, the same search is made for every remaining macroblock, so every macroblock within a Present Image is assigned a motion vector. A Present Image’s motion vectors are stored in a Motion Estimation Block. Although this block will ultimately be stored, it is first used to generate a Predicted Frame.
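The search described above can be sketched as a simple block-matching loop. This is a minimal illustration in plain Python, not a codec implementation: the frame size, block size, search range, function names, and the sum-of-absolute-differences (SAD) cost are all simplifying assumptions. Real encoders use 16x16 macroblocks, much larger search windows, and fast search strategies rather than this exhaustive scan.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, block):
    """Sum of absolute differences between a block in frame_a at (ax, ay)
    and a block in frame_b at (bx, by)."""
    total = 0
    for row in range(block):
        for col in range(block):
            total += abs(frame_a[ay + row][ax + col] - frame_b[by + row][bx + col])
    return total

def motion_vector(present, adjacent, bx, by, block=2, search=2):
    """Find where the macroblock at (bx, by) in the Present Image best
    matches within the Adjacent Image. Returns (dx, dy); a (0, 0) vector
    means the block's content has not moved."""
    h, w = len(adjacent), len(adjacent[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            nx, ny = bx + dx, by + dy
            if 0 <= nx <= w - block and 0 <= ny <= h - block:
                cost = sad(present, bx, by, adjacent, nx, ny, block)
                if best_cost is None or cost < best_cost:
                    best_cost, best = cost, (dx, dy)
    return best

# Illustrative frames: a bright 2x2 patch at (1, 1) in the Present Image
# appears at (3, 2) in the Adjacent Image (moved right 2, down 1).
present  = [[0] * 6 for _ in range(6)]
adjacent = [[0] * 6 for _ in range(6)]
for r in range(2):
    for c in range(2):
        present[1 + r][1 + c] = 200
        adjacent[2 + r][3 + c] = 200

print(motion_vector(present, adjacent, 1, 1))  # (2, 1)
```

Note how the search starts at (0, 0) and widens outward only conceptually; here every candidate offset is scored and the lowest-cost one wins.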
Below, the upper-left image is the Present Image. The upper-right image is the Adjacent Image (next video frame). One difference that has occurred between the capture of the Present Image and the capture of the Adjacent Image is obvious – the person has opened their eyes.
![Steps to Generate a Predicted Frame (Wang).](https://www.thebroadcastbridge.com/uploads/content_images/_Medium/Compression-technology-examples.jpg)
Steps to Generate a Predicted Frame (Wang).
The lower-left image is the Adjacent Image with the calculated motion vectors superimposed. These motion vectors are applied to the Present Image (current video frame) to construct a Predicted Frame. Simply put, the vectors move macroblocks in the Present Image to new locations. The lower-right image shows the generated Predicted Frame.
Ideally, these vectors would move pixels exactly to their new locations. However, as shown, the Predicted Frame has errors. To eliminate motion estimation errors, a Difference Frame is created.
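The motion-compensation step can be sketched as follows. This is an illustrative forward-mapping sketch under assumed names and a tiny 2x2 block size, not a real decoder loop: when a block moves away, its old location is left empty, and a block pushed past the frame edge is clipped. Such holes and overlaps are one source of the prediction errors that the Difference Frame later corrects.

```python
BLOCK = 2  # illustrative macroblock size; real codecs use 16x16

def predict(present, vectors):
    """Build a Predicted Frame by moving each BLOCK x BLOCK macroblock of
    `present` by its (dx, dy) motion vector. `vectors` is indexed by
    (block_row, block_col)."""
    h, w = len(present), len(present[0])
    predicted = [[0] * w for _ in range(h)]
    for by in range(0, h, BLOCK):
        for bx in range(0, w, BLOCK):
            dx, dy = vectors[(by // BLOCK, bx // BLOCK)]
            for r in range(BLOCK):
                for c in range(BLOCK):
                    ny, nx = by + r + dy, bx + c + dx
                    if 0 <= ny < h and 0 <= nx < w:  # clip at frame edges
                        predicted[ny][nx] = present[by + r][bx + c]
    return predicted

present = [
    [9, 9, 0, 0],
    [9, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# The top row of blocks moves two pixels right; the bottom row is static.
vectors = {(0, 0): (2, 0), (0, 1): (2, 0), (1, 0): (0, 0), (1, 1): (0, 0)}
predicted = predict(present, vectors)
for row in predicted:
    print(row)
# The 9s land in the right half; their old location is left as a hole of
# zeros, i.e. a prediction error.
```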
A Difference Frame is generated by subtracting the Predicted Frame from the Adjacent Image (next video frame). Were the motion vectors able to create a perfect Predicted Frame, the Predicted Frame would match the Adjacent Image and the Difference Frame would be empty. With motion video, there will likely be some information in the Difference Frame, as shown below.
![Difference Frame (Wang).](https://www.thebroadcastbridge.com/uploads/content_images/_Medium/Compression-technology-difference-frame.jpg)
Difference Frame (Wang).
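The subtraction itself is simple, as this minimal sketch shows (frame values and function names are illustrative assumptions): only pixels the motion vectors failed to predict survive into the Difference Frame, and a perfect prediction leaves nothing to store.

```python
def difference(adjacent, predicted):
    """Difference Frame = Adjacent Image minus Predicted Frame, pixel by pixel."""
    return [[a - p for a, p in zip(arow, prow)]
            for arow, prow in zip(adjacent, predicted)]

adjacent  = [[10, 10], [10, 12]]   # next video frame as captured
predicted = [[10, 10], [10, 10]]   # motion-compensated guess

diff = difference(adjacent, predicted)
print(diff)  # [[0, 0], [0, 2]] -- only the mispredicted pixel survives

# A perfect prediction yields an empty Difference Frame:
print(difference(predicted, predicted))  # [[0, 0], [0, 0]]
```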
The Difference Frame is compressed using the DCT, after which lossless data reduction (VLC and RLC) is applied. This is the same process used to compress an I-frame. The motion estimation blocks are also VLC- and RLC-compressed. The compressed Difference Frame and the compressed motion estimation block are then stored.
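The DCT-plus-run-length step can be sketched in miniature. This is a naive illustration, not a codec implementation: the 4x4 block size, the threshold standing in for quantization, and the function names are all assumptions. Real encoders use 8x8 blocks, quantization matrices, zig-zag scanning, and Huffman-style VLC tables.

```python
import math

N = 4  # tiny block for illustration; real codecs use 8x8

def dct2(block):
    """Naive 2-D DCT-II of an N x N block, the transform applied to each
    Difference Frame block before entropy coding."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def run_length(coeffs, threshold=0.5):
    """Crude stand-in for quantization + RLC: coefficients below the
    threshold are treated as zero, and runs of zeros are counted."""
    flat = [v for row in coeffs for v in row]
    pairs, run = [], 0
    for v in flat:
        if abs(v) < threshold:
            run += 1
        else:
            pairs.append((run, round(v, 1)))
            run = 0
    return pairs

# A nearly empty Difference Frame block compacts to a few (run, value) pairs.
diff_block = [[0, 0, 0, 0],
              [0, 4, 4, 0],
              [0, 4, 4, 0],
              [0, 0, 0, 0]]
print(run_length(dct2(diff_block)))  # [(0, 4.0), (1, -4.0), (5, -4.0), (1, 4.0)]
```

Sixteen pixels reduce to four (run, value) pairs, which is why sparse Difference Frames compress so well.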
To summarize the compression process; each I-frame is intra-frame compressed and stored in a long-GOP stream. Each compressed P-frame includes two types of information: a motion estimation block and a Difference Frame. (Each compressed B-frame has two motion estimation blocks and two Difference Frames.)
As a stream is uncompressed, an I-frame is re-created by reversing its lossless compression and then performing an Inverse DCT. This yields a Present Image that is output as a video picture. (A Present Image can be obtained from a previous I- or P-frame.) When a P-frame is encountered in a long-GOP stream, its motion estimation block is uncompressed. These vectors are applied to the Present Image to create a Predicted Frame.
Next, the P-frame’s Difference Frame is re-created by reversing its lossless compression and then performing an Inverse DCT. With both a Predicted Frame and Difference Frame available, an Adjacent Image – output as a video picture – is generated by using the Difference Frame to correct errors in the Predicted Frame. (A B-frame’s single Adjacent Image is obtained from a previous I- or P-frame by appropriately employing two Difference Frames to correct errors in two Predicted Frames.)
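The decoder-side correction described above amounts to a pixel-wise addition, sketched here with illustrative values and names (and ignoring quantization loss): adding the Difference Frame back onto the Predicted Frame recovers the Adjacent Image.

```python
def reconstruct(predicted, diff):
    """Adjacent Image = Predicted Frame + Difference Frame, pixel by pixel."""
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(predicted, diff)]

predicted = [[10, 10], [10, 10]]   # built from the Present Image and its vectors
diff      = [[0, 0], [0, 2]]       # decoded Difference Frame
adjacent  = reconstruct(predicted, diff)
print(adjacent)  # [[10, 10], [10, 12]] -- errors in the prediction corrected
```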
This process is repeated for the remaining frames in each GOP, and again when the next I-frame is encountered. Although P- and B-frames are more efficient than I-frames (they require less stored data), the use of Difference Frames gives them the same visual quality.