
 

Frame Prediction

There are two important kinds of frames in video coding: I (intra) frames and P (predicted) frames. P-frames contribute significantly to the high compression ratios. Depending on the type of prediction, one more kind of predicted frame is also widely used: the B (bidirectional) frame.

Intra frames do not refer to other frames, making them suitable as key frames. They are, essentially, self-contained compressed images. Consider the following as a film strip.

 

Intra Frame (I-Frame)
The first frame is coded independently.

 

 

Predicted Frame (P-Frame)
The second frame is coded only as differences from the first frame.

 

 

Predicted Frame (P-Frame)
The third frame is coded only as differences from the second frame.

 

Predicted Frame (P-Frame)
The fourth frame is coded only as differences from the third frame.

 

 

In this film strip, the first frame is an I-frame. Each of the following frames is predicted from the frame before it, so these frames are known as P-frames.
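The film strip above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: frames are flat lists of pixel values, the residual is a plain lossless difference, and there is no motion compensation, transform, or quantization. The names `encode_strip` and `decode_strip` are hypothetical.

```python
def encode_strip(frames):
    """Return [('I', pixels), ('P', residual), ...] for a list of frames."""
    encoded = [("I", list(frames[0]))]            # first frame stored whole
    for prev, cur in zip(frames, frames[1:]):
        residual = [c - p for c, p in zip(cur, prev)]
        encoded.append(("P", residual))           # only the differences
    return encoded

def decode_strip(encoded):
    """Rebuild every frame by adding each residual to the previous frame."""
    frames = []
    for kind, data in encoded:
        if kind == "I":
            frames.append(list(data))
        else:
            prev = frames[-1]
            frames.append([p + d for p, d in zip(prev, data)])
    return frames

# Four tiny "frames" of four pixels each; little changes between them.
strip = [
    [10, 10, 10, 10],
    [10, 12, 10, 10],
    [10, 12, 13, 10],
    [10, 12, 13, 10],
]
encoded = encode_strip(strip)
decoded = decode_strip(encoded)
```

Most residual values are zero, which is what makes P-frames cheap to store once the residual is entropy coded.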

 

 

I frame

An intra frame is typically the first frame to be encoded, and it is compressed less than the other frame types. This frame is also known as a ‘key frame’ because the frames that follow it are encoded using the information available from it. Intra-prediction exploits the spatial correlation within each frame to reduce the amount of data needed to represent the picture.

An intra frame is more or less similar to a still image compressed with a format such as JPEG or GIF. It is coded without any dependency on other frames: the complete image is stored in the data stream, so an intra frame can be decoded on its own without referring to other frames. Frames of this kind are formed using intra-prediction methods, which are discussed below.

H.264 view:-

H.264 performs intra-prediction on blocks of two different sizes: 16x16 (the entire macroblock) and 4x4. 16x16 prediction is generally chosen for smooth areas of the picture, while 4x4 prediction is useful for more detailed sections of the frame. The following picture points out some locations where 16x16 and 4x4 prediction can be used.

The general idea is to predict a block, whether it be a 4x4 or 16x16 block, based on surrounding pixels using a mode that results in a prediction that most closely resembles the actual pixels in that block.
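As a rough sketch of this idea, the code below tries three of the 4x4 modes (0 vertical, 1 horizontal, 2 DC) and keeps the one with the smallest sum of absolute differences (SAD). Using SAD as the selection cost is an assumption made for illustration; real encoders often use a rate-distortion cost instead. The function names are hypothetical, and all neighbouring pixels are assumed to be available.

```python
def predict_vertical(top, left):
    # Mode 0: each column copies the pixel directly above the block.
    return [list(top) for _ in range(4)]

def predict_horizontal(top, left):
    # Mode 1: each row copies the pixel directly to its left.
    return [[left[r]] * 4 for r in range(4)]

def predict_dc(top, left):
    # Mode 2: every pixel is the rounded mean of the eight neighbours.
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]

MODES = {0: predict_vertical, 1: predict_horizontal, 2: predict_dc}

def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p)
               for row_b, row_p in zip(block, pred)
               for b, p in zip(row_b, row_p))

def best_intra_mode(block, top, left):
    """Return (mode, cost) for the mode whose prediction is closest."""
    costs = {m: sad(block, f(top, left)) for m, f in MODES.items()}
    mode = min(costs, key=costs.get)
    return mode, costs[mode]

# A block of vertical stripes that continues the row of pixels above it.
top = [10, 50, 90, 130]
left = [10, 10, 10, 10]
block = [[10, 50, 90, 130] for _ in range(4)]
mode, cost = best_intra_mode(block, top, left)   # vertical mode, zero cost
```

Only the chosen mode number and the (here all-zero) residual would need to be transmitted for this block.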

 

     

P-frame

Inter prediction

P-frames are predicted from the previous P-frame or I-frame. In the film strip above, frames 2 to 4 are predicted frames.

 

Inter frames are used from the second frame of the incoming sequence onwards. This type of frame is responsible for most of the size reduction of the video stream. This is possible because only the motion information, together with whatever difference remains after prediction, is coded instead of the full picture.

Motion estimation

Motion estimation algorithms are used in the encoding of inter frames. H.264 encoding supports sub-pixel resolution for motion vectors, meaning that the reference block may be calculated by interpolating between real pixels. Motion vectors for luma blocks are expressed at quarter-pixel resolution, and for chroma blocks the resolution can be as fine as one-eighth of a pixel.
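To give a feel for the interpolation, here is a minimal one-dimensional sketch of luma sub-pixel samples. H.264 derives half-pixel luma samples with the 6-tap filter (1, -5, 20, 20, -5, 1) and quarter-pixel samples by averaging two neighbouring samples with rounding. The function names are hypothetical, only the horizontal direction is shown, and the edge padding a real codec needs is left out.

```python
TAPS = (1, -5, 20, 20, -5, 1)   # H.264 luma half-sample filter taps

def clip8(value):
    """Clamp a value to the 8-bit pixel range."""
    return max(0, min(255, value))

def half_pel(row, i):
    """Half-sample value between row[i] and row[i + 1].

    Assumes i - 2 >= 0 and i + 3 < len(row); edge padding is omitted.
    """
    window = row[i - 2 : i + 4]
    acc = sum(t * p for t, p in zip(TAPS, window))
    return clip8((acc + 16) >> 5)          # divide by 32 with rounding

def quarter_pel(row, i):
    """Quarter-sample value between row[i] and the half-sample to its right."""
    return (row[i] + half_pel(row, i) + 1) >> 1

ramp = [0, 10, 20, 30, 40, 50, 60, 70]
h = half_pel(ramp, 3)       # between pixel values 30 and 40
q = quarter_pel(ramp, 3)    # between pixel value 30 and the half-sample
```

On a smooth ramp the interpolated values fall between the two real pixels, as expected; the negative taps matter mostly near sharp edges.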

 

 

B-frame

B-frames are bidirectionally predicted frames. As the name suggests, B-frames rely on both the frames preceding them and the frames following them. A B-frame contains only the data that has changed from the preceding frame or that differs from the data in the following frame. In the following figure, frames 2 and 3 are B-frames.

B frames are interesting for two reasons. First, they offer slightly better prediction. Second, and more important, they do not affect the quality of the following frames as long as no other frame is predicted from them, so they can be coded at lower quality without degrading the whole sequence.

Since B-frames depend on both past and future pictures, the decoder has to be fed the future I- or P-frames before it can decode them.
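This reordering can be sketched as follows. The function name `decode_order` is hypothetical, and the sketch assumes that each B-frame references only the nearest I- or P-frame on either side and that the sequence does not end with B-frames.

```python
def decode_order(display_types):
    """Map display order to decode order, using 1-based frame numbers.

    display_types: a string such as "IBBP", one letter per frame.
    """
    order, pending_b = [], []
    for number, kind in enumerate(display_types, start=1):
        if kind == "B":
            pending_b.append(number)      # must wait for the future reference
        else:
            order.append(number)          # I or P: decodable right away
            order.extend(pending_b)       # the held B-frames can now follow
            pending_b = []
    return order + pending_b

short_gop = decode_order("IBBP")
long_gop = decode_order("IBBPBBP")
```

For the four-frame example with B-frames at positions 2 and 3, frame 4 is sent and decoded before frames 2 and 3, even though it is displayed after them.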

 

 

and the conclusion is..

I frames are the least compressible but do not require other video frames to decode. P frames can use data from previous I or P frames to decompress and are more compressible than I frames. B frames can use both previous and following frames for data reference to get the highest amount of data compression.

 

More details for Intra prediction

Intra-Prediction modes

There are nine 4x4 prediction modes, shown in the following figure, and four 16x16 modes. Three of the 16x16 modes (vertical, horizontal, and DC) correspond to modes 0, 1, and 2 of the 4x4 modes; the fourth is a plane mode that fits a linear gradient across the block.

 

 

   

More details for Frame prediction

Search strategies

Many kinds of search algorithms are used for motion estimation.
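The simplest strategy is an exhaustive (full) search: every candidate displacement inside a search window is tested and the one with the lowest cost is kept. The sketch below is an assumption-laden illustration, not a production search: it works at integer-pixel resolution only, uses the sum of absolute differences (SAD) as the cost, and takes frames as lists of rows. The names `block_sad` and `full_search` are hypothetical.

```python
def block_sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def full_search(cur, ref, bx, by, n=4, search=2):
    """Return ((dx, dy), cost) of the best integer-pixel motion vector."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or bx + dx + n > width:
                continue
            if by + dy < 0 or by + dy + n > height:
                continue
            cost = block_sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best

# An 8x8 reference frame with a bright 2x2 object, and a current frame
# in which the object has moved one pixel to the right.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in (2, 3):
    for x in (2, 3):
        ref[y][x] = 200
        cur[y][x + 1] = 200

vector, cost = full_search(cur, ref, bx=2, by=2)
```

Here the best vector points one pixel to the left in the reference frame, with zero cost. Full search is expensive, since the number of candidates grows with the square of the search range, which is why faster strategies that test fewer candidates are common.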

 


 

 

© 2007 Tony Gladvin George