Video Codec Basics

Video compression algorithms ("codecs") reduce storage and bandwidth requirements while still delivering good visual quality.

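Among the most popular video codecs are MPEG-2, which is used for video storage on DVD and for digital television broadcast, and MPEG-4, which is popular for file-based storage and streaming. The older MPEG-1 is the codec used on Video CD.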

Video data is traditionally represented as a stream of images, called frames, as shown in the following figure. These frames are displayed to the user at a constant rate, called the frame rate (frames/second). A commonly used frame rate is 30 frames per second.

If you observe the above sequence of frames, you will notice a large amount of similarity between consecutive frames. For video storage, this similarity is redundant information.

Video compression works by eliminating this redundant information from the sequence of incoming video frames. There are four important types of redundancy.

Temporal Redundancy
Correlation between consecutive frames in a video sequence. The following figure illustrates the correlation between consecutive frames. If we subtract Frame-1 from the other frames, the amount of data that remains is significantly smaller. This is inter-frame redundancy removal, since one frame is subtracted from another.
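The subtraction described above can be sketched in a few lines of Python. This is only an illustration: the 4 x 4 frames and the helper names `frame_difference` and `add_residual` are made up for this example, and a real encoder works on motion-compensated blocks rather than on whole frames.

```python
# Two tiny 4x4 grayscale "frames" (hypothetical luma samples).
frame1 = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
# frame2 is frame1 with a single pixel changed.
frame2 = [row[:] for row in frame1]
frame2[0][3] = 12


def frame_difference(current, reference):
    """Return the per-pixel residual: current - reference."""
    return [
        [c - r for c, r in zip(cur_row, ref_row)]
        for cur_row, ref_row in zip(current, reference)
    ]


def add_residual(reference, residual):
    """Rebuild the current frame from the reference and the residual."""
    return [
        [r + d for r, d in zip(ref_row, res_row)]
        for ref_row, res_row in zip(reference, residual)
    ]


residual = frame_difference(frame2, frame1)
nonzero = sum(1 for row in residual for value in row if value != 0)
reconstructed = add_residual(frame1, residual)

print("non-zero residual samples:", nonzero)
print("lossless reconstruction:", reconstructed == frame2)
```

Only one of the sixteen residual samples is non-zero, which is why the residual is much cheaper to store than the second frame itself.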

Spatial Redundancy
Correlation between adjacent image pixels within the frame.

Color Spectral Redundancy
The human eye is more sensitive to small differences in brightness than to small differences in color. Codecs take advantage of this by storing color information with less detail than brightness information.
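One common way to exploit this is chroma subsampling: the brightness (luma) plane is kept at full resolution while each color (chroma) plane is stored at a quarter of the samples, as in the 4:2:0 format. The sketch below assumes a chroma plane with even dimensions and simply averages each 2 x 2 block; the function name `subsample_420` and the sample values are hypothetical.

```python
def subsample_420(chroma):
    """Average each 2x2 block of a chroma plane (even dimensions assumed)."""
    height, width = len(chroma), len(chroma[0])
    return [
        [
            (chroma[y][x] + chroma[y][x + 1]
             + chroma[y + 1][x] + chroma[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# A 4x4 Cb plane (made-up values).
cb = [
    [100, 102, 200, 200],
    [101, 101, 200, 200],
    [50, 50, 60, 62],
    [50, 50, 61, 61],
]
cb_small = subsample_420(cb)
print(cb_small)
```

With both chroma planes reduced this way, a 4 x 4 picture needs 16 + 4 + 4 = 24 samples instead of 48, before any other compression is applied.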

Psycho-visual Redundancy
The human visual system is less sensitive to fine texture detail in an image, so some of that detail can be discarded without a noticeable loss of quality.

Embedded developers and researchers should have a good understanding of the various algorithms that remove these redundancies.

H.264: Advanced Video Codec

An H.264 encoder is considerably more efficient than an MPEG-2 encoder, but also considerably more complex, and it has many new features. The high compression efficiency of the H.264 standard is achieved by combining a number of encoding algorithms. The following tutorial on video codecs also tries to cover the relevant parts of H.264 and its implementation aspects.

 

 

 

Figure:  Basic Video Encoder

This block diagram shows five processing blocks, each of which removes one of the redundancies explained above.

To remove temporal redundancy, the Temporal-model analyzes the incoming frames against the previous frames to identify similarities. In the block diagram, the Temporal-model has two inputs: one is the incoming video and the other is the previous frames stored in the memory of the codec.

The Spatial-model removes the redundancies within a frame without comparing it with other frames. This is similar to still-image compression such as JPEG or GIF. For any given frame, we have to execute either Temporal or Spatial compression.

The Discrete Cosine Transform, together with the Quantization module, removes the psycho-visual redundancy. The Entropy-coder removes the redundancies that remain in the final bit-stream, for example recurring patterns in the bit-stream.

 

 

Algorithms in Video Codec

A video codec is a cocktail of algorithms, combined in the proper fashion to achieve the compression. The significant algorithms/tools used in the encoder are discussed here.

1) Intra Frame

Intra-prediction uses the spatial correlation within each frame to reduce the amount of data needed to represent the picture. An Intra frame is typically the first frame to be encoded. It can be decoded without any other frame, but it achieves less compression than a frame predicted from its neighbors in time.
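As a rough illustration of spatial prediction, the sketch below predicts a 4 x 4 block as the rounded mean of the already reconstructed pixels above it and to its left, which is the idea behind a DC prediction mode. The function names and pixel values are assumptions made for this example; a real encoder chooses among several directional modes.

```python
def dc_predict_4x4(top, left):
    """Predict a 4x4 block as the rounded mean of 4 top and 4 left neighbours."""
    dc = (sum(top) + sum(left) + 4) // 8
    return [[dc] * 4 for _ in range(4)]


def block_residual(block, prediction):
    """Per-pixel difference between the real block and its prediction."""
    return [
        [b - p for b, p in zip(block_row, pred_row)]
        for block_row, pred_row in zip(block, prediction)
    ]


top = [100, 102, 101, 99]    # reconstructed pixels above the block
left = [98, 100, 101, 103]   # reconstructed pixels to the left of the block
block = [
    [100, 101, 100, 99],
    [101, 100, 102, 100],
    [99, 100, 101, 100],
    [100, 102, 100, 101],
]

prediction = dc_predict_4x4(top, left)
residual = block_residual(block, prediction)
largest = max(abs(value) for row in residual for value in row)
print("predicted value:", prediction[0][0])
print("largest residual magnitude:", largest)
```

The pixel values are around 100, but the residual that actually has to be coded stays within a couple of units of zero.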

2) Motion Estimation

The fundamental concept in video compression is to store only the incremental changes between frames. The Motion Estimation tool extracts the difference between two frames. Here, a whole frame is reduced to a set of motion vectors.
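A minimal sketch of block-based motion estimation follows, assuming an exhaustive (full) search that minimizes the sum of absolute differences (SAD). The frame contents, the helper `make_frame`, and the search range are hypothetical; practical encoders use faster search strategies and sub-pixel refinement.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block at (bx, by) in cur and the block
    displaced by (dx, dy) in ref."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement in the search window; return (cost, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue  # candidate block would fall outside the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def make_frame(obj_x, obj_y, size=8):
    """Black frame with a bright 2x2 object at (obj_x, obj_y)."""
    frame = [[0] * size for _ in range(size)]
    for y in range(obj_y, obj_y + 2):
        for x in range(obj_x, obj_x + 2):
            frame[y][x] = 200
    return frame


ref = make_frame(2, 2)   # previous frame
cur = make_frame(3, 3)   # object moved one pixel right and one pixel down
cost, dx, dy = full_search(cur, ref, bx=2, by=2, n=4, search_range=2)
print("motion vector:", (dx, dy), "SAD:", cost)
```

In this sketch the motion vector points from the current block to its best match in the reference frame, so an object that moved right and down yields the vector (-1, -1).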

3) Motion Compensation

Motion Compensation reconstructs the image that was encoded by Motion Estimation. The reconstruction is done from the received motion vectors and the reference frame.
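The reconstruction step can be sketched as follows, assuming integer-pixel motion vectors and 8-bit samples. The function names and the numbers are made up for illustration: the prediction is copied from the reference frame at the displaced position, and the decoded residual is added on top.

```python
def motion_compensate(ref, bx, by, dx, dy, n):
    """Copy the n x n block displaced by (dx, dy) from the reference frame."""
    return [
        [ref[by + dy + y][bx + dx + x] for x in range(n)]
        for y in range(n)
    ]


def reconstruct(prediction, residual):
    """Add the residual to the prediction and clip to the 8-bit range."""
    return [
        [max(0, min(255, p + r)) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


# Hypothetical 6x6 reference frame where pixel (x, y) holds 10 * y + x.
ref = [[10 * y + x for x in range(6)] for y in range(6)]

prediction = motion_compensate(ref, bx=2, by=2, dx=1, dy=-1, n=2)
residual = [[1, 0], [0, -2]]  # would come from the decoded bitstream
block = reconstruct(prediction, residual)
print(prediction, block)
```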

4) Transformation

The transformation is used to compress the image data in Inter-frames or Intra-frames. The most commonly used transforms are the Discrete Cosine Transform (DCT) and the Wavelet Transform. The codec computes a DCT (in H.264, an integer approximation of it) on each 4 x 4 block of pixels in the frame.
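To show what the transform does to a block, here is a small floating-point 2-D DCT-II on a 4 x 4 block. This is a textbook orthonormal DCT written for illustration, not the transform of any particular standard, and all function names are assumptions.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([
            scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """Forward 2-D DCT: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * Y * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


flat = [[10] * 4 for _ in range(4)]       # a perfectly uniform block
coeffs = dct2(flat)
restored = idct2(coeffs)
print("DC coefficient:", round(coeffs[0][0], 6))
```

For the uniform block, all of the energy ends up in the single top-left (DC) coefficient, and the other fifteen coefficients are zero up to rounding error. Smooth image blocks behave similarly, which is what makes the following quantization step effective.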

5) Quantization

The quantization stage reduces the amount of information by dividing each coefficient by a particular number, which reduces the number of possible values the coefficient can take. Because the values then fall into a narrower range, entropy coding can express them more compactly.
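The division described above can be sketched with a single uniform step size. The coefficient values and the step of 10 are made up for this example, and real codecs use more elaborate scaling, but the effect is the same: small coefficients become zero and the remaining ones shrink.

```python
def quantize(coeffs, step):
    """Divide each coefficient by step and round to the nearest integer."""
    def level(c):
        magnitude = int(abs(c) / step + 0.5)
        return magnitude if c >= 0 else -magnitude
    return [[level(c) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Scale the levels back up; the rounding error is lost for good."""
    return [[lv * step for lv in row] for row in levels]


# Hypothetical transform coefficients of one 4x4 block.
coeffs = [
    [403, -31, 6, -2],
    [-24, 9, -3, 1],
    [5, -2, 1, 0],
    [-1, 1, 0, 0],
]
levels = quantize(coeffs, step=10)
approx = dequantize(levels, step=10)
nonzero = sum(1 for row in levels for lv in row if lv != 0)
print(levels)
print("non-zero levels:", nonzero)
```

Thirteen non-zero coefficients are reduced to six small levels, at the cost of an approximation error (for example, 403 is reconstructed as 400).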

6) De-blocking filter

Loop filtering is mandatory in the encoder. The filter identifies a blocking artifact using two threshold factors (alpha and beta). A large part of the codec's efficiency is due to the loop filter. The strength of the filter depends on intra/inter coding, the motion vector differences, and the quantization level. Up to 40% of the total processing power may be required by this filter. Filtering the reference frames before they are used in prediction can significantly improve both objective and perceptual quality, especially at low or medium bitrates.
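The role of the two thresholds can be illustrated with a heavily simplified edge filter. Here p1, p0 are the two pixels on one side of a block edge and q0, q1 the two pixels on the other side. The decision test follows the alpha/beta idea described above, but the smoothing itself is a simple low-pass chosen for this example and is not the exact filter defined by the standard; the threshold values are also arbitrary.

```python
def filter_edge(p1, p0, q0, q1, alpha, beta):
    """Smooth the two samples next to a block edge if the step across the
    edge looks like a coding artifact rather than real picture content."""
    is_artifact = (
        abs(p0 - q0) < alpha      # step across the edge is small
        and abs(p1 - p0) < beta   # left side is smooth
        and abs(q1 - q0) < beta   # right side is smooth
    )
    if not is_artifact:
        return p0, q0  # probably a genuine edge in the scene: keep it sharp
    # Simplified smoothing (an assumption for this sketch).
    new_p0 = (2 * p1 + p0 + q1 + 2) // 4
    new_q0 = (2 * q1 + q0 + p1 + 2) // 4
    return new_p0, new_q0


blocky = filter_edge(100, 100, 108, 108, alpha=20, beta=5)
real_edge = filter_edge(100, 100, 200, 200, alpha=20, beta=5)
print(blocky, real_edge)
```

The small step of 8 across the block boundary is halved, while the large step of 100 is left untouched because it exceeds alpha.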

 

 

 

7) Entropy Coder

This algorithm is a lossless encoding tool, i.e. the encoded stream can be decoded without any loss.

H.264 provides two enhanced entropy coding methods:
1) Context-adaptive variable-length coding (CAVLC)
2) Context-adaptive binary arithmetic coding (CABAC)

Given knowledge of the probabilities of syntax elements in a given context, the syntax elements in the video stream can be losslessly compressed.
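CAVLC and CABAC are too involved for a short example, so the sketch below uses a much simpler variable-length code instead: the unsigned Exp-Golomb code, which H.264 uses for many header syntax elements. It shows the basic principle that frequent (small) values get short codewords and that decoding recovers the input exactly. The function names are assumptions for this example.

```python
def exp_golomb_encode(value):
    """Unsigned Exp-Golomb codeword for a non-negative integer, as a bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]               # value + 1 in binary
    return "0" * (len(binary) - 1) + binary   # prefix of leading zeros


def exp_golomb_decode(bits):
    """Decode a concatenation of codewords back into a list of integers."""
    values = []
    pos = 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":   # count the zero prefix
            zeros += 1
            pos += 1
        values.append(int(bits[pos:pos + zeros + 1], 2) - 1)
        pos += zeros + 1
    return values


symbols = [0, 0, 0, 1, 0, 0, 2, 0]   # mostly zeros, as after quantization
stream = "".join(exp_golomb_encode(s) for s in symbols)
decoded = exp_golomb_decode(stream)
print(stream, len(stream), decoded == symbols)
```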

8) Network Abstraction Layer

All the compressed data is packetized in a network-friendly format by the NAL. The NAL unit specifies a generic format for use in both packet-oriented and bitstream systems. The format of NAL units is identical for packet-oriented transport and bitstream delivery, except that each NAL unit can be preceded by a start code prefix in a bitstream-oriented transport layer.
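A rough sketch of how a NAL unit is framed for bitstream delivery is given below. It builds the one-byte NAL header, prepends the three-byte start code prefix, and inserts emulation prevention bytes so that the payload cannot imitate a start code. The payload bytes and function names are made up, and details such as trailing-zero handling are left out.

```python
START_CODE = b"\x00\x00\x01"


def nal_header(nal_ref_idc, nal_unit_type):
    """One header byte: forbidden_zero_bit (1 bit, always 0),
    nal_ref_idc (2 bits), nal_unit_type (5 bits)."""
    if not (0 <= nal_ref_idc < 4 and 0 <= nal_unit_type < 32):
        raise ValueError("field out of range")
    return bytes([(nal_ref_idc << 5) | nal_unit_type])


def add_emulation_prevention(payload):
    """Insert 0x03 after two zero bytes when the next byte is 0x00..0x03."""
    out = bytearray()
    zeros = 0
    for byte in payload:
        if zeros >= 2 and byte <= 3:
            out.append(3)
            zeros = 0
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


def byte_stream_nal(nal_ref_idc, nal_unit_type, payload):
    """Start code prefix + header + escaped payload."""
    return (START_CODE
            + nal_header(nal_ref_idc, nal_unit_type)
            + add_emulation_prevention(payload))


# nal_unit_type 5 is a coded slice of an IDR picture; the payload is dummy data.
unit = byte_stream_nal(3, 5, b"\x88\x00\x00\x01\x2a")
print(unit.hex())
```

For packet-oriented transport the start code prefix would simply be omitted, since the packet boundaries already delimit each NAL unit.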

9) Rate Control

The size of the compressed bitstream varies with the content of the frames. For example, a slow-moving movie generates very little compressed data, whereas a fast-moving movie generates significantly more, for the same resolution and frame rate. This variation is unwelcome in most situations. The rate control mechanism keeps the output bitrate within the required limit.
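One very simple way to hold the bitrate is to adjust the quantization parameter (QP) after each frame: a frame that used too many bits makes the next one coarser, and a frame that used too few makes the next one finer. The sketch below is a toy controller under that assumption. The 10% tolerance, the target of 1 Mbit/s at 30 frames per second, and the frame sizes are all hypothetical; only the QP range of 0 to 51 comes from H.264.

```python
def update_qp(qp, frame_bits, target_bits, qp_min=0, qp_max=51):
    """Raise QP after an oversized frame, lower it after an undersized one."""
    if frame_bits > 1.1 * target_bits:
        qp += 1      # coarser quantization, fewer bits for the next frame
    elif frame_bits < 0.9 * target_bits:
        qp -= 1      # finer quantization, better quality for the next frame
    return max(qp_min, min(qp_max, qp))


target_bits = 1_000_000 // 30          # bit budget per frame
frame_sizes = [20000, 25000, 60000, 70000, 34000]   # made-up encoder output

qp = 26
history = []
for bits in frame_sizes:
    qp = update_qp(qp, bits, target_bits)
    history.append(qp)
print(history)
```

Production rate control also models buffer occupancy and frame complexity, but the feedback loop between produced bits and quantization strength is the same.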

These techniques, along with several others, help H.264 perform significantly better than any prior standard, under a wide variety of circumstances and in a wide variety of application environments. H.264 can often perform radically better than MPEG-2 video, typically obtaining the same quality at half the bit rate or less.

 

 

The top-level block diagram of an H.264 Encoder is shown in the figure.

Figure: H.264 Encoder Block Diagram

 

The encoding operation consists of a forward encoding path and an inverse decoding path. The forward encoding path (red lines) predicts each macroblock (MB) using Inter or Intra prediction, then transforms and quantizes (TQ) the residual. It then forwards the result to the Entropy Encoder and forms output packets in the Network Abstraction Layer (NAL). This article concentrates on the encoder side, since the encoder already contains most of the components of the decoder, except the entropy decoder.

 

   

The inverse path (blue lines) reconstructs the MB from the previously transformed and quantized data, using inverse quantization and inverse transform (ITQ) followed by the deblocking filter.
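The reason the encoder contains an inverse path can be shown with a one-dimensional toy model. Each sample is predicted from the previously reconstructed sample, not from the original one, so the encoder's reference is always identical to what the decoder will have. The step size, sample values, and function names below are assumptions made for this sketch, and the transform is left out for brevity.

```python
STEP = 4  # hypothetical quantizer step size


def quantize(residual):
    """Round residual / STEP to the nearest integer level."""
    level = (abs(residual) + STEP // 2) // STEP
    return level if residual >= 0 else -level


def encode(samples):
    """Forward path plus inverse path, one sample at a time."""
    levels = []
    reference = 0
    for sample in samples:
        level = quantize(sample - reference)   # forward path: predict, quantize
        levels.append(level)
        reference += level * STEP              # inverse path: rebuild the reference
    return levels


def decode(levels):
    """The decoder repeats only the inverse path."""
    output = []
    reference = 0
    for level in levels:
        reference += level * STEP
        output.append(reference)
    return output


samples = [100, 103, 110, 108]
levels = encode(samples)
decoded = decode(levels)
worst = max(abs(s - d) for s, d in zip(samples, decoded))
print(levels, decoded, worst)
```

Because prediction always starts from reconstructed data, the quantization error of one sample does not accumulate into the next: the error stays within half a step for every sample.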

 

© 2007 Tony Gladvin George