Video Compression Tutorial

Video Compression Technology

At its most basic level, compression analyzes an input video stream and discards information that is indiscernible to the viewer. Each remaining event is then assigned a code: commonly occurring events are assigned codes with few bits, and rare events are assigned codes with more bits. These steps are commonly called signal analysis, quantization and variable length encoding, respectively. There are four methods for compression: discrete cosine transform (DCT), vector quantization (VQ), fractal compression, and discrete wavelet transform (DWT).
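The variable length encoding step can be illustrated with a Huffman code, one common way of giving frequent events short codes. This is only a minimal sketch: the function name huffman_code and the sample event string are hypothetical, and real video codecs use predefined code tables or other entropy coders rather than this exact procedure.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code: frequent symbols get short codes, rare ones long codes."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes with 0 and 1.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Hypothetical stream of quantized events: A is common, D is rare.
events = "AAAAAAAABBBBCCD"
code = huffman_code(events)
encoded = "".join(code[s] for s in events)
print(code, len(encoded))
```

For this sample the encoded stream takes 25 bits, compared with 30 bits for a fixed 2-bit code per event.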

Discrete cosine transform is a lossy compression algorithm that samples an image at regular intervals, analyzes the frequency components present in the sample, and discards those frequencies which do not affect the image as the human eye perceives it. DCT is the basis of standards such as JPEG, MPEG, H.261, and H.263.
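The idea of analyzing frequency components and discarding some of them can be sketched in one dimension. The example below is an illustration under stated assumptions, not how any particular standard is implemented: the sample values are hypothetical, the transform is applied to a single row of 8 values, and high frequencies are simply zeroed, whereas codecs such as JPEG and MPEG work on two-dimensional blocks and quantize coefficients rather than dropping them outright.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct(): rebuild the samples from the coefficients."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

# Hypothetical row of 8 pixel values.
samples = [52, 55, 61, 66, 70, 61, 64, 73]
coeffs = dct(samples)

# Keep the 4 lowest-frequency coefficients and discard the rest (the lossy step).
kept = [c if k < 4 else 0.0 for k, c in enumerate(coeffs)]
approx = idct(kept)
max_error = max(abs(a - b) for a, b in zip(samples, approx))
print([round(v, 1) for v in approx], round(max_error, 2))
```

Only half of the coefficients need to be stored, at the cost of a small reconstruction error in each sample.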

Vector quantization is a lossy compression method that looks at an array of data instead of individual values. It can then generalize what it sees, compressing redundant data while retaining the original intent of the object or data stream.
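As a rough sketch of the idea, the data can be cut into small vectors and each vector replaced by the index of the closest entry in a codebook. The codebook and pixel values below are hypothetical and fixed by hand; a practical encoder would normally derive its codebook from training data.

```python
def nearest(codebook, vector):
    """Index of the codebook entry closest to vector (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], vector)),
    )

def vq_encode(data, codebook, size):
    """Split data into vectors of `size` values and map each to a codebook index."""
    blocks = [tuple(data[i:i + size]) for i in range(0, len(data), size)]
    return [nearest(codebook, b) for b in blocks]

def vq_decode(indices, codebook):
    """Rebuild an approximation of the data from the codebook entries."""
    return [v for i in indices for v in codebook[i]]

# Hypothetical 4-entry codebook of 2-value vectors (dark/bright combinations).
codebook = [(10, 10), (10, 200), (200, 200), (200, 10)]
pixels = [12, 9, 11, 198, 205, 197, 8, 14]

indices = vq_encode(pixels, codebook, 2)
restored = vq_decode(indices, codebook)
print(indices)   # one small index per pair of pixels
print(restored)  # close to, but not identical to, the input
```

Eight pixel values are reduced to four 2-bit indices, and the restored values differ slightly from the originals, which is where the loss occurs.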

Fractal compression is a form of VQ and is also lossy. Compression is performed by locating self-similar sections of an image, then using a fractal algorithm to generate the sections.

Like DCT, discrete wavelet transform mathematically transforms an image into frequency components. The process is performed on the entire image, unlike methods such as DCT that work on smaller pieces of the data. The result is a hierarchical representation of an image, where each layer represents a frequency band.
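The hierarchical, band-by-band result can be seen with the Haar wavelet, the simplest wavelet there is. The sketch below is an assumption-laden illustration: it works on one row of hypothetical values whose length is a power of two, and uses plain averages and differences, while image codecs apply two-dimensional transforms with other filters.

```python
def haar_step(signal):
    """One level: pairwise averages (low band) and pairwise differences (high band)."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high

def haar_decompose(signal):
    """Return the coarsest average and the detail bands, ordered coarse to fine."""
    bands = []
    current = list(signal)
    while len(current) > 1:
        current, detail = haar_step(current)
        bands.append(detail)
    return current, bands[::-1]

def haar_reconstruct(approx, bands):
    """Rebuild the signal by adding the detail bands back, coarse to fine."""
    current = list(approx)
    for detail in bands:
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.extend([a + d, a - d])
        current = rebuilt
    return current

# Hypothetical row of 8 pixel values (length must be a power of two here).
row = [9, 7, 3, 5, 6, 10, 2, 6]
approx, bands = haar_decompose(row)
print(approx)  # overall average of the row
for band in bands:
    print(band)  # each layer holds the detail for one frequency band
```

Dropping or coarsely quantizing the finest bands gives lossy compression, and keeping every band reproduces the row exactly.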

Compression Standards

MPEG stands for the Moving Picture Experts Group. MPEG is an ISO/IEC working group, established in 1988 to develop standards for digital audio and video formats. There are five MPEG standards being used or in development. Each compression standard was designed with a specific application and bit rate in mind, although MPEG compression scales well with increased bit rates. They include:

MPEG-1 – Designed for up to 1.5 Mbit/sec
Standard for the compression of moving pictures and audio. It was designed for CD-ROM video applications and is a popular standard for video on the Internet, transmitted as .mpg files. In addition, Audio Layer 3 of MPEG-1 is the most popular standard for digital compression of audio, known as MP3. MPEG-1 is also the compression standard for VideoCD, the most popular video distribution format throughout much of Asia.

MPEG-2 – Designed for between 1.5 and 15 Mbit/sec
Standard on which digital television set-top boxes and DVD compression are based. It is based on MPEG-1, but designed for the compression and transmission of digital broadcast television. The most significant enhancement from MPEG-1 is its ability to efficiently compress interlaced video. MPEG-2 scales well to HDTV resolution and bit rates, obviating the need for an MPEG-3.

MPEG-4 – Standard for multimedia and Web compression. MPEG-4 is based on object-based compression. Individual objects within a scene are tracked separately and compressed together to create an MPEG-4 file. This results in very efficient compression that scales from low bit rates to very high ones. It also allows developers to control objects in a scene independently, and therefore to introduce interactivity.

JPEG stands for Joint Photographic Experts Group. It is also an ISO/IEC working group, but works to build standards for continuous tone image coding. JPEG is a lossy compression technique for full-color or gray-scale images that exploits the fact that the human eye will not notice small color changes.

JPEG 2000 is an initiative that will provide an image coding system using compression techniques based on the use of wavelet technology.

DV is a high-resolution digital video format used with video cameras and camcorders. The standard uses DCT to compress the pixel data and is a form of lossy compression. The resulting video stream is transferred from the recording device via FireWire (IEEE 1394), a high-speed serial bus capable of transferring data at up to 50 MB/sec.

H.261 is an ITU standard designed for two-way communication over ISDN lines (video conferencing). It supports data rates that are multiples of 64 kbit/s. The algorithm is based on DCT, can be implemented in hardware or software, and uses both intraframe and interframe compression. H.261 supports CIF and QCIF resolutions.
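To give a feel for these data rates, the sketch below compares an uncompressed QCIF stream with a two-channel H.261 link. Several values are assumptions made for illustration only: the function names are hypothetical, the limit of 30 channels is the commonly cited range for the multiplier, and the 15 frames per second and 12 bits per pixel (4:2:0 sampling) are picked as plausible videoconferencing settings.

```python
CHANNEL_KBITS = 64  # H.261 data rates are multiples of 64 kbit/s

def h261_rate_kbits(p):
    """Channel bit rate for multiplier p (1..30 is assumed as the valid range)."""
    if not 1 <= p <= 30:
        raise ValueError("p must be between 1 and 30")
    return p * CHANNEL_KBITS

def raw_rate_kbits(width, height, fps, bits_per_pixel=12):
    """Uncompressed video bit rate in kbit/s (1 kbit = 1000 bits)."""
    return width * height * bits_per_pixel * fps / 1000

# QCIF is 176x144; frame rate and bits per pixel are assumptions.
qcif_raw = raw_rate_kbits(176, 144, fps=15)
link = h261_rate_kbits(2)
ratio = qcif_raw / link
print(f"raw: {qcif_raw:.2f} kbit/s, link: {link} kbit/s, ratio: {ratio:.1f}:1")
```

Under these assumptions the encoder has to shrink the stream by a factor of roughly 36 to fit the link.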

H.263 is based on H.261 with enhancements that improve video quality over modems. It supports CIF, QCIF, SQCIF, 4CIF and 16CIF resolutions.

DivX Compression


Lossy compression – reduces a file by permanently eliminating certain redundant information, so that even when the file is decompressed, only a part of the original information is still there.

ISO – International Organization for Standardization – a non-governmental organization that works to promote the development of standardization to facilitate the international exchange of goods and services and spur worldwide intellectual, scientific, technological and economic activity.

IEC – International Electrotechnical Commission – the international standards and assessment body for the fields of electrotechnology.

Codec – A video codec is software that can compress a video source (encoding) as well as decompress compressed video for playback (decoding).

CIF – Common Intermediate Format – a set of standard video formats used in videoconferencing, defined by their resolution. The original CIF (resolution 352×288) is also known as Full CIF (FCIF).

QCIF – Quarter CIF (resolution 176×144)
SQCIF – Sub quarter CIF (resolution 128×96)
4CIF – 4 x CIF (resolution 704×576)
16CIF – 16 x CIF (resolution 1408×1152)
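The naming of these formats follows their pixel counts relative to CIF, which a few lines of arithmetic make visible. This is only a sketch: the FORMATS table and the pixels helper are hypothetical names, and the CIF size of 352×288 is the commonly cited value.

```python
# Width and height of each format in pixels.
FORMATS = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),   # commonly cited CIF size, used here as the reference
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}

def pixels(name):
    """Number of pixels in one frame of the given format."""
    width, height = FORMATS[name]
    return width * height

# Pixel count of each format as a multiple of CIF.
relative = {name: pixels(name) / pixels("CIF") for name in FORMATS}
for name, factor in relative.items():
    print(f"{name}: {pixels(name)} pixels, {factor:.3f} x CIF")
```

QCIF, 4CIF and 16CIF come out at exactly a quarter, four times and sixteen times the CIF pixel count, while SQCIF is smaller than QCIF without being a simple fraction of CIF.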

