Saturday, 31 December 2016

Digital Television Video Codecs


Motion Picture Standards and Compression Techniques


Here is a list of the different video coding standards:

Ø  MPEG-1: A standard for audio and video compression. Provides video at a resolution of 352x240 at 30 frames per second, producing quality slightly below that of conventional VCR video. Includes the Layer 3 (MP3) audio compression format.

Ø  MPEG-2: An audio and video standard for broadcast-quality television. Offers resolutions of 720x480 and 1280x720 at 60 fps with CD-quality audio. Covers most TV standards, including HDTV. Its principal uses are DVDs, satellite TV services and digital cable TV signals. MPEG-2 compression can reduce a two-hour video to a few gigabytes. While decompressing an MPEG-2 data stream requires little computing power, encoding to MPEG-2 requires considerably more processing.

Ø  MPEG-3: Designed for HDTV, but abandoned when MPEG-2 was extended to cover HDTV.

Ø  MPEG-4: A graphics and video compression standard based on MPEG-1, MPEG-2 and Apple QuickTime technology. MPEG-4 files are smaller than JPEG or QuickTime files, so they are designed to transfer video and images over narrow bandwidth and to combine video with text, graphics and 2D or 3D animation layers.

Ø  MPEG-7: Formally called the Multimedia Content Description Interface, it supplies a set of tools for describing multimedia content. It is designed to be generic rather than aimed at a specific application.

Ø  MPEG-21: Provides a Rights Expression Language (REL) and a Rights Data Dictionary. Unlike the other MPEG standards, which define compression coding methods, it defines how content is described and the processes to access, search, store and protect the copyright of that content. The above are the base standards; each one has specific parts depending on the use.



Among these, the most important today are:

Ø  MPEG-2

Ø  MPEG-4, technically known as MPEG-4 H.264 / AVC.





MPEG-2 (H.262)




                  

MPEG-2 is a standard for “the generic coding of moving pictures and associated audio information”. It is an extension of the MPEG-1 international standard for digital compression of audio and video signals, created for broadcast formats at higher bit rates than MPEG-1. Initially developed for the transmission of compressed television programmes via broadcast, cable and satellite, and subsequently adopted for DVD production and some online delivery systems, it defines a combination of lossy video compression and lossy audio compression suited to current storage media such as DVD or Blu-ray, without a bandwidth restriction.

The main characteristics are:

Ø  New field- and frame-based prediction modes for interlaced scanning.

Ø  Improved quantization.

Ø  The MPEG-2 transport stream permits the multiplexing of multiple programs.

Ø  New intra variable-length coding (VLC) tables. In a variable-length code, the number of bits used for a symbol depends on its probability: more probable symbols are assigned shorter codewords. It also offers stronger resilience against transmission errors.

Ø  Uses the discrete cosine transform algorithm and motion compensation techniques for compression.

Ø  Provides multichannel surround sound coding. MPEG-2 contains different standard parts to suit different needs, and also defines various levels and profiles.
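The idea behind variable-length coding can be illustrated with a toy prefix-free code. The codewords below are purely illustrative, not the actual MPEG-2 VLC tables:

```python
# A toy variable-length code: more probable symbols get shorter codewords.
# These codewords are illustrative only, not the real MPEG-2 VLC tables.
vlc_table = {
    'a': '0',      # most probable symbol -> 1 bit
    'b': '10',
    'c': '110',
    'd': '111',    # least probable symbols -> 3 bits
}

def vlc_encode(symbols):
    """Concatenate the prefix-free codewords for a symbol sequence."""
    return ''.join(vlc_table[s] for s in symbols)

# A sequence dominated by 'a' compresses well: 'aabd' -> '0010111' (7 bits
# instead of 8 with a fixed 2-bit code).
```

Because the code is prefix-free, the decoder can split the bit stream back into symbols without any separators.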



MPEG-2 FUNDAMENTALS

         

A TV camera typically generates 25 pictures per second, i.e. a frame rate of 25 Hz. To convert this to digital television, the pictures must be digitized so that they can be processed by a computer. An image is divided into two different signals: luminance (Y) and chrominance (UV). Each pixel has one luma value and two chrominance components. The television colour signal Red-Green-Blue (RGB) can be represented with luma and chrominance values. The chrominance bandwidth can be reduced relative to the luminance signal without a noticeable effect on picture quality.
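As a sketch of this colour-space separation, the following converts an R'G'B' pixel to one luma and two chroma values using the ITU-R BT.601 full-range matrix (one common choice; broadcast systems often use a limited range instead):

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit R'G'B' to Y'CbCr (ITU-R BT.601, full range).
    Y carries brightness; Cb and Cr carry colour difference, centred on 128."""
    y  =        0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# A pure grey pixel has no colour information: Cb and Cr sit at the
# neutral value 128, e.g. rgb_to_ycbcr(255, 255, 255) -> (255.0, 128.0, 128.0)
```

Because the eye is less sensitive to Cb and Cr than to Y, these two planes are the ones whose bandwidth is reduced.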



An image can also be described with a sampling notation (4:2:2, 4:2:0). These are types of chroma subsampling used in image compression, storing more luminance detail than colour detail. The first number refers to the luminance part of the signal, the others to the chroma. In 4:2:2, luminance is sampled four times for every two chroma samples on each line. Because the human eye is more sensitive to brightness than to colour, chroma can be sampled less often than luminance without any perceptible difference. The signals are also partitioned into macroblocks, the basic coding unit within an image. A macroblock is made up of several blocks of pixels; depending on the codec, the blocks are larger or smaller, normally a multiple of 4. MPEG-2 coding creates the data stream from three different frame types: intra-coded frames (I-frames), predictive-coded frames (P-frames) and bidirectionally-predictive-coded frames (B-frames), organised into a “GOP structure” (Group of Pictures structure).
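A minimal sketch of 4:2:0 subsampling: each chroma plane is halved in both dimensions by averaging 2x2 neighbourhoods. Averaging is one simple filter choice; real encoders may use other filters:

```python
def subsample_420(plane):
    """Down-sample one chroma plane 2x horizontally and 2x vertically (4:2:0)
    by averaging each 2x2 block. `plane` is a list of equal-length rows."""
    out = []
    for y in range(0, len(plane) - 1, 2):
        row = []
        for x in range(0, len(plane[0]) - 1, 2):
            row.append((plane[y][x] + plane[y][x + 1]
                        + plane[y + 1][x] + plane[y + 1][x + 1]) / 4)
        out.append(row)
    return out

# A 2x2 chroma region collapses to a single sample:
# subsample_420([[10, 30], [50, 70]]) -> [[40.0]]
```

Applied to both the Cb and Cr planes, this keeps only a quarter of the original chroma samples while the luma plane stays at full resolution.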

Ø  I-frame: Coded without reference to other pictures; it is compressed directly from an original frame.

Ø  P-frame: Uses the previous I-frame or P-frame for motion compensation. Each block can be predicted or intra-coded.

Ø  B-frame: Uses the previous and following I- or P-pictures and offers the highest compression. A block in a B-picture can be predicted in a forward, backward or bidirectional way, or intra-coded.

A typical GOP structure could be: B1 B2 I3 B4 B5 P6 B7 B8 P9 B10 B11 P12. I-frames code spatial redundancy, while B-frames and P-frames code temporal redundancy. MPEG-2 also supports interlaced scanning, a method of displaying an image whose aim is to make better use of the bandwidth and to reduce flicker by showing twice as many pictures per second at half the frame rate, for example 50 fields per second at a frame rate of 25 Hz. The scan divides a video frame into two fields, separating the horizontal lines into odd lines and even lines, which enhances the viewer's perception of motion. Depending on the number of lines and the frame rate, the systems are divided into:

Ø  PAL / SECAM: 25 frames per second, 625 lines per frame. Used in Europe.

Ø  NTSC: 30 frames per second, 525 lines per frame. Used in North America.
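The GOP structure described above can be sketched as a simple pattern generator, parameterised by the GOP length N and the anchor-frame spacing M (shown here in display order; real encoders transmit the frames in a reordered coding order):

```python
def gop_pattern(n=12, m=3):
    """Build a display-order frame-type pattern for a GOP of length `n`
    with an anchor frame (I or P) every `m` positions; the remaining
    positions are B-frames. Sketch only: parameters N and M are the
    conventional names, not part of the bitstream syntax."""
    pattern = []
    for i in range(n):
        if i == 0:
            pattern.append('I')       # one intra frame starts the GOP
        elif i % m == 0:
            pattern.append('P')       # forward-predicted anchor
        else:
            pattern.append('B')       # bidirectionally predicted
    return ''.join(pattern)

# gop_pattern(12, 3) -> 'IBBPBBPBBPBB'
```

Only the single I-frame is decodable on its own, which is why channel changes and seeking snap to GOP boundaries.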

MPEG-2 encoding is organised into profiles. A profile is a "defined subset of the syntax of the specification". Each profile defines a range of settings for the encoder options. Since most settings are not available or useful in every application, the profiles are designed to match consumer requirements: a computer, a television and a mobile phone each need hardware suited to their use, and each can be matched to a particular profile. An encoder is then needed to perform the compression.





MPEG-2 COMPRESSION BASICS




Spatial Redundancy:

A compression technique which consists of grouping pixels with similar properties to minimize the duplication of data within each frame.

It involves analysing a picture to select and suppress redundant information, for instance removing frequencies that humans cannot perceive. To achieve this, a mathematical tool is employed: the Discrete Cosine Transform (DCT).





Intra Frame DCT Coding:

The Discrete Cosine Transform (DCT) is a transform related to the Discrete Fourier Transform, with many applications in science and engineering, but it is chiefly applied in image compression algorithms. The DCT is employed to decrease the spatial redundancy of the signal. It has a good energy-compaction property, concentrating most of the information in a few transform coefficients. The signal is converted to a new domain in which only a small number of coefficients contain most of the information, while the rest have negligible values, so in the new domain the signal has a much more compact representation. The transform is independent of the data: the algorithm is the same regardless of the input. The transform itself is essentially lossless (any loss is negligible); the compression comes from quantizing the coefficients afterwards. Because the DCT interprets the coefficients as frequencies, it can extract the maximum compression benefit. The result of applying the DCT to an 8x8 block is an 8x8 array of coefficients ordered by frequency:

Low frequencies correspond to the components the human eye is most sensitive to.

High frequencies correspond to components that are less perceptible.
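As a rough sketch of the transform, the following is a naive (unoptimised) implementation of the 8x8 2-D DCT-II. For a flat block, all the energy ends up in the single top-left DC coefficient, which is the energy-compaction property described above:

```python
import math

def dct_2d(block):
    """Naive 8x8 2-D DCT-II, the transform used for intra coding.
    `block` is an 8x8 list of pixel values; returns an 8x8 array of
    coefficients with the DC term at [0][0]."""
    n = 8
    def c(k):  # orthonormal scaling factor
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out

# For a uniform block of value 128, the DC coefficient is 128 * 8 = 1024
# and every other coefficient is (numerically) zero.
```

Real codecs compute this with fast separable butterflies rather than the quadruple loop shown here, but the output is the same.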



Temporal Redundancy:

Temporal compression is achieved by looking at a succession of pictures.

Consider an object moving across an otherwise static scene. Until something moves, the picture already contains all the information required, so it is not necessary to encode the whole picture again, only the part that contains the movement, since the rest of the scene is unaffected by the moving object and is the same as in the initial picture. The technique that determines how much movement occurs between two successive pictures is called motion-compensated prediction.

A picture therefore cannot be treated in isolation: it is usually constructed by prediction from a previous picture, and may in turn be used to predict the next one.







Motion Compensated Prediction:

It identifies the displacement of a given macroblock in the current frame with respect to the position it had in the reference frame.

The steps are:

Ø  Search the reference frame for the macroblock of the frame being encoded.

Ø  If an identical macroblock is found, only the corresponding motion vector is encoded.

Ø  Otherwise the most similar macroblock (INTER) is chosen, and the motion vector is encoded together with the prediction error.

Ø  If there is no similar block (INTRA), the block is encoded using only spatial redundancy.
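The search described above can be sketched as an exhaustive block-matching loop using the sum of absolute differences (SAD) as the similarity measure. Real encoders use much faster search strategies, so this is a minimal illustration only:

```python
def best_match(ref, cur_block, top, left, search=4):
    """Exhaustive block matching: find the motion vector for `cur_block`
    (a 2-D list of pixels) by searching a +/- `search` pixel window around
    position (top, left) in the reference frame `ref`.
    Returns ((dy, dx), sad); SAD == 0 means a perfect match."""
    bh, bw = len(cur_block), len(cur_block[0])
    best = (None, float('inf'))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bh > len(ref) or x + bw > len(ref[0]):
                continue  # candidate falls outside the reference frame
            sad = sum(abs(ref[y + i][x + j] - cur_block[i][j])
                      for i in range(bh) for j in range(bw))
            if sad < best[1]:
                best = ((dy, dx), sad)
    return best
```

If the best SAD is zero, only the motion vector needs to be sent; if it is small, the vector plus a residual is cheaper than intra coding; if no candidate is close, the block is intra-coded.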



H.264 / MPEG-4 AVC


                  

H.264, or MPEG-4 Part 10, defines a high-quality video compression codec developed by the Video Coding Experts Group (VCEG) and the Moving Picture Experts Group (MPEG) to create a standard capable of providing good image quality at substantially lower bit rates than previous video standards such as MPEG-2, without increasing the complexity of the design so much that it would be impractical and expensive to implement. Another goal proposed by its creators was to broaden its scope, i.e. to allow the standard to be used in a wide variety of networks and applications, at both high and low resolution, for DVD storage, etc.





In December 2001 the Joint Video Team (JVT), consisting of experts from VCEG and MPEG, was formed; it developed the standard, which was finalised in 2003. The ISO/IEC (International Organization for Standardization / International Electrotechnical Commission) and the ITU-T (International Telecommunication Union Telecommunication Standardization Sector) joined the project. The former is responsible for standards focused on manufacturing, while the latter focuses mainly on tariff issues. The ITU-T planned to adopt the standard under the name ITU-T H.264, while ISO/IEC wanted to call it MPEG-4 Part 10 Advanced Video Coding (AVC), hence the dual name of the standard. To define the first version, they began by reviewing the algorithms and techniques of the previous standards, modifying them or creating new ones where necessary:

Ø  The DCT structure combined with the motion compensation of previous versions was efficient enough, so there was no need to make fundamental changes to its structure.

Ø  Scalable Video Coding: an important advance because it allows each user, regardless of the limitations of the device, to receive the best possible quality from a single transmitted signal. This is possible because a single compressed video stream is provided and each user takes from it only what is needed to obtain the best video quality within the technical limitations of the receiver.



MPEG-4 uses more complex algorithms with better performance, giving a marked quality improvement: it provides a higher compression ratio than MPEG-2 for equivalent quality.



MAIN ADVANTAGES

                  

For MPEG-4 AVC, the most important features are:

1. Provides almost DVD-quality video but uses a lower bit rate, so it is feasible to transmit digitised video streams over a LAN, and also over a WAN, where bandwidth is more critical and hard to guarantee.

2. Dramatically advances audio and video compression, enabling the distribution of content and services from low bandwidths to high-definition quality across broadcast, broadband, wireless and packaged media.

3. Provides a standardized framework for many other forms of media (including text, pictures, animation, and 2D and 3D objects) which can be presented in interactive and personalized media experiences.

4. Supports the diversity of the future content market.                                   

5. Offers a variety of so-called “profiles”, tool sets drawn from the toolbox that are useful for specific applications, such as the Simple Visual or Advanced Simple Visual profiles in audio-video coding, so users need only implement the profiles that support the functionality they require.

6. Uses the DCT algorithm combined with motion compensation. This shows that MPEG-4 aims to be a content-based representation standard independent of any specific coding technology, bit rate or type of scene content, and at the same time it shows why and how MPEG-4 differs from previous moving-picture coding standards.

7. Low latency.



The most important and relevant advantages in practice are:

1. Reduces the amount of storage needed

2. Increases the amount of time video can be stored

3. Reduces the network bandwidth used by the surveillance system
