MPEG-2 and H.264/AVC
Jul 1, 2008 12:00 PM, By Steve Mullen
What is the role of GOP length in HD recording technology?
The introduction of recording equipment employing H.264, with Fidelity Range Extensions (FRExt), has created a viable alternative to equipment using long-GOP MPEG-2. The MPEG-4, Part 10, H.264/AVC codec is currently available in two recording implementations: AVC-Intra (Panasonic) and AVCHD (Canon, Panasonic and Sony).
AVC-Intra, for professional use, employs intraframe encoding, while AVCHD, currently for consumer use, employs interframe encoding. Although not versions of the same codec, both are marketed with similar claims of being twice as efficient as MPEG-2. To check this claim's validity, we will examine both MPEG-2 and H.264 encoding.
MPEG-2: frame encoding
After appropriate filtering to reduce noise, a picture is partitioned into 16 × 16 pixel macroblocks. A discrete cosine transform (DCT) is applied to each of the four 8 × 8 luminance blocks within a macroblock. The DCT reorganizes each block's information by spatial frequency, so that coarse detail, to which the human eye is most sensitive, is preserved, while very fine detail is discarded first.
Compression occurs through quantization of coefficients from the DCT. Quantizing reduces the number of bits representing each coefficient. After quantization, further data reduction is applied using variable length coding (VLC) and run length coding (RLC). The result is an MPEG-2 I-frame.
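The I-frame steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a single uniform quantizer step stands in for the frequency-weighted quantization matrix and scale factor a real MPEG-2 encoder would use, (run, level) pairs stand in for the actual VLC tables, and all function names are hypothetical.

```python
import math

N = 8  # MPEG-2 applies the DCT to 8 x 8 blocks

def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of an 8 x 8 block (slow but readable)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * s
    return out

def quantize(coeffs, step):
    """Assumption: one uniform step for every coefficient."""
    return [[int(round(value / step)) for value in row] for row in coeffs]

def zigzag(block):
    """Read the block diagonal by diagonal, low frequencies first."""
    order = sorted(
        ((i, j) for i in range(N) for j in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [block[i][j] for i, j in order]

def run_length(seq):
    """Emit (zero_run, level) pairs, then an end-of-block marker."""
    pairs, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the marker
    return pairs

def encode_block(block, step=16):
    return run_length(zigzag(quantize(dct_2d(block), step)))
```

A perfectly flat block, for example, collapses to a single DC coefficient followed by the end-of-block marker, which is where most of the data reduction in smooth picture areas comes from.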
These steps are essentially the same for MPEG-2 and DVC (DV, DVCAM and DVCPRO) encoders.
H.264: slice encoding
H.264 intra-encoding employs techniques that reduce spatial redundancies. Redundancies exist because some image areas are naturally correlated with other image areas.
The process begins by partitioning an incoming picture into 16 × 16 pixel macroblocks. Each macroblock is further partitioned into sixteen 4 × 4 pixel submacroblocks. The encoder uses the former block size for gross detail and the latter for fine detail.
Previous (reference) blocks are used as a source of reference pixels. These blocks will have already been encoded and decoded. (H.264 supports an adaptive deblocking filter that attenuates compression blocking artifacts and operates during encoding and decoding.)
Reference pixels are located at the left and upper boundaries between previous blocks and the current block. Predictions are made for 4 × 4 or 8 × 8 blocks using nine prediction modes. (See Figure 1.) The 16 × 16 macroblock predictions are made using four prediction modes. (See Figure 2.)
The mode that best predicts the content of the current block is selected as the current mode. This mode is used to generate a predicted block from the reference pixels. (See Figure 3.)
Figure 3
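To make the mode decision concrete, here is a minimal Python sketch of intra prediction for one 4 × 4 block. It is not a full implementation: it covers only three of the nine 4 × 4 modes (vertical, horizontal and DC), it assumes the sum of absolute differences (SAD) as the selection cost, which is a common choice but not something the standard mandates, and the function names are hypothetical.

```python
def predict_vertical(top, left):
    """Copy the four reference pixels above the block down each column."""
    return [list(top) for _ in range(4)]

def predict_horizontal(top, left):
    """Copy the four reference pixels to the left across each row."""
    return [[left[r]] * 4 for r in range(4)]

def predict_dc(top, left):
    """Fill the block with the rounded mean of the eight reference pixels."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]

MODES = {
    "vertical": predict_vertical,
    "horizontal": predict_horizontal,
    "dc": predict_dc,
}

def sad(a, b):
    """Sum of absolute differences between two 4 x 4 blocks."""
    return sum(abs(a[r][c] - b[r][c]) for r in range(4) for c in range(4))

def best_mode(block, top, left):
    """Return (mode name, cost, residual block) for the cheapest mode."""
    scored = [(sad(block, fn(top, left)), name) for name, fn in MODES.items()]
    cost, name = min(scored)
    pred = MODES[name](top, left)
    residual = [[block[r][c] - pred[r][c] for c in range(4)] for r in range(4)]
    return name, cost, residual
```

When the current block simply continues the columns of the block above it, the vertical mode predicts it exactly and the residual is all zeros, so almost nothing remains to be transformed and coded.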
A residual (error) block, computed as the difference between the predicted block and the current block, is then integer transformed. H.264 employs a 4 × 4 or 8 × 8 integer transform rather than a DCT. (Integer transforms prevent mismatches between encoders and decoders.) Next, the results from the transform are quantized and entropy coded.
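The following Python sketch shows why an integer transform avoids encoder/decoder mismatch. It applies the 4 × 4 forward core transform matrix of H.264 to a residual block using integer arithmetic only. The inverse shown here is an exact rational inverse used purely as a mathematical check; it is not the standard's inverse transform, which uses its own integer matrix and folds the scaling into quantization. Function names are hypothetical.

```python
from fractions import Fraction

# Forward core transform matrix for 4 x 4 residual blocks
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

# Squared lengths of the rows of CF (its rows are mutually orthogonal)
NORMS = [4, 10, 4, 10]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward_transform(x):
    """Y = CF * X * CF^T, computed with integers only."""
    return matmul(matmul(CF, x), transpose(CF))

def inverse_exact(y):
    """Exact rational inverse, for checking only (not the H.264 inverse)."""
    scaled = [[Fraction(y[i][j], NORMS[i] * NORMS[j]) for j in range(4)]
              for i in range(4)]
    return matmul(matmul(transpose(CF), scaled), CF)
```

Because every intermediate value is an integer, two independent implementations produce bit-identical coefficients, whereas floating-point DCT implementations may round differently and let encoder and decoder drift apart.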
While MPEG-2 uses variable length coding (VLC) to further reduce data, H.264 employs one of two entropy coders: context-adaptive binary arithmetic coding (CABAC) or context-adaptive variable-length coding (CAVLC). Both are supported by the Main, High-10 and High-422 Profiles.
When grouped together, intra-encoded blocks yield an I-slice. A picture — a field or frame — is encoded as one or more slices, up to a maximum of eight slices.
AVC-Intra codecs
Panasonic offers two AVC-Intra codecs as alternatives to its DVCPRO HD codec on selected P2-based devices: the 50Mb/s High-10 Profile (Hi10P) codec and the 100Mb/s High-422 Profile (H422P) codec.
Several caveats apply to Panasonic's AVC-Intra. First, as shown in Table 1, these codecs encode dissimilar numbers of luma and chroma samples — with different sample widths. Nevertheless, after equalizing these data, the H.264 block prediction tools provide nearly twice (≈1.87X) the efficiency of the DVC-based DVCPRO HD codec — or I-frame-only MPEG-2.
Second, the claim of AVC-Intra's 2X greater efficiency does not apply to long-GOP MPEG-2. Long-GOP MPEG-2 is about 160 percent more efficient than intra H.264, which is why 1920 × 1080, 4:2:2, 8-bit video can be encoded using XDCAM HD 422 at only 50Mb/s.
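For a sense of scale, the short Python calculation below estimates the uncompressed bit rate of the 1920 × 1080, 4:2:2, 8-bit format mentioned above and the compression ratio implied by a 50Mb/s recording. The frame rate of 30000/1001 (29.97) frames per second is an assumption, since none is stated here, and only active picture samples are counted.

```python
def raw_bit_rate(width, height, chroma, bit_depth, fps):
    """Uncompressed video bit rate in bits per second (active picture only)."""
    # Average samples per pixel: one luma sample plus the shared chroma samples
    samples_per_pixel = {"4:2:0": 1.5, "4:2:2": 2.0, "4:4:4": 3.0}[chroma]
    return width * height * samples_per_pixel * bit_depth * fps

FPS = 30000 / 1001            # assumed frame rate (29.97 frames per second)
RECORDED_BPS = 50e6           # the 50Mb/s recording rate

raw_bps = raw_bit_rate(1920, 1080, "4:2:2", 8, FPS)
ratio = raw_bps / RECORDED_BPS

print(f"raw: {raw_bps / 1e6:.0f} Mb/s, compression ratio: {ratio:.1f}:1")
```

Under these assumptions the source is roughly 994Mb/s, so a 50Mb/s recording corresponds to a compression ratio of about 20:1.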
MPEG-2: P- and B-frame encoding
A newly encoded I-frame is decoded to regenerate an initial picture, which is divided into 16 × 16 pixel macroblocks. Starting with the upper, leftmost macroblock, a search is made to determine its location in the next picture. A correlation technique measures how closely the macroblock matches each searched macroblock.
In a methodical pattern, the macroblock is moved in all directions at increasing distances from its origin. The displacement — direction and distance moved when a match is made — becomes the macroblock's motion vector. This process is repeated for each macroblock until all motion vectors have been computed and saved in a motion compensation block. Next, a predicted frame is constructed using these vectors applied to the initial picture.
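The search described above can be sketched as a simple block-matching routine in Python. Several assumptions apply: the sketch follows this article's description by taking a macroblock from the initial picture and looking for it in the next picture, it uses an exhaustive search over a small window rather than any particular encoder's search pattern, it uses the sum of absolute differences (SAD) as the match measure, and it prefers the smaller displacement when two positions match equally well. Function and parameter names are hypothetical.

```python
def sad_at(ref, cur, bx, by, dx, dy, size):
    """SAD between the block at (bx, by) in ref and its displaced copy in cur."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(ref[by + r][bx + c] - cur[by + dy + r][bx + dx + c])
    return total

def find_motion_vector(ref, cur, bx, by, size=16, search=7):
    """Return ((dx, dy), cost) for the best match within +/- search pixels."""
    height, width = len(cur), len(cur[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            # Skip candidate positions that fall outside the picture
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue
            cost = sad_at(ref, cur, bx, by, dx, dy, size)
            key = (cost, abs(dx) + abs(dy))  # ties go to the shorter vector
            if best is None or key < best[0]:
                best = (key, (dx, dy))
    return best[1], best[0][0]
```

Calling `find_motion_vector` once per macroblock yields the full set of motion vectors from which a predicted frame can be assembled. Practical encoders replace the exhaustive loop with faster search patterns, because its cost grows with the square of the search range.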