MPEG-2 to H.264 transcoding: Why and how?
Dec 1, 2006 12:00 PM, by Santhana Krishnamachari and Kyeong Ho Yang
Transcoding challenges
This article primarily focuses on the issues and challenges associated with format transcoding from MPEG-2 to H.264. Although MPEG-2 and H.264 use similar techniques of motion compensation, transformation, quantization and entropy coding, there are several basic differences between the two standards that make the transcoding operation challenging.
Several new features available in H.264, such as multiple reference frames, smaller block shapes and spatial intra prediction, have no corresponding information in the MPEG-2 bit stream. The use of spatial prediction in H.264 I-slices makes transcoding an MPEG-2 I-frame substantially more complex than the simple re-quantization techniques used by MPEG-2 rate transcoders. MPEG-2 to H.264 transcoders are expected to evolve progressively through the three approaches presented below.
Decode and re-encode
The simplest approach to transcoding is to completely decode the MPEG-2 bit stream and then re-encode it with an H.264 encoder. The decode operation can be performed either externally or as a part of the H.264 encoder. System issues, such as handling SCTE-35 digital program insertion (DPI) messages, will require that the decode and encode operations be tightly coupled.
The quality of transcoding with this simple approach will not be high. Figure 3 shows a comparison between direct encoding and transcoding. The figure shows the PSNR (peak signal-to-noise ratio, a logarithmic measure derived from the mean square error between the input and the decoded output) values computed at different bit rates. The PSNR numbers are obtained by averaging the results over 18 different sequences of varying content type and complexity. The top plot shows the performance of direct encoding using an H.264 encoder. The bottom plot shows the performance of transcoding, where the video is originally coded with MPEG-2 at 4Mb/s, decoded and then re-encoded with the same encoder used for direct encoding. Transcoding can result in up to a 20 percent loss in compression efficiency.
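As a reference point for the PSNR values discussed above, the following is a minimal Python sketch of the usual calculation for 8-bit video. The function name `psnr` and the representation of a picture as a flat list of samples are assumptions made for illustration; they are not taken from the measurement setup behind Figure 3.

```python
import math


def psnr(reference, decoded, peak=255):
    """Return the PSNR in dB between two equal-length sequences of samples.

    `peak` is the maximum sample value (255 for 8-bit video).
    """
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean square error between the input and the decoded output.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical pictures
    return 10 * math.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, halving the mean square error raises the PSNR by about 3dB, which is why small PSNR gaps between two rate curves can correspond to noticeable differences in bit rate.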
Decode and information reuse
Similar to the previous approach, the incoming MPEG-2 stream is decoded and then re-encoded using an H.264 encoder. However, here the relevant information available from the MPEG-2 bit stream is reused.
Although there are significant differences between MPEG-2 and H.264, including block shapes for motion compensation, block sizes for transformation and motion search ranges, there is still useful information available in the input MPEG-2 bit stream that can be exploited by the H.264 encoder to improve transcoding quality and reduce computational complexity.
Reusing the picture type (I, P or B) information from the MPEG-2 bit stream can provide substantial improvement in transcoding quality. Because MPEG-2 encoders code I- and P-pictures at a higher quality than B-pictures, better transcoding efficiency can be achieved if the H.264 encoder can align the picture type with that of the input stream.
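The picture type alignment described above can be sketched in a few lines of Python. The `picture_coding_type` codes (1 = I, 2 = P, 3 = B) come from the MPEG-2 picture header; the function name `h264_slice_type_hints` and the idea of passing the result to the encoder as per-picture hints are hypothetical, since real encoders expose this control in different ways.

```python
# MPEG-2 picture_coding_type values as carried in the picture header.
MPEG2_PICTURE_TYPES = {1: "I", 2: "P", 3: "B"}


def h264_slice_type_hints(mpeg2_coding_types):
    """Map each MPEG-2 picture_coding_type to the slice type that the
    H.264 encoder is asked to use for the same picture (hypothetical hint)."""
    hints = []
    for index, code in enumerate(mpeg2_coding_types):
        if code not in MPEG2_PICTURE_TYPES:
            raise ValueError(
                f"unsupported picture_coding_type {code} at picture {index}"
            )
        hints.append(MPEG2_PICTURE_TYPES[code])
    return hints
```

For example, `h264_slice_type_hints([1, 2, 3, 3])` yields `["I", "P", "B", "B"]`, so pictures that MPEG-2 coded at lower quality as B-pictures are not promoted to reference pictures in the H.264 stream.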
Other information, such as motion vector values and coding mode decisions, can be reused to reduce the complexity of transcoding. For bit allocation and rate control decisions, the H.264 encoder can use the quantizer values and the number of bits used to encode a given picture, both obtained from the input MPEG-2 stream. Reusing information in this way is similar to two-pass encoding, where the results of the first pass of encoding are used to drive the decisions in the second pass.
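Two of these reuse ideas can be sketched as follows, under stated assumptions. The first function converts an MPEG-2 motion vector, which has half-sample precision, into quarter-sample units so that it can seed a small H.264 refinement search; field/frame vector handling is ignored. The second splits a bit budget across pictures in proportion to a complexity estimate of bits times average quantizer, a measure in the style of the MPEG-2 Test Model 5 rate control. Both function names and the choice of complexity measure are illustrative assumptions, not the authors' algorithm.

```python
def seed_motion_vector(mpeg2_mv_half_pel):
    """Convert an MPEG-2 vector (half-sample units) into an H.264
    search seed (quarter-sample units)."""
    dx, dy = mpeg2_mv_half_pel
    return (2 * dx, 2 * dy)


def allocate_bits(mpeg2_stats, total_target_bits):
    """Split a bit budget across pictures in proportion to complexity.

    mpeg2_stats: list of (bits_used, average_quantizer_scale) per picture,
    as measured while decoding the input MPEG-2 stream.
    """
    if not mpeg2_stats:
        return []
    # Assumed complexity measure: bits spent times average quantizer.
    complexities = [bits * quant for bits, quant in mpeg2_stats]
    total = sum(complexities)
    if total == 0:
        # No usable statistics: fall back to an equal split.
        return [total_target_bits / len(mpeg2_stats)] * len(mpeg2_stats)
    return [total_target_bits * c / total for c in complexities]
```

In this sketch the MPEG-2 stream plays the role of the first pass: a picture that cost many bits at a coarse quantizer is treated as complex and receives a larger share of the H.264 budget.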
Transform domain processing
Transform domain processing is commonly used in MPEG-2 bit rate transcoding applications, mainly to reduce computational complexity and to avoid the loss of accuracy caused by repeated DCT and inverse DCT operations.
With the use of integer transforms in H.264, there is no such accuracy penalty from repeated forward and inverse transform operations. Performing complete transcoding in the transform domain may be unrealistic because of the substantial differences between MPEG-2 and H.264. However, computational complexity can be reduced in certain operations, such as I-slice transcoding, by combining the inverse DCT operation of MPEG-2 with the forward integer transform of H.264.
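Why the two transforms can be combined is easiest to see in matrix form. The Python sketch below assumes an orthonormal 8x8 DCT for MPEG-2 and the 4x4 H.264 core transform applied to the four quadrants of the block; because both steps are linear, they collapse into a single precomputed matrix `S`. It ignores rounding and clipping in the MPEG-2 decoder, H.264 intra prediction, and the scaling that H.264 folds into quantization, and all function names are illustrative.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix D, so that coefficients X = D x D^T."""
    return [[(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
             * math.cos((2 * j + 1) * k * math.pi / (2 * n))
             for j in range(n)] for k in range(n)]


# H.264 4x4 forward core transform matrix.
C4 = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]


def block_diag2(c):
    """Place c twice on the diagonal, so K x K^T transforms each quadrant."""
    n = len(c)
    out = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            out[i][j] = c[i][j]
            out[n + i][n + j] = c[i][j]
    return out


D = dct_matrix()
K = block_diag2(C4)
S = matmul(K, transpose(D))  # combined kernel, computed once


def two_step_transform(X):
    """Inverse 8x8 DCT to pixels, then four 4x4 forward core transforms."""
    pixels = matmul(matmul(transpose(D), X), D)
    return matmul(matmul(K, pixels), transpose(K))


def combined_transform(X):
    """Same result in one step: Y = S X S^T, with no pixel-domain stage."""
    return matmul(matmul(S, X), transpose(S))
```

Both functions return an 8x8 array whose quadrants hold the four 4x4 coefficient blocks; the combined form saves the intermediate pixel-domain pass, which is the kind of complexity reduction referred to above.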
Conclusion
The coexistence of various coding standards, and the requirement for multiple resolutions and frame rates in emerging applications, will drive the need for efficient, high-density transcoding. Transcoders are expected to progress from simple decode/re-encode devices to more complex integrated systems that reuse information in the input bit stream and achieve higher density by employing selective transform domain processing techniques.
Santhana Krishnamachari is vice president of advanced engineering and Kyeong Ho Yang is technical manager of the video algorithms group for EGT.