Long-GOP editing
Jul 1, 2009 12:00 PM, By Steve Mullen
Today's enhanced NLEs support various types of MPEG-2, resulting in a faster workflow.
Product presentations at NAB are a great way to rest your burning feet while looking like you are absorbing significant information. At the 2009 NAB Show, in a state somewhere before sleep, I heard a presenter tie long-GOP formats to generational loss caused by adding titles to a production. Say what?
While the industry may have moved past a pronouncement made in 2004 that “…MPEG-2 ain't supposed to be edited,” the NAB 2009 statement sadly matched comments made over the last five years — often by Hollywood influencers — that reveal a profound misunderstanding about how modern editing tools work.
A few years ago, negative statements about long GOP were simply veiled attacks on MPEG-2. But, now with AVCHD and AVCCAM — both long-GOP versions of MPEG-4 — claims about GOP length and editing cover a wide range of formats and products. By addressing the myths surrounding MPEG-2, we can hopefully prevent the same myths resurfacing with long-GOP H.264/AVC.
Performance concerns
Some myths about interframe encoding may result from explaining, in simple terms, how encoding works. A key frame often was described as though it were a photograph, with no mention that an I-frame itself is highly compressed. Using terms that better fit a description of delta-modulation, subsequent frames were described as containing differences from the initial frame.
It's not surprising that those who conceive of computer-based editing as nothing more than the replacement of VTRs with hard disks were sure the need to keep “rewinding” the disk files to find I-frames would make jog and shuttle sluggish and, therefore, a serious hindrance to editing.
In reality, NLEs access a disk only to replenish large buffers held in RAM. Moreover, B and P frames are stored within a GOP in a series that facilitates decoding. (For a brief review of interframe and intraframe MPEG-2 and MPEG-4 encoding, see the “MPEG-2 and H.264/AVC” article in the July 2008 issue.)
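To make the storage-order point concrete, here is a minimal Python sketch of how frames in display order can be rearranged into the order a decoder needs them. It assumes a simple closed GOP that starts with an I-frame; the function name and the "IBBPBBP" pattern are illustrative only and are not taken from any particular encoder.

```python
def display_to_coded_order(gop):
    """Reorder a GOP from display order into coded (decode) order.

    A B frame needs the anchor (I or P) frame that follows it in display
    order, so each anchor is moved ahead of the B frames that precede it.
    Assumes a closed GOP that begins with an I frame.
    """
    coded = []
    pending_b = []  # B frames waiting for their following anchor
    for index, frame_type in enumerate(gop):
        label = f"{frame_type}{index}"
        if frame_type == "B":
            pending_b.append(label)
        else:
            coded.append(label)       # anchor goes first
            coded.extend(pending_b)   # then the B frames that depend on it
            pending_b.clear()
    coded.extend(pending_b)  # any trailing B frames are emitted last
    return coded


print(display_to_coded_order("IBBPBBP"))
```

Because each anchor frame arrives before the B frames that reference it, a decoder reading the file front to back never has to seek backward to find the data it needs.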
A more realistic concern was that the enormous number of calculations required to obtain each image would prevent multistream real-time editing. However, as reported in my review of a RAID system (“CalDigit's HDPro” in the September 2008 issue), I measured nine streams of 1920 × 1080 XDCAM EX from a MacBook Pro. These kinds of numbers effectively refute this concern.
Unfortunately, this concern remains true for AVCHD and AVCCAM, as well as AVC-Intra. Nevertheless, there will be a day when H.264/AVC performance concerns will vanish.
Native vs. intermediate editing
There have always been warnings about long-GOP MPEG-2, such as the one I heard at NAB. Although they sound reasonable, they involve invalid assumptions. At heart is the belief that because long-GOP MPEG-2 is highly compressed, you'll probably want to convert that MPEG-2 stream to something else before you do any editing or compositing work.
Converting MPEG-2 (or H.264/AVC) to an intermediate codec — other than uncompressed — results in at least some quality loss because it involves a decode followed by a recompression using an intermediate codec. Moreover, conversion always increases the size of all your source files because interframe source files require the least possible storage space. But, more importantly, conversion during import in no way can improve or preserve image quality. Even with a conversion to uncompressed video, image quality only remains constant.
Additional warnings and recommendations refer to the evils of 4:2:0 chroma sampling and to generation loss caused by multiple re-encodes of long-GOP files to long-GOP files. First, it is important to note that although most long-GOP formats have employed 4:2:0 sampling, this is not an inherent characteristic of interframe encoding. For example, 50Mb/s Sony XDCAM HD422 is a long-GOP format that uses 4:2:2 sampling.
Second, the quality of 4:2:0 sampling is not the subject of this debate. Rather, the question is at what point in the editing process 4:2:0 video is upsampled to either 4:2:2 or 4:4:4. A 4:2:2 conversion is made so various video formats can be mixed together. To mix RGB graphics with video, a conversion to 4:4:4 is performed.
A conversion can be made within a VTR when MPEG-2 is decoded prior to being sent as uncompressed 4:2:2 video over an HD-SDI connection. Another option is to perform the conversion during import when MPEG-2 or AVCHD/AVCCAM is transcoded to an intermediate codec. And, of course, the conversion can be made on the fly as MPEG-2 or AVCHD is decoded to an uncompressed YCrCb signal. In all cases, the key to upsampling quality is the equations themselves and the degree to which rounding errors are prevented. The point at which the conversion occurs is irrelevant. (See Figure 1.)
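As a rough illustration of why the equations and the rounding matter more than where the conversion happens, the Python sketch below doubles the vertical chroma resolution of a single column of samples, which is the step that takes 4:2:0 to 4:2:2. It is a simplification under stated assumptions: it uses plain midpoint interpolation, ignores chroma siting, and the function name and sample values are hypothetical. Real VTRs and NLEs use their own filters.

```python
def upsample_420_to_422(column, rounded=True):
    """Double the vertical resolution of one column of chroma samples.

    Each original sample is kept, and a new sample is interpolated halfway
    between it and the line below. With rounded=True the average is rounded
    to the nearest integer; with rounded=False it is truncated, which
    biases every interpolated sample downward.
    Assumption: simple midpoint filter, chroma siting ignored.
    """
    bias = 1 if rounded else 0
    out = []
    for i, sample in enumerate(column):
        below = column[min(i + 1, len(column) - 1)]  # repeat last line at the edge
        out.append(sample)
        out.append((sample + below + bias) // 2)
    return out


chroma = [16, 17, 20]
print(upsample_420_to_422(chroma, rounded=True))
print(upsample_420_to_422(chroma, rounded=False))
```

The same function gives the same samples whether it is called inside a deck, during import or on the fly during playback. What changes the result is the arithmetic, here the choice between rounding and truncating.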
Early NLE operation
Early NLEs generated effects following this process: One or more sources were decompressed to 4:2:2 YCrCb. The digital data streams were then combined mathematically (dissolves) or logically (wipes). The resulting 4:2:2 YCrCb data were then recompressed using the same codec used by the source files. The recompressed files were called preview, render or precompute files.
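The difference between a mathematical and a logical combine can be sketched in a few lines of Python. This is only an illustration operating on one row of decompressed sample values; the function names, the mix parameter and the edge parameter are assumptions made for the example, not a description of any specific NLE.

```python
def dissolve(a, b, mix):
    """Mathematical combine: weighted average of two rows of samples.

    mix=0.0 returns source a, mix=1.0 returns source b.
    """
    return [round((1 - mix) * x + mix * y) for x, y in zip(a, b)]


def wipe(a, b, edge):
    """Logical combine: each output sample is copied from one source.

    Samples to the left of edge come from b; the rest come from a.
    """
    return [y if i < edge else x for i, (x, y) in enumerate(zip(a, b))]


source_a = [100, 100, 100, 100]
source_b = [200, 200, 200, 200]
print(dissolve(source_a, source_b, 0.25))
print(wipe(source_a, source_b, 2))
```

A dissolve creates new sample values that exist in neither source, while a wipe only selects existing ones. In the early workflow described here, both results were then recompressed with the source codec.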
Using this procedure, long-GOP video was subjected to generation loss during recompression. Moreover, graphics and titles would also be subjected to long-GOP compression that would indeed cause significant graphics degradation.
In order to save rendering time, many NLEs would reuse the render files when an editor, for example, added another layer to the layers already rendered. Likewise, to save compute time, renders were used when a project was exported to another format.
If modern professional NLEs such as Avid Media Composer and Apple Final Cut Pro worked in the manner described, generation loss would indeed be a valid concern. Thankfully, they do not.