File-based acquisition
Jun 1, 2009 12:00 PM, By Nigel Arnott
Acquisition formats should be a primary concern in file-based environments.
The file-based environment has become highly desirable, and for compelling reasons. Once we break away from real-time transfers, we open up the possibility of new workflows in which content moves swiftly from place to place.
The reality of the situation is rather different, because it is not quite as simple as it seems. We all know what an analog or SDI signal looks like, but when you get into file-based systems, there are many different varieties of files.
Variations and standards
One leading asset management system has more than 180 different flavors in its library, and even that does not cover all the possible permutations of video format, codec, bit rate and wrapper.
Some of these variations are designed for different applications. A self-contained QuickTime sequence, for example, has three variants. MXF uses the frame wrapper, which contains a video frame with its associated audio, then the next video frame and audio, and so on. That makes it ideal for playout, where you might want to start and stop at any point within the sequence.
But it is challenging to write in a low-power device, so many ENG cameras use the clip wrapper, which writes all the video frames in a shot followed by the audio tracks. This is also supported by MXF.
There's also a compromise version, the mixed wrapper, which has a block of video frames followed by the audio, then the next block of video frames, and so on. Final Cut Pro usually produces QuickTime in this form.
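The three layouts are easier to compare when written out. What follows is a minimal sketch, not a parser for any real file format: the function names and the "V0"/"A0" labels are invented for illustration, and the audio is treated as one chunk per video frame, which is a simplification of how clip-wrapped audio tracks are actually stored.

```python
from typing import List


def frame_wrapped(n_frames: int) -> List[str]:
    """Frame wrapper: each video frame is followed by its own audio."""
    order: List[str] = []
    for i in range(n_frames):
        order += [f"V{i}", f"A{i}"]
    return order


def clip_wrapped(n_frames: int) -> List[str]:
    """Clip wrapper: all the video frames in the shot, then all the audio."""
    video = [f"V{i}" for i in range(n_frames)]
    audio = [f"A{i}" for i in range(n_frames)]
    return video + audio


def mixed_wrapped(n_frames: int, block: int) -> List[str]:
    """Mixed wrapper: a block of video frames, then the audio for that block."""
    order: List[str] = []
    for start in range(0, n_frames, block):
        indices = range(start, min(start + block, n_frames))
        order += [f"V{i}" for i in indices]
        order += [f"A{i}" for i in indices]
    return order


def items_before_playable(order: List[str], frame: int) -> int:
    """Items a sequential reader passes before it holds both video and audio for `frame`."""
    return max(order.index(f"V{frame}"), order.index(f"A{frame}")) + 1


if __name__ == "__main__":
    for name, layout in [
        ("frame", frame_wrapped(4)),
        ("clip", clip_wrapped(4)),
        ("mixed", mixed_wrapped(4, 2)),
    ]:
        print(name, layout, items_before_playable(layout, 0))
```

In this toy model, a sequential reader has picture and sound for the first frame after two items in the frame-wrapped layout, but has to get through every video frame of the shot first in the clip-wrapped one. That is the playout argument in miniature.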
Standards would certainly help in this area. MXF was the original attempt, but it has not yet been fully successful, largely because the original incarnation was drawn so widely that files generated by one device could be “MXF compliant” and utterly incomprehensible to another vendor's “MXF-compliant” system.
The work done by the consortium driven by Turner Broadcasting has positioned MXF as a more useful format, but it is still not universally recognized.
There are other standards in the pipeline, too, such as MPEG-7, which defines how content is described by metadata, including technical metadata that systems can derive from the media and store automatically. True interoperability, of the kind we had when BNC to BNC or XLR to XLR always worked, depends on the widespread adoption of standards.
Until then, systems integrators are faced with rewrapping and transcoding between different devices and various stages of the content pipeline. These processes take time and can cause degradation of the signal. Both are blocks to the seamless workflow that is the promise of the file-based environment.
The best-case scenario would surely be to avoid transcoding anywhere in the system, but that is not yet practical. For now, we have to acknowledge that the acquisition codec, the editing codec and the transmission codec are going to be different, each optimized for its own area.
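The distinction between rewrapping and transcoding can be made concrete with a small sketch. Everything here is hypothetical: `MediaFormat`, `handoff_step` and the placeholder codec names "edit-codec" and "tx-codec" are invented for illustration and do not describe any particular product. The rule it encodes is simply that a change of codec forces a transcode, while a change of wrapper alone needs only a rewrap, which takes time but leaves the compressed essence untouched.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MediaFormat:
    wrapper: str  # container, e.g. "MXF" or "QuickTime"
    codec: str    # compression scheme, e.g. "AVC-Intra" or "JPEG2000"


def handoff_step(source: MediaFormat, target: MediaFormat) -> str:
    """Decide what a handoff between two stages of the pipeline requires."""
    if source.codec != target.codec:
        return "transcode"      # decode and re-encode: slow, may degrade quality
    if source.wrapper != target.wrapper:
        return "rewrap"         # same essence, new container: costs time only
    return "pass-through"       # file can move as-is


def count_transcodes(chain: List[MediaFormat]) -> int:
    """Number of generation-loss steps along a chain of stages."""
    return sum(handoff_step(a, b) == "transcode" for a, b in zip(chain, chain[1:]))


if __name__ == "__main__":
    # Hypothetical three-stage chain: acquisition, editing, transmission.
    chain = [
        MediaFormat("MXF", "AVC-Intra"),
        MediaFormat("QuickTime", "edit-codec"),
        MediaFormat("MXF", "tx-codec"),
    ]
    print([handoff_step(a, b) for a, b in zip(chain, chain[1:])])
    print(count_transcodes(chain))
```

With three different codecs in the chain, as the article expects for now, both handoffs come out as transcodes.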
Acquisition
It is a fundamental principle understood by broadcasters that quality lost at the beginning of a production chain can never be recovered, so acquisition should use the best possible codec.
If we ignore the constraints of file-based systems, probably the best practical acquisition format for broadcast video is 10-bit 4:2:2.
So why, then, would we compromise on that quality just because it is being recorded as a file rather than onto linear tape? Advancing technology should not mean deteriorating quality.
Content captured at 10-bit 4:2:2 HD can be acquired using AVC-Intra or JPEG2000. Quantization at 8 bits, or even more drastic color subsampling, will produce visibly inferior image quality and will cause problems in green-screen work.
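The arithmetic behind these formats is worth a quick look. The sketch below counts only the active picture of a 1920x1080 frame and ignores blanking and audio. The 25 frames per second and the 100 Mb/s target are assumptions chosen for illustration, not figures from this article, and the function names are invented.

```python
# Average samples per pixel: one luma sample plus the shared chroma samples.
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}


def bits_per_frame(width: int, height: int, bit_depth: int, subsampling: str) -> int:
    """Uncompressed size of one active picture, in bits."""
    return int(width * height * SAMPLES_PER_PIXEL[subsampling] * bit_depth)


def raw_bit_rate(width: int, height: int, bit_depth: int,
                 subsampling: str, fps: float) -> float:
    """Uncompressed video bit rate in bits per second."""
    return bits_per_frame(width, height, bit_depth, subsampling) * fps


def compression_ratio(raw_bps: float, target_bps: float) -> float:
    """How hard the codec has to squeeze to hit the target rate."""
    return raw_bps / target_bps


if __name__ == "__main__":
    hi = bits_per_frame(1920, 1080, 10, "4:2:2")
    lo = bits_per_frame(1920, 1080, 8, "4:2:0")
    print("10-bit 4:2:2 bits per frame:", hi)
    print("8-bit 4:2:0 bits per frame:", lo)
    print("levels per sample:", 2 ** 10, "vs", 2 ** 8)
    raw = raw_bit_rate(1920, 1080, 10, "4:2:2", 25)   # 25 fps is an assumption
    print("raw bit rate (b/s):", raw)
    print("ratio at an assumed 100 Mb/s:", compression_ratio(raw, 100e6))
```

An 8-bit 4:2:0 frame carries 60 percent of the data of a 10-bit 4:2:2 frame, and each sample resolves 256 levels rather than 1024. That lost color resolution and precision is what makes keying harder.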
The difference between AVC-Intra and JPEG2000 is that the first is an MPEG-type codec based on discrete cosine transforms (DCT), and the latter uses wavelet compression. The wider H.264/AVC family can also use temporal compression, using information from one video frame as the basis of other frames in a group of pictures. AVC-Intra, as its name indicates, restricts itself to intraframe coding, so like JPEG2000 it keeps each individual frame complete in itself.
The DCT algorithms implemented in MPEG and AVC-Intra split the picture into blocks and process each individually. When the encoder or decoder comes under stress, these blocks can become visible. Wavelet encoding, on the other hand, can process the entire picture as a single entity (up to 4K by 4K resolution in JPEG2000), so there are no block boundaries to become visible. If there is any stress on the compression engine, the result is a less obtrusive softening of the picture.
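A one-dimensional toy makes the contrast visible. This is a deliberately crude sketch, not how either codec works in practice. The "block" coder keeps only the average (the DC term) of each two-sample block. The "wavelet" coder applies one level of a non-integer 5/3 lifting filter, the filter family JPEG2000 uses for its reversible mode, and then discards every detail coefficient. Both keep half the data. The function names and the ramp test signal are assumptions made for illustration.

```python
from typing import List


def block_dc_only(x: List[float], block: int = 2) -> List[float]:
    """Keep only the mean (DC term) of each block, as a stressed block coder might."""
    out: List[float] = []
    for start in range(0, len(x), block):
        chunk = x[start:start + block]
        out += [sum(chunk) / len(chunk)] * len(chunk)
    return out


def wavelet_lowpass_only(x: List[float]) -> List[float]:
    """One level of 5/3 lifting with all detail coefficients dropped (len(x) must be even)."""
    half = len(x) // 2
    ext = list(x) + [x[-2]]  # symmetric extension at the right edge
    # Predict: detail = odd sample minus the average of its even neighbours.
    d = [ext[2 * i + 1] - (ext[2 * i] + ext[2 * i + 2]) / 2 for i in range(half)]
    # Update: low-pass = even sample plus a quarter of the neighbouring details.
    s = [x[2 * i] + (d[i - 1 if i > 0 else 0] + d[i]) / 4 for i in range(half)]
    # Inverse with details zeroed: odd samples are interpolated from the low-pass band.
    out: List[float] = []
    for i in range(half):
        nxt = s[i + 1] if i + 1 < half else s[i]
        out += [s[i], (s[i] + nxt) / 2]
    return out


def max_step(y: List[float]) -> float:
    """Largest jump between neighbouring samples."""
    return max(abs(b - a) for a, b in zip(y, y[1:]))


if __name__ == "__main__":
    ramp = [10.0 * i for i in range(16)]  # a smooth gradient, e.g. a sky
    print("block:  ", block_dc_only(ramp), max_step(block_dc_only(ramp)))
    print("wavelet:", wavelet_lowpass_only(ramp), max_step(wavelet_lowpass_only(ramp)))
```

On this gradient the block version turns into a staircase with a jump at every block edge, while the wavelet version stays continuous and is disturbed only near the end of the signal. A ramp flatters the wavelet, so this illustrates the character of the two artifacts rather than measuring quality.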
One consideration when selecting a compression algorithm comes down to basic science: the more effective the compression, the more processing it requires. Moore's Law helps by putting more power into smaller chips, so this is less of an issue than it once was, but it still needs to be kept in mind when choosing a camera. Better quality means more processing, which means a bigger (heavier) battery and potentially more heat to be managed.