ATSC tackles audio loudness
Mar 1, 2010 12:00 PM, By Jim Starzynski and J. Patrick Waddell
Adhering to new recommended practices can help you maintain audio consistency.
The ATSC has published a new Recommended Practice (RP) that addresses the large variation in loudness among programs, commercials and other interstitial elements. The new document is A/85, “Techniques for establishing and maintaining audio loudness for digital television.” A/85 covers all facets of the audio delivery system, from implementation of the key ATSC standards to mix room monitoring and the consumer experience. It also includes “Quick Reference Guides” to get operators and content creators up to speed on critical information, as well as links to audio test signals that can be used for monitoring environment setup.
Loudness variations
Although the DTV transition is complete, many broadcasters and much of the production community have been slow to move from analog NTSC audio techniques to contemporary digital audio practices. Digital television's expanded aural dynamic range (over 100dB) allows excessive loudness variation among content when DTV loudness is not managed properly.
Consumers do not expect large changes in audio loudness from program to interstitials and from channel to channel. Inappropriate use of the available wide dynamic range has led to consumer complaints, which eventually reached Congress.
The NTSC analog TV system uses conventional audio dynamic range processing at various stages of the signal path to manage audio loudness for broadcasts. This practice compensates for limitations in the dynamic range of analog equipment and controls the various loudness levels of audio received from suppliers. It also helps smooth the loudness of program-to-interstitial transitions. Though simple and effective, this practice permanently reduces dynamic range and changes the audio before it reaches the audience. It modifies the characteristics of the original sound, altering it from what the program provider intended to fit within the limitations of the analog system.
The AC-3 audio system defined in the ATSC digital television standard uses metadata, or data about the data, to control loudness and other audio parameters more effectively without permanently altering the dynamic range of the content. The content provider or DTV operator encodes metadata along with the audio. From the audience's perspective, the dialog normalization (dialnorm) metadata parameter sets different content to a uniform loudness transparently. It achieves results similar to a viewer using a remote control to set a comfortable volume between disparate TV programs, commercials and channel-changing transitions. The dialnorm and other metadata parameters are integral to the AC-3 audio bit stream.
It is important for the digital television system to provide uniform subjective loudness for all audio content. Consumers find it annoying when audio levels vary between channels and within a single channel. Dialog, the spoken word, has been identified as the element to which audiences typically adjust their volume. Achieving an approximate match for average dialog level across all content is a desirable goal. While the AC-3 audio specifications in ATSC Standard A/52, “Digital Audio Compression (AC-3, E-AC-3) Standard,” provide syntax that makes this goal achievable, system implementation in the real world has proven more difficult than expected.
Addressing the loudness issue involves several elements: mixing; monitoring; and proper encoding of local and network programs, commercials, promos and other content. The S6-3 study group explored all facets of DTV loudness, with the goal of identifying problem areas and recommending practical solutions.
The industry has recognized that a new proficiency in loudness measurement, production monitoring, metadata usage and contemporary dynamic range practices is critical for meeting the expectations of the content supplier, the broadcaster, the audience and governing bodies.
The AC-3 audio system
The ATSC AC-3 audio system is intended to deliver a reproduction of the original (unprocessed) content at the output of the AC-3 decoder in a receiver, normalized to a uniform loudness. It also lets broadcasters give each listener some control over the degree of dynamic range reduction, if any, that best suits his or her listening conditions.
The metadata parameter dialnorm is transmitted to the AC-3 decoder along with the encoded audio. The value of the dialnorm parameter indicates the loudness of the anchor element of the content, expressed as the number of decibels by which it sits below digital full scale. The dialnorm value of a very loud program might be 15 (dialog averaging 15dB below full scale), and of a soft one, 27. An attenuator at the output of the AC-3 decoder applies the appropriate attenuation so that all content is normalized to the same loudness without compromising dynamic range.
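The normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not decoder code. It assumes the A/52 convention that a dialnorm value of N means the dialog averages N dB below digital full scale, and that the decoder attenuates so that all dialog is reproduced at the level corresponding to dialnorm 31. The function names are hypothetical and chosen only for this example.

```python
# Assumption: the decoder normalizes dialog to the level that corresponds
# to dialnorm 31, i.e. 31 dB below digital full scale.
REFERENCE_DIALNORM = 31


def decoder_attenuation_db(dialnorm: int) -> int:
    """Attenuation (in dB) the decoder applies for a given dialnorm value."""
    if not 1 <= dialnorm <= 31:
        raise ValueError("dialnorm must be in the range 1..31")
    return REFERENCE_DIALNORM - dialnorm


def output_dialog_level_dbfs(dialnorm: int) -> int:
    """Dialog level after normalization, assuming dialnorm is accurate."""
    encoded_level_dbfs = -dialnorm  # dialog level as encoded
    return encoded_level_dbfs - decoder_attenuation_db(dialnorm)


if __name__ == "__main__":
    for name, dialnorm in (("loud program", 15), ("soft program", 27)):
        print(name,
              "attenuation:", decoder_attenuation_db(dialnorm), "dB,",
              "dialog out:", output_dialog_level_dbfs(dialnorm), "dBFS")
```

Under these assumptions, the loud program (dialnorm 15) is attenuated by 16dB and the soft one (dialnorm 27) by 4dB, so both reach the listener with dialog at the same level. The result is only as good as the dialnorm value itself: if the transmitted value does not match the measured loudness of the content, the decoder applies the wrong attenuation.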
If the dialnorm metadata parameter accurately reflects the overall loudness of the content, then listeners will be able to set their volume controls to their preferred listening (loudness) level and will not have to change the volume when the audio changes from program to advertisement and back again. If all broadcasters use the system properly, the loudness will also be consistent across channels.
There are three methods of using audio metadata: fixed, preset and agile. Any one of these approaches will deliver consistent loudness to the listeners. A broadcaster should use the method that best suits its operational practices. Whichever approach is selected, the system depends on transmitting a value of dialnorm that correctly represents the loudness of the content, which depends in turn on accurate measurements.
Loudness measurement
Because loudness is a subjective phenomenon, human hearing is the best judge of loudness. When combined with a known mixing environment, experienced audio mixers using their sense of hearing can produce a program with remarkably consistent loudness. If all programs and commercials are produced with consistent loudness — and if the loudness of the mix is preserved through the production, distribution and delivery chain — listeners will not be subjected to annoying changes in loudness within and between programs.
When measuring audio signals, there are two key parameters of interest: the true peak level of the signal and its loudness. The true peak measurement enables the mixer to protect the program from clipping, and the loudness measurement allows the mixer to protect the listener from annoying variations in loudness. Although the mixer balances a mix using his or her hearing, an objective loudness measurement helps to maintain consistent loudness within and between programs.
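The difference between the two measurements can be shown with a simplified Python sketch. It is not a compliant meter: it reports the sample peak rather than a true peak (which in practice requires oversampling to catch peaks between samples), and it uses an unweighted mean-square level as a crude stand-in for loudness (practical loudness meters, such as those based on ITU-R BS.1770, add frequency weighting and channel summation). All function names and test signals here are assumptions made for illustration.

```python
import math


def sample_peak_dbfs(samples):
    """Largest absolute sample value, in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def mean_square_level_db(samples):
    """Unweighted mean-square level in dB; a rough proxy for loudness."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def sine(amplitude, freq_hz=1000, rate_hz=48000, seconds=1.0):
    """Generate a sine tone as a list of float samples."""
    count = int(rate_hz * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate_hz)
            for i in range(count)]


# Two one-second signals with the same peak but different average energy:
steady = sine(0.5)                          # tone for the full second
sparse = steady[:12000] + [0.0] * 36000     # tone for the first quarter only

if __name__ == "__main__":
    for name, signal in (("steady", steady), ("sparse", sparse)):
        print(name,
              "peak: %.2f dBFS," % sample_peak_dbfs(signal),
              "level: %.2f dB" % mean_square_level_db(signal))
```

Both signals peak at about 6dB below full scale, so a peak meter alone cannot tell them apart, yet their average levels differ by about 6dB. This is why the peak reading is used to guard against clipping while a separate loudness reading is used to keep content consistent for the listener.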