Dialnorm: A good idea gone bad?
Dec 1, 2007 12:00 PM, By Bruce Jacobs
How is the most important audio parameter for ATSC transmission so often maligned and misused? Dialnorm, a little byte of audio metadata in the DTV AC-3 stream, was made into a standard with good intentions, but from the beginning it has been in a state of disrepair.
The analog curse
Analog TV audio levels are maximized by the marketplace and capped to avoid exceeding FCC modulation limits. This results in a limited dynamic range: a squashed sound. There's no opportunity for a dramatic moment. Movies sound lame. Symphonies sound anemic.
With the upper limit determined by a peak-reading meter, the loudness for consumers is inconsistent. Complex waveforms are made softer than simpler ones in order to avoid an FCC fine, to the disadvantage of the listener.
To make matters worse, high frequencies are compressed even more, causing the audio to sound dull. This is necessary to avoid overmodulation from the pre-emphasis that was included in the FCC transmitter rules back when audio didn't have much high-frequency content. There was a time when the resulting reduction in noise from a matching receiver de-emphasis seemed like a good idea.
Right off the bat, digital audio is better than FM audio, because there is no need for pre-emphasis. This eliminates dull-sounding audio!
But how do we manage levels in the digital age? A bad solution would have been to let the marketplace decide, as was done with the compact disc and MP3 files. The upper limit would be simple: the highest digital number is the highest peak value, and anything higher is clipped. The lower limit is 96dB down, leaving plenty of dynamic range available for the producer to keep average levels low and avoid clipping. But this approach results in a lose-lose loudness war, just as with analog broadcasting and, increasingly, with digital audio files. Everybody tries to be the loudest. Everyone loses dynamic range. For digital television, there must be a better way.
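The 96dB figure matches 16-bit audio, the word length used on compact disc. The bit depth is an assumption here, since the text does not name one. A minimal sketch of the arithmetic, using a hypothetical helper name:

```python
import math

def pcm_dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)

# Assumption: 16-bit samples, as on a compact disc.
print(f"{pcm_dynamic_range_db(16):.1f} dB")  # prints "96.3 dB"
```

Each added bit is worth about 6dB, which is why 16 bits land near the 96dB quoted above.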
The good idea
In developing the AC-3 compression system for movies, Dolby's engineers rightfully wanted to give the home listener the same benefit enjoyed in the theater: a consistent dialog level and a wide dynamic range.
The consistent dialog level is achieved by the use of a long-term averaging meter that is A-weighted to favor the frequencies to which our ears are most sensitive at low levels. Movie audio levels are adjusted so the average weighted dialog level remains consistent, pleasing both the listeners (who can better hear the dialog) and the theater owners (who get fewer complaints about trailers being too loud).
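To show what A-weighting does to different frequencies, here is a rough sketch of the standard A-weighting curve. The constants come from the usual published definition of the curve (IEC 61672), not from this article, and the function name is hypothetical. A real meter would also average the weighted signal over a long period, which this sketch leaves out.

```python
import math

def a_weighting_db(freq_hz: float) -> float:
    """Relative response of the A-weighting curve at freq_hz (> 0), in dB."""
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to roughly 0 dB at 1 kHz.
    return 20 * math.log10(numerator / denominator) + 2.00

# Low frequencies are discounted heavily; the midrange passes nearly untouched.
for freq in (100, 1000, 3000, 10000):
    print(f"{freq:>6} Hz: {a_weighting_db(freq):+.1f} dB")
```

The curve cuts a 100Hz rumble by roughly 19dB while leaving the speech-heavy midrange close to 0dB, which is why an A-weighted average tracks dialog better than a flat one.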
Understandably, a movie is often mixed so that the level of explosions and music crescendos exceeds the average dialog level by a significant amount. This helps make a movie exciting!
Dolby could have picked a fixed average dialog level for AC-3. The specified dialog level could have been chosen a safe number of decibels below 0dBFS, leaving room for dramatic peaks. This level could have been adopted by the FCC, along with AC-3, within the ATSC standard.
Broadcasters could have adjusted their dialog levels using the appropriate metering to the level specified. New meters meeting the standard would become available. Hopefully, legacy content would have audio levels close to the chosen value. If not, processors could keep levels within bounds.
The idea of giving the consumer consistent dialog level and a wider dynamic range would have been achieved. Life for the broadcaster would have been simple. Life for the consumer would have been improved. But this is not what Dolby did.
The trouble begins
Nobody likes limits. Who chooses the limit? Should Dolby have designed AC-3 to suit the film industry or the broadcast industry?
Rather than specify a fixed amount of dynamic range, Dolby made it adjustable from 1dB to 31dB. This was accomplished by including a special data parameter that remotely controls the output gain of all final AC-3 decoders. (See Figure 1.) Every consumer decoder must apply this adjustment under terms of the Dolby license. This is the parameter dialnorm.
If the mix engineer wants the largest possible dynamic range, the dialog is mixed to a level of -31dBFS, and the dialnorm is logically set to -31. This results in unity gain at the decoder. If the mix engineer wants less dynamic range, a higher dialog level is chosen along with a dialnorm equal to that level, resulting in a decoder gain reduction of the appropriate amount. For example, dialog mixed at -27dBFS with dialnorm set to -27 gets 4dB of gain reduction at the decoder. This approach keeps dialog levels consistent from movie to movie, from show to show and from channel to channel. If dialnorm is set properly, the average dialog level from the decoder will be -31dBFS when measured with the averaging A-weighted meter.
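The relationship described above can be sketched in a few lines. This is an illustration of the arithmetic in the text, not Dolby's decoder code, and the function names are hypothetical.

```python
UNITY_DIALNORM = -31  # per the text: a dialnorm of -31 gives unity decoder gain

def decoder_gain_db(dialnorm: int) -> int:
    """Gain the decoder applies for a given dialnorm (0 or negative, in dB)."""
    if not -31 <= dialnorm <= -1:
        raise ValueError("dialnorm must be between -31 and -1")
    return UNITY_DIALNORM - dialnorm

def decoded_dialog_level_dbfs(encoder_dialog_dbfs: float, dialnorm: int) -> float:
    """Average dialog level leaving the decoder, in dBFS."""
    return encoder_dialog_dbfs + decoder_gain_db(dialnorm)

# When dialnorm equals the true dialog level, every program decodes to -31 dBFS.
for level in (-31, -27, -20):
    print(level, decoder_gain_db(level), decoded_dialog_level_dbfs(level, level))
```

Whatever level the mixer picks, a matching dialnorm brings the decoded dialog to the same place, and the headroom above dialog shrinks by the same number of decibels.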
There is no one right dialnorm value. It depends. Just because Dolby ships encoders set to -27 doesn't mean this is the correct value for your station. The correct value is the average dialog level on the input of your AC-3 encoder. This is where the good idea starts to go bad.
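As a sketch of that rule, the setting can be derived from a measured level. The measurement is assumed to be a long-term A-weighted average taken at the encoder input. Rounding to the nearest whole number and clamping to the -31 to -1 range are assumptions of this example, and the function name is hypothetical.

```python
def dialnorm_for_measured_level(avg_dialog_dbfs: float) -> int:
    """Nearest whole-number dialnorm for a measured dialog level, in dBFS."""
    nearest = round(avg_dialog_dbfs)
    # Assumption: levels outside the adjustable range are clamped to its ends.
    return max(-31, min(-1, nearest))

# Hypothetical measurement: dialog averaging -23.8 dBFS at the encoder input.
print(dialnorm_for_measured_level(-23.8))  # prints -24
```

A station whose dialog really averages -27dBFS would arrive at the shipped default this way; any other station would not.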
Metadata madness
Nobody likes complexity or confusion. Here's what happened after the ATSC adopted AC-3:
- Encoders came with an obscure knob that could be set between -1 and -31, where increasing the value makes the audio in every home illogically softer!
- The shipped encoders had the knob set to -27 without saying why, leading many broadcasters to think this is the correct value. This left other broadcasters with the impression that they need to set dialnorm individually for every show in their library, a literally impossible task.
- Broadcasters were told they needed to build systems to carry the metadata through their SD plant and storage system when no equipment existed to make it practical to do so.
- This technology was introduced in an environment with no enforced standards for the analog portion of consumer equipment. A proper dialnorm setting can result in digital dialog levels that are below analog dialog levels.
All this complexity and confusion has caused listeners in many markets to report DTV audio levels much less consistent than those of the analog counterpart channels. Stations serving some of the major networks routinely transmit dialnorm values far from the actual dialog level, resulting in their programming appearing significantly louder than other networks. Stations offering multicast channels sometimes deliver widely varying audio levels even on their own channels. With DTV, consumers have better pictures but more annoying audio. Varying audio level is the number one complaint. The public deserves better, according to Jim Kutzner, chief engineer at PBS. It is a sorry state of affairs. Wasn't digital supposed to make things better?
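The size of the mismatch is easy to put a number on. The sketch below assumes the decoder behaves as described earlier (unity gain at a dialnorm of -31, 1dB more attenuation for each step toward -1). The channel names and levels are invented for illustration.

```python
TARGET_DIALOG_DBFS = -31  # decoded dialog level when dialnorm is set correctly

def loudness_error_db(actual_dialog_dbfs: float, transmitted_dialnorm: int) -> float:
    """How far decoded dialog lands from the target; positive means too loud."""
    decoder_gain = -31 - transmitted_dialnorm
    return (actual_dialog_dbfs + decoder_gain) - TARGET_DIALOG_DBFS

# Hypothetical channels: (actual average dialog level, transmitted dialnorm).
channels = {
    "Channel A": (-27, -27),  # set correctly
    "Channel B": (-20, -27),  # hot mix left at the shipped default
    "Channel C": (-24, -20),  # dialnorm set higher than the real level
}
for name, (actual, dialnorm) in channels.items():
    print(f"{name}: {loudness_error_db(actual, dialnorm):+} dB")
```

The error is simply the actual dialog level minus the transmitted dialnorm, so the viewer flipping from Channel C to Channel B in this example hears an 11dB jump.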
If the broadcast industry is going to achieve a consistent dialog level and wide dynamic range, it has work to do.