What is in this article?
- The Forum: Comparing loudness meters, part 2
- Problems with low peak-to-RMS ratio material
- Studies indicating that BS.1770 is inaccurate at very low frequencies
- Discussion and conclusions
Problems with low peak-to-RMS ratio material
In the subjective testing to validate the BS.1770 meter, there were outliers as large as 6dB (i.e., the meter disagreed with human subjective perception by as much as 6dB [11]). The subjective testing to validate the CBS meter found outliers up to 3dB, although fewer items were used in this testing. We hypothesize that the worst-case error of the BS.1770 meter was substantially larger than that of the CBS meter because the BS.1770 meter does not model loudness summation or the loudness integration time constants of human hearing. BS.1770-2 states:
It should be noted that while this algorithm has been shown to be effective for use on audio programs that are typical of broadcast content, the algorithm is not, in general, suitable for use to estimate the subjective loudness of pure tones.
We have noted that the meter tends to over-indicate the loudness of program material that has been subjected to large amounts of “artistic” dynamic compression, as is often done for commercials and promotional material. In other words, the meter over-indicates the loudness of program material having an unusually low peak-to-average ratio, which, at the limit, approaches the peak-to-average ratio of a pure tone.
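As a rough illustration of what "peak-to-average ratio" means here, the sketch below computes the peak-to-RMS ratio (crest factor) of a pure tone and of a test signal that is silent most of the time. The 48 kHz sample rate, the 1 kHz tone, and the "bursty" signal are assumptions chosen only for illustration; they are not taken from the article's measurements, and the bursty signal is a stand-in for lightly processed material, not real program audio.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio of a sample sequence, in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

N = 48000  # one second at an assumed 48 kHz sample rate

# Pure tone: 1 kHz sine covering a whole number of cycles.
sine = [math.sin(2.0 * math.pi * 1000.0 * n / N) for n in range(N)]

# Illustrative stand-in for uncompressed material: the same tone, but
# active for only one tenth of the time. The peak is unchanged while the
# RMS drops, so the crest factor rises.
bursty = [s if (n % 4800) < 480 else 0.0 for n, s in enumerate(sine)]

print(f"pure tone crest factor: {crest_factor_db(sine):.2f} dB")
print(f"bursty crest factor:    {crest_factor_db(bursty):.2f} dB")
```

The pure tone comes out at about 3 dB, which is the floor that heavily compressed material approaches; the bursty signal comes out at about 13 dB.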
We have encountered heated complaints by mixers [12] and producers who stated that such material, when “matched” to the loudness of the surrounding program material via the BS.1770 meter, is considerably quieter in subjective terms. In turn, this has constrained the ability of producers to specify the type of audio processing they had previously used to give this material excitement and punch. We hypothesize that this problem is related to the fact that BS.1770 does not accurately indicate the loudness of pure tones.
Some studies have indicated that when people are asked to assess the loudness of a given piece of material, they state that it sounds louder when underscoring or effects are added to constant-level dialog. The EBU has used these studies to justify the position taken in R 128 that a listener’s impression of total loudness is more important than dialog level [13]. In our opinion, this misses the point. A more relevant question is whether viewers would want to turn down their volume controls to make dialog quieter when underscoring and effects appear. (In other words, whether effective TV commercial loudness control requires nothing more than applying gain control to commercials such that the BS.1770-2 “short-term” loudness [14] is always limited to 0 LK.)
Orban and Dolby Labs hold similar views. We believe that dialog is the most important element in most television audio and that listeners do not want to turn down their volume controls every time that underscoring or effects appear under the dialog. The popular Dolby LM100 loudness meter [15] in its current revision uses the same Leq(RLB) algorithm as BS.1770 but adds gating to eliminate non-speech material, including silence. The author has used the Dolby LM100 to measure the output of the Orban 8685 with a wide variety of speech material, and has observed that this material is almost always controlled within a ±1dB window as measured on the LM100.
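To make the idea of speech gating concrete, here is a minimal sketch of an energy average taken only over blocks flagged as speech. It is not the LM100's algorithm: the article does not describe how that meter detects speech, so the `is_speech` flags, the block levels, and the function names below are all hypothetical, and the BS.1770 weighting filter and calibration offset are assumed to have been applied (or are ignored) upstream.

```python
import math

def leq_db(mean_squares):
    """Energy-average per-block mean-square values; result in dB."""
    return 10.0 * math.log10(sum(mean_squares) / len(mean_squares))

def speech_gated_leq_db(mean_squares, is_speech):
    """Same average, restricted to blocks flagged as speech."""
    kept = [ms for ms, flag in zip(mean_squares, is_speech) if flag]
    if not kept:
        return None  # no speech found, so there is no dialog level to report
    return leq_db(kept)

def db_to_mean_square(level_db):
    """Convert a level in dB to a mean-square (power) value."""
    return 10.0 ** (level_db / 10.0)

# Hypothetical per-block levels in dB: dialog at -24, one silent block,
# and two loud non-speech blocks (effects or underscoring) at -14.
block_levels_db = [-24.0, -24.0, -60.0, -14.0, -24.0, -14.0]
is_speech = [True, True, False, False, True, False]  # assumed detector output

mean_squares = [db_to_mean_square(level) for level in block_levels_db]
ungated = leq_db(mean_squares)
gated = speech_gated_leq_db(mean_squares, is_speech)
print(f"ungated: {ungated:.2f} dB, speech-gated: {gated:.2f} dB")
```

In this made-up example the gated figure stays at the dialog level of about -24 dB, while the ungated figure is pulled upward by the loud non-speech blocks.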
This demonstrates the benefits of a dialog-centric measurement. Moreover, the author believes it is unwise to rely on a BS.1770 measurement to set the on-air loudness of unadorned dialog because this can cause the dialog to be too loud with respect to other material. The author has experimented with “inverse short-term BS.1770 loudness control” and believes that it sounds unnatural, pumping dialog loudness up and down in a subtly inartistic way as underscoring and effects come and go [16].
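The pumping effect can be shown with simple arithmetic. The sketch below is one possible reading of such a control, not the author's implementation: it assumes that dialog and underscoring are uncorrelated so that their powers add, that the meter reading tracks that power sum, and that the controller applies just enough gain reduction to hold the reading at a 0 LK ceiling. The levels and function names are hypothetical.

```python
import math

def power_sum_db(levels_db):
    """Combined level of uncorrelated sources, found by summing powers."""
    return 10.0 * math.log10(sum(10.0 ** (l / 10.0) for l in levels_db))

def limiter_gain_db(short_term_db, ceiling_db=0.0):
    """Gain (<= 0 dB) that holds the short-term reading at the ceiling."""
    return min(0.0, ceiling_db - short_term_db)

DIALOG_DB = 0.0  # assumed: dialog alone sits exactly at the 0 LK ceiling

# Underscore levels relative to the ceiling; None means dialog only.
for underscore_db in (None, -10.0, -3.0, 0.0):
    parts = [DIALOG_DB] if underscore_db is None else [DIALOG_DB, underscore_db]
    total = power_sum_db(parts)
    gain = limiter_gain_db(total)
    # The same gain is applied to the whole mix, so the dialog moves with it.
    print(f"underscore={underscore_db}: reading={total:+.2f}, "
          f"dialog after control={DIALOG_DB + gain:+.2f}")
```

Under these assumptions, dialog that was untouched when heard alone is pushed down by about 3 dB when underscoring of equal level enters, and returns when the underscoring leaves.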