When discussing technical aspects of television, audio is often taken for granted. However, if you have ever worked in radio, you know that television is just radio with pictures, right? With digital television, we tend to think of video stream transports, bandwidths measured in MHz and MPEG encoding schemes. The audio is simply a part of the media transport stream that everyone expects to be perfect.
Digital audio technology is a decade or two ahead of digital video. Prototype digital audio CDs were created around 1980. Of all the major manufacturers at the time, such as 3M, Soundstream, JVC, Mitsubishi and others, only the Sony+Philips consortium was working on a digital home delivery system. Philips began it as a 500-day timeline project and brought in Sony for its electronics, while Philips worked on the laser optics. It’s still the Philips Red Book spec that is followed for audio CDs.
At about the same time, Sony introduced the PCM-1600, the world’s first professional digital audio recording system. The PCM-1600 was the first system used for mastering audio CDs. It used digital conversion and pulse code modulation (PCM) in conjunction with a tricked-out Sony Broadcast U-matic for transport. The PCM-1600 weighed about 150lb, and the companion BVU-200B U-matic wasn’t much lighter. I know because I helped deliver one of the first PCM-1600 systems to Stevie Wonder’s studios in Los Angeles in 1979. The “B” version of the BVU-200 was modified to move the head switch to the vertical interval and had the dropout compensator and chroma switched off. Typically, the BVU-200Bs were sold in pairs for editing. The biggest problem with the early PCM/VCR systems was tape dropouts.
A PCM adaptor converted analog inputs into digital audio, encoded as pseudo-video so the signal could be recorded on an analog video tape recorder. The number of NTSC video lines, frame rate and bits per line dictated a sampling frequency of 44.1kHz. This number worked because the Nyquist theorem states the sampling frequency must be at least double the audio bandwidth, and 44.1kHz comfortably covers the roughly 20kHz range of human hearing. 44.1kHz should also sound familiar because it is, not coincidentally, the sampling rate for audio CDs. The audio CD was specifically designed to have a running time of up to 74 minutes so it could accommodate Beethoven’s 9th uninterrupted, as per Dr. Toshi Doi of Sony.
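As a rough cross-check of how video timing leads to 44.1kHz, the arithmetic can be sketched in a few lines of Python. The figures of 245 usable lines per field (294 for 50-field systems) and three samples per line are the commonly cited values for PCM adaptors, not numbers given in this article, and the function names are made up for illustration.

```python
def pcm_adaptor_rate(fields_per_second: int,
                     usable_lines_per_field: int,
                     samples_per_line: int) -> int:
    """Samples per second that fit into the pseudo-video signal."""
    return fields_per_second * usable_lines_per_field * samples_per_line


def nyquist_limit_hz(sample_rate_hz: int) -> float:
    """Highest audio frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


# Assumed values: 60 fields/s with 245 usable lines (NTSC-style timing),
# 50 fields/s with 294 usable lines (PAL-style timing), 3 samples per line.
ntsc_rate = pcm_adaptor_rate(60, 245, 3)
pal_rate = pcm_adaptor_rate(50, 294, 3)

print(ntsc_rate, pal_rate, nyquist_limit_hz(ntsc_rate))
```

Under these assumptions both video systems land on the same rate, and half of that rate sits just above the 20kHz audio band.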
It didn’t take long for the Audio Engineering Society (AES) and the European Broadcasting Union (EBU) to take note of this trend. The result was the development of an international standard for two-channel PCM digital audio, formally known as AES3, also called AES/EBU. Note that one AES3 channel contains two channels of audio, often a stereo pair.
AES/EBU was developed to carry the two-channel audio data with an embedded word clock. Originally, Sony carried the data and the word clock over separate wires, which complicated wiring. AES/EBU inputs and outputs were balanced with transformers so the signal could be transmitted over longer distances than an unbalanced signal, which degrades quickly. The unbalanced equivalent is S/PDIF, which is essentially a lower-voltage, unbalanced AES/EBU signal for consumer use.
While PCM digital audio and AES/EBU were two-channel systems, artists and recording studios wanted multitrack digital audio systems. A number of manufacturers introduced a variety of professional multitrack audio tape recorders. Some used spinning heads; others used disk drives or data recorders. But, because they were multichannel, they all used multiple (typically XLR) cables and snakes, no different than their analog counterparts.
As digital audio technology advanced and manufacturers were looking for clarity, the AES developed a standard for multichannel transport over a single cable or fiber. In 1991, the AES announced AES10, also known as Multichannel Audio Digital Interface, or MADI. In 2003, AES10-2003 was added to include the option of eliminating the variable speed feature to increase the number of channels from 56 to 64. AES10-2003 also accommodated 96kHz sampling. The latest MADI document is AES-10id-2005 (r2011), which was reaffirmed in 2011. Because of these various MADI improvements, some older MADI gear may not be fully compatible with the latest gear.
One of the primary benefits of MADI for broadcasters is its simplicity and small footprint in large routing systems. The AES estimates there are more MADI interfaces for multitrack audio routing than there are for audio console to multitrack recorder connections.
MADI was introduced before embedded SDI. Embedded SD-SDI, HD-SDI and MADI are all compatible with AES3. SDI embedded audio is limited to 16 channels of 48kHz, 24-bit audio, which are transmitted during the horizontal blanking intervals on a single piece of coax along with the video. MADI is audio-only and typically carries lots of it.
MADI is a time division multiplexed, unidirectional, multiple channel digital audio transmission standard. It is based on multiplexing multiple AES3 streams on one coaxial cable or fiber. AES10 details the transport “of 32, 56 or 64 channels of linearly represented digital audio data at a common sampling frequency within the range of 32kHz to 96kHz, having a resolution of up to 24-bits per channel.” The number of channels is inversely proportional to the sampling frequency. At a 48kHz rate, MADI can support up to 64 audio channels, the equivalent of 32 multiplexed two-channel AES3 streams. At a 96kHz rate, the maximum drops to 32 audio channels, or 16 multiplexed AES3 streams.
MADI specifies 75-ohm cable with BNC connectors or fiber-optic cables with STI connectors. The signal on a properly terminated MADI coax connection is specified to be between 0.3V and 0.6V peak to peak. MADI specifications also set asynchronous point-to-point communications at a maximum of 100Mb/s, which makes it possible to distribute over a LAN.
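The relationship between channel count, sampling rate and the 100Mb/s ceiling can be checked with simple arithmetic. The sketch below assumes every channel occupies one 32-bit sub frame per sample period and ignores any link-level coding overhead; the constant and function names are hypothetical, chosen only for this example.

```python
BITS_PER_SUBFRAME = 32          # one MADI sub frame per channel per sample
LINK_LIMIT_BPS = 100_000_000    # the 100Mb/s maximum cited above


def madi_payload_bps(channels: int, sample_rate_hz: int) -> int:
    """Audio payload rate for a given channel count and sampling rate."""
    return channels * BITS_PER_SUBFRAME * sample_rate_hz


def max_sample_rate_hz(channels: int) -> int:
    """Highest whole-number sampling rate that stays under the link limit."""
    return LINK_LIMIT_BPS // (channels * BITS_PER_SUBFRAME)


def aes3_streams(channels: int) -> int:
    """Each AES3 stream carries two audio channels."""
    return channels // 2


for channels, rate in [(64, 48_000), (56, 48_000), (32, 96_000)]:
    bps = madi_payload_bps(channels, rate)
    print(f"{channels} ch @ {rate} Hz = {bps / 1e6:.3f} Mb/s, "
          f"{aes3_streams(channels)} AES3 streams, "
          f"fits: {bps <= LINK_LIMIT_BPS}")

print(max_sample_rate_hz(56), max_sample_rate_hz(64))
```

Under these assumptions, 64 channels at 48kHz nearly fill the link, while 56 channels leave room to run the sampling rate well above 48kHz, which is consistent with trading the variable speed feature for extra channels.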
MADI over fiber, according to AES10, “should be graded-index fiber with a core diameter of 62.5µm, a nominal cladding diameter of 125µm, and a numerical aperture of 0.275 at a wavelength of 1300nm.” The fiber interface must comply with ISO/IEC 9314-3 specs. Compliance makes it possible for a reliable MADI connection up to approximately 1.8mi.
One MADI frame carries a sequence of 64 or fewer sub frames. One sub frame is 32 bits, which includes the standard 28-bit channel word (except preamble), as specified in AES3. Bits zero through 3 are used for frame sync, block start, channel ID and channel activity. Typically, channels A and B translate to left and right. In AES3, channel A is sub frame 1, and channel B is sub frame 2. If a channel is inactive, all its bits are set to zero.
Each sub frame also holds one audio sample and a Validity bit (V), a User bit (U), a Channel status bit (C) and a Parity bit (P) from one audio channel. While MADI audio data is essentially multiplexed AES3, there is one big difference. MADI frame synchronization is triggered by sync symbols outside the AES3 stream (bits zero to 3).
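To make the sub frame layout concrete, here is a minimal Python sketch that packs and unpacks one 32-bit sub frame. The exact assignment of the four mode bits (bit 0 frame sync, bit 1 channel active, bit 2 A/B channel ID, bit 3 block start), the even-parity rule over bits 4 to 31, and the choice to store bit 0 as the least significant bit of an integer are assumptions made for illustration and should be checked against AES10 and AES3. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Subframe:
    frame_sync: bool   # assumed bit 0: marks the first sub frame of a frame
    active: bool       # assumed bit 1: channel carries audio
    is_b: bool         # assumed bit 2: False = channel A, True = channel B
    block_start: bool  # assumed bit 3: start of a channel status block
    sample: int        # bits 4-27: 24-bit audio sample, stored unsigned
    validity: bool     # bit 28 (V)
    user: bool         # bit 29 (U)
    status: bool       # bit 30 (C)


def pack_subframe(sf: Subframe) -> int:
    """Build a 32-bit word; bit 31 is set so bits 4-31 have even parity."""
    word = (int(sf.frame_sync) | int(sf.active) << 1
            | int(sf.is_b) << 2 | int(sf.block_start) << 3)
    word |= (sf.sample & 0xFFFFFF) << 4
    word |= int(sf.validity) << 28 | int(sf.user) << 29 | int(sf.status) << 30
    ones = bin(word >> 4).count("1")   # count bits 4-30 (bit 31 still zero)
    word |= (ones & 1) << 31           # parity bit (P)
    return word


def unpack_subframe(word: int) -> Subframe:
    """Recover the fields from a 32-bit word (parity is not verified)."""
    return Subframe(
        frame_sync=bool(word & 1),
        active=bool(word >> 1 & 1),
        is_b=bool(word >> 2 & 1),
        block_start=bool(word >> 3 & 1),
        sample=word >> 4 & 0xFFFFFF,
        validity=bool(word >> 28 & 1),
        user=bool(word >> 29 & 1),
        status=bool(word >> 30 & 1),
    )


example = Subframe(True, True, False, False, 0x123456, False, False, False)
packed = pack_subframe(example)
print(f"{packed:032b}")
```

Keeping the four mode bits separate from the 28-bit AES3 portion mirrors the point above that MADI frame sync lives outside the AES3 data.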
MADI standards call for providing an independently distributed master sync signal to each MADI receiver and MADI transmitter, as defined by AES11. In other words, the timing of MADI equipment is controlled by the distributed master sync signal and not by MADI. AES standards and specifications are copyrighted. Further details are available at http://www.aes.org/publications/standards/search.cfm.
The author wishes to thank John Moran of John Moran Mastering, Houston TX, for his assistance in preparing this tutorial.