Chapter 16: Analog and Digital Processing

The mastering engineer must recognize when a recording is so good that the interests of the client are best served simply by leaving it alone. And there are recordings for which so little work is needed that the gains due to processing would not warrant the losses due to the same processing! For although equipment is getting better, there is no such thing as a transparent audio processor. This chapter is about how we measure and interpret performance, as there is an interaction between objective degradation and subjective improvement. Let's take a journey into the twilight zone between the objective and the subjective.

I. The Ironies of Perception vs. Measurement

Although we'll be using test measurements, we must remember that each single measurement only provides a small part of the picture. An audio processor is like an object inside a house with no doors, only a number of small windows that you can peer into. By looking at the object through each window's unique angle we can find out more, and add up the clues, but we can never be totally sure of what we are seeing, and must always leave open the possibility that there may be some aspect we cannot see, some mystery as to why this equalizer sounds "good" and this other one sounds "bad." For example, here are a couple of "objective" measurements that just don't add up!

What Makes it Sound Bright?

I've discovered a digital filter that measures "dull" but sounds bright! The TC Electronic System 6000 lets the user choose between different low-pass filters for the A/D and D/A converters. Some of the filters roll off significantly above 16 kHz (at 44.1 kHz sampling), so you'd think they would sound dull. But instead, to my ears, the 16 kHz filters called Natural and Linear sound more open and clear than the particular 20 kHz filter called Vintage.
However, there are other converters whose filters extend to 20 kHz and which sound even more open than the TC's Linear filter. So measured bandwidth cannot tell the whole psychoacoustic story. We look into the audible effects of filtering in Chapter 18.

The Fallacy of Typical Weighting Curves

We have equipment in our studio whose noise floor measures from as low as -i^o dBFS to as high as -50 dBFS (after A/D conversion). However, much of this equipment is perceptually quiet: if I have to put my ear up to the loudspeaker to hear the hiss, then I consider it insignificant. Interestingly, the weighting methods by which converter manufacturers commonly measure noise bear little relationship to human perception. One particular converter whose A-weighted noise floor is -108 dBFS sounds significantly quieter than another converter whose A-weighted noise floor is -115 dBFS! The reason is that the often-cited A-weighted curve does not adequately consider the ear's greater sensitivity in critical bands. It turns out that the converter which measures better (A-weighted) produces significantly more energy circa 3 kHz, where the ear is most sensitive, and the A-weighting filter does not take into account the significance of this critical band. To be psychoacoustically accurate, noise measurement standards should adopt a curve closer to the measured noise floor of the human ear, such as the 9th order curve used by some of the best-sounding dithers (see Chapter 4). This curve is called "F" weighting.

"Never turn your back on digital." (Bob Ludwig)

There are many other areas in which traditional measurements do not correlate with what our ears tell us, particularly in the evaluation of low bit rate coding systems.
These systems measure quite well with standard techniques, but once the ear has been trained to hear their errors, we can easily identify artifacts we've never heard before with analog technology, described by some as chirping, or space monkeys.

Let's see if we can objectively find out why some analog and digital processors sound better than others. Just remember that measurements look at an object through a few narrow windows, and there may be a different, or better, explanation for sound quality than what I've come up with.

II. Measurement Tools We Can Use While Mastering

FFT Measurements

FFT stands for Fast Fourier Transform. To really learn how to interpret (and not misinterpret) an FFT requires a college-level engineering course, and although I cannot claim to be such an expert, I have learned just enough to be dangerous! High-resolution FFT analyzers, such as SpectraFoo™, are very reasonably priced, thanks to the exponential increase in CPU power, and they provide an essential early warning system, a protection from the vicissitudes (bugs) of digital audio. "Never turn your back on digital," says Bob Ludwig, or as I say, you're only one mouse click away from disaster. It's a whole new world based on software designed by fallible human beings.

FFT for Music

Figure C16-01 in the Color Plate section shows SpectraFoo in action during a CD mastering session. At the middle top is a bitscope, currently showing 16 (and only 16) active bits, an indication that the dither generator is probably doing its job. This bitscope can reveal if some digital device is malfunctioning, since one of the symptoms of a dysfunctional processor is to toggle unwanted bits, or hold some bits steady when there is no signal. Bitscopes can also show if there are any unwanted truncations caused by defective or misused processors.
However, the bitscope is only one of the small windows we can look through; it can easily miss problems, or seem to indicate problems which require further interpretation. For example, some equalizers produce idle noise when the music goes to silence. This can be perfectly normal, but will show up on the bitscope as activity. Toggling the equalizer in and out while observing the bitscope will ascertain whether the equalizer is the source of the activity or there is some other anomaly in the signal chain.

At top right is a stereo position indicator, which is frozen at a moment when the information is slightly right-heavy. At left is a meter that conforms to the K-14 standard (see Chapter 15). The meter shows the hottest moment of a rather hot R&B piece (which I would have preferred to reduce, but the client desired it this hot!). For the record, this material was monitored at -8 dB, which really makes it K-12 material. Just below the bitscope is a correlation indicator, revealing that the material is significantly monophonic. I prefer a correlation indicator to an oscilloscope; meter deflections closer to the center of the scale indicate less correlation from channel to channel and likely a larger or more spacious stereo image. However, I always use my ears to confirm the image is not too "vague" and perform a mono (folddown) test to make sure the sound is mono-compatible.

At mid-screen is the spectragram, showing spectral intensity over time. This can be useful to identify the frequencies of problem notes, or simply to entertain visitors! At bottom is the spectrograph, whose general rolloff shape gives a vague idea of the program's timbre (though most times I disregard the spectral displays, since the eye candy of the visual display distracts our aural senses).
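Two of the displays described above, the bitscope and the correlation indicator, reduce to small calculations. The following is a minimal Python sketch of those ideas, not a description of how SpectraFoo implements them; the function names, the 24-bit word size, and the sample values are all assumptions made for illustration.

```python
import math

def active_bits(samples, wordlength=24):
    """Return the bit positions (0 = LSB) that change at least once.

    Samples are signed integers; masking gives their two's-complement
    bit pattern within the assumed wordlength.
    """
    mask = (1 << wordlength) - 1
    first = samples[0] & mask
    toggled = 0
    for s in samples:
        toggled |= (s & mask) ^ first
    return [b for b in range(wordlength) if (toggled >> b) & 1]

def correlation(left, right):
    """Zero-lag correlation: +1 is mono, near 0 is unrelated, -1 is inverted."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0

# Hypothetical 16-bit audio carried in a 24-bit word: every sample is a
# multiple of 256, so the bottom 8 bits never move.
padded = [256 * v for v in (3, -7, 120, -2048, 5)]
print(active_bits(padded))

# A channel compared with itself, and with its polarity-inverted copy.
tone = [math.sin(2 * math.pi * n / 50) for n in range(500)]
print(correlation(tone, tone), correlation(tone, [-x for x in tone]))
```

A reading where the lowest bits never move suggests a shorter wordlength or a truncation somewhere upstream, while a correlation value near zero corresponds to the meter hovering near the center of its scale.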
Figure C16-02 in the Color Plates shows SpectraFoo during a pause in the music, with only the bottom four bits toggling, confirming that the dither is working correctly, since dithers which use heavy noise-shaping exercise several bits. Note that the bitscope shows four bits toggling (since dither is random, in this snapshot bit 15 is at zero) and that the spectrograph shows the curve of the dither noise, which can be identified by its shape as POW-R type 3 or a similar 9th order curve. Using this analyzer, you can often determine the type of dither used by the mastering engineer on recorded CDs. The level meters had not decayed fully when this shot was taken. The correlation meter fluctuates very slightly near the meter's center, showing that the dither is uncorrelated between channels (random phase). I always glance at this display at the beginning and end of the program, to make sure no bugs or patching errors have crept in. I carry a SpectraFoo umbrella even if it's not raining!

III. Measurement Tools to Analyze your Equipment

Let's sort out what happens beneath the knobs. Just as in geometry the shortest distance between two points is a straight line, so in audio, both digital and analog, the cleanest signal path contains the fewest components. Converters used to be the most degrading pieces in the studio, and although they have greatly improved in recent years, we should still avoid extra conversion whenever possible. For analog tapes, it's best to do all the analog processing on the way to the first and only A/D conversion. But these days mixes are often on digital tape, and as there are a lot of desirable analog processors which the mastering engineer may prefer because they sound more organic than their digital equivalents, the tonal benefits of analog processing might outweigh the transparency losses of an extra conversion.
The best defense is a good offense, and it is possible to reliably measure signal below the noise with an FFT analyzer. An FFT can confirm whether a digital processor is truly bypassed when it says bypass; a processor that is not can be pretty deleterious (see Chapter 4). Jitter (see Chapter 19) is irrelevant to FFT analyzers, which strictly look at data. Even though the analyzer can only examine 24 bits (the limitation of the AES/EBU interface), it can measure distortion 40 dB below the 24-bit noise floor! This is because SpectraFoo is a 64-bit floating point system. So we can compare the distortion of processors which truncate at the 24th bit versus others which use 48 bits or so internally and then dither up to 24 bits. Whether we can hear these differences is a different question. Psychoacoustician J. Robert Stuart has demonstrated that we can hear a 24-bit truncation in an 18-bit system. The ear's dynamic range is approximately 20 bits (120 dB), but this varies with frequency. At certain frequencies we can even hear below 0 dB SPL!

How Many Bits is Enough?

In color plate Figure C16-03, we compare 16, 20, and 24-bit flat-dithered noise. The levels of all the "bins" add up, so at 16 bits, the curve which looks like it rides at approximately -134 dBFS (level of individual bins) totals to an RMS level of about -91.3 dBFS RMS, the theoretical limit of a properly-dithered 16-bit system. But discrete signals at some frequencies can be heard as low as -115 dBFS in a properly-dithered 16-bit system, below which they are buried in the noise. Psychoacoustically, for the vast majority of popular and classical music, 16 bits properly done are just enough to do the job right. But as soon as we post-produce, copy, process and change gain, we accumulate noise and need professional headroom, or perhaps we should call it footroom, since the top, at 0 dBFS, is a constant.
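The statement that the levels of the bins add up can be checked with a little arithmetic. The sketch below is illustrative only: the FFT size is an assumption (the text does not state it), analysis-window effects are ignored, and the helper names are invented for this example. It also shows the familiar rule of thumb that each bit of wordlength is worth about 6 dB.

```python
import math

def total_level_db(bin_level_db, n_bins):
    """Power-sum n_bins equal-level FFT bins into a single RMS figure."""
    return bin_level_db + 10 * math.log10(n_bins)

def bits_to_db(bits):
    """Approximate dynamic range of a wordlength: about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

# Assumption: a 32768-point FFT, giving 16384 bins up to the Nyquist frequency.
N_BINS = 16384

print(round(total_level_db(-134.0, N_BINS), 1))   # -91.9
for bits in (16, 20, 24):
    print(bits, "bits is about", round(bits_to_db(bits), 1), "dB")
# 16 bits is about 96.3 dB
# 20 bits is about 120.4 dB
# 24 bits is about 144.5 dB
```

Under these assumptions, a flat noise floor that appears to ride at -134 dBFS per bin sums to roughly -92 dBFS, within a fraction of a dB of the figure quoted above; a different FFT size or window would shift the per-bin level but not the total.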
Psychoacousticians studying the limits of the human ear have determined that 20 bits is enough for good A/D and D/A performance. Anything more is just gravy, and it's very rare to find a "24-bit" converter with better than 18-20-bit noise level. For processing, however, we need the additional footroom, better than 24 bits, because the frequency content of digital distortion is far more annoying to the ear than analog distortions which are much louder. This is because distortion created during digital processing yields harmonic components which beat against the sample rate, producing dissonant inharmonic beat or intermodulation products. For purist processing, we may need as much as 48 to 72 bits, especially for extreme gain changes, complex filtering, compression, or to avoid cumulative distortion when cascading processes. It's a myth that there's no generation loss in digital processing; little by little, bit by precious bit, sound suffers with every DSP operation.

(Footnotes to the preceding pages: losses from an extra conversion can be minimized using upsampling; see Chapter i. "Footroom" is a made-up word, not an official term!)

Figure C16-04 in the color plates shows the noise floor of a popular dither called POW-R type 3 at 16-bit (red trace). For reference, we show the noise of flat 20-bit dither (orange), and 24-bit dither (green). POW-R's shape is designed to maximize performance by keeping the noise at or near the ear's low-level sensitivity at various frequencies. POW-R dither reaches 20-bit performance in the critical upper midrange (circa 3.5 kHz) where the ear is most sensitive. Thus, much of the low level ambience and reverberation that would have been masked is revealed, even with 16-bit reproduction. This performance can only be achieved by recording at a longer wordlength to begin with, as noise accumulates and the SNR gets slightly worse when you add final dither to the processed source.

Analog versus Digital Processing

Cheap versus Good... Is It Really Accurate?
Many people have argued that the reason we notice harshness in some digital recordings is that digital audio recording is more accurate than analog. Their claim is that the accuracy of digital recording reveals the harshness in our sources, since digital recording doesn't compress (mellow out) high frequencies as does low speed (15 IPS) analog tape. Accuracy, they say, is why we have regressed to tube and vintage microphones. But I say this is only a half-truth, since most of these arguments come from individuals who have not been exposed to the sound of good digital recording equipment, which is not only accurate, but can even be warm and pretty. Cheap digital equipment is subject to edgy sounding distortion which can be caused by sharp filters, low sample rates, poor conversion technology, low resolution (short wordlength), poor analog stages, jitter, improper dither, clock leakage in analog stages due to bad circuit board design, and many other causes, such as placing sensitive A/D and D/A converters inside the same chassis with motors and spinning heads. It takes a superior power supply and shielding design to make an integrated digital tape recorder that sounds good: compare the sound of an inexpensive modular digital multitrack (MDM) with the Nagra Digital recorder, 4 very expensive tracks versus 8 cheap ones.

MYTH: It's a digital processor, so there's no generation loss.

MYTH: It's a digital console. It must be better than my old analog model!

When it comes to processing, numeric precision is also expensive, even though it's all software. Numeric imprecision in digital consoles produces problems somewhat like noise in analog consoles, but there is an important difference: noise in analog consoles gradually and gently obscures ambience and low-level material and usually does not add distortion at low levels.
However, numeric imprecision in digital consoles causes quantization errors (which increase at low levels), destroying the body and purity of an entire mix and creating an edgy, colder sound, which audiophiles call digititis. Since digital consoles do not make sound warmer, depending on the quality of their digital processing (and the number of passes through that circuitry) it might be better to mix through a high-quality analog console. Even though good digital equipment is getting cheaper at an exponential rate, it is still expensive to produce excellence in digital recordings. That's why analog tape and analog mixing remain very much alive at this point in the 21st century.

Two Fine Equalizers, One Analog, One Digital

In my opinion, much inexpensive tube equipment is overly warm, noisy, unclear and undefined, and the common use of "fuzzy" analog equipment to cover up the problems of inexpensive digital equipment is a band-aid, not a cure for the loss of resolution. Not many people have been exposed to recent audiophile-quality tube equipment, and only the best-designed tube equipment has quiet, clear sound and tight (defined) bass, and is transparent and dimensional, yet still warm. Audiophiles feel a well-designed tube circuit can be more linear and resolving than a low-cost solid state circuit. I certainly feel I hear more through some amplifiers than others. Modern-day tube designers often make innovative use of low-noise regulated power supplies on filaments and cathodes, a practice which was impractical in the 50's.

Figure C16-05 in the Color Plates section shows the low distortion and noise performance of a well-designed, popular state-of-the-art analog tube equalizer, the Millennia NSEQ-2 (red trace). For reference, 20- and 24-bit noise are shown in blue and green, respectively. Notice that the tube noise of the NSEQ is about 12 dB greater than 20-bit, making it a virtual 18-bit analog equalizer.
However, this performance is dependent on the analog gain structure used. If you drive the equalizer harder, its noise floor will lie lower compared to maximum signal, and distortion may or may not be a problem. Since the Millennia's clipping level is around +37 dBu, it may be perfectly legitimate to drive it with nominal levels of +10 dBu or even higher, provided the source equipment doesn't overload! Yet even with nominal levels of 0 dBu, as was used for this graph, this tube equalizer is extremely quiet. Its noise is inaudible at any reasonable monitor gain unless you put your ears up to the speaker, demonstrating that noise floor is probably the least of our worries. 1/2" 30 IPS 2-track analog tape has even higher noise, but no one complains about it for popular music.

"Audio processing is the art of balancing subjective enhancement against objective degradation." (Bob Olhsson)

For this FFT, we set up a D/A converter, feeding the NSEQ and then an A/D and the FFT. A digitally generated 1 kHz, -6 dBFS, 24-bit dithered sine wave feeds the D/A. We adjust converter gain so 0 dBFS is +18 dBu, and boost the equalizer about 6 dB, till just below A/D clipping. The equalizer is coasting at this level, since it's around 19 dB below its clip level! If you are looking for extreme "tubey" effects, you can drive the equalizer even harder, and also realize a greater SNR, provided the converters can handle the hotter level; certainly the equalizer can. Notice that the equalizer's distortion is dominated by second, third, and fourth harmonics, which tend to sweeten sound.

For comparison, in yellow is the performance of the superb Z Systems digital equalizer, dithered to 24 bits, boosting 1 kHz 5.8 dB with a Q of 0.7. Its harmonic distortion performance is textbook-perfect (no visible harmonics on the FFT). Some engineers use the word "dry" to describe the sound of a component that has little or no distortion.
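The gain structure of the test setup described above can be followed with simple decibel bookkeeping. This is only a sketch under the figures quoted in the text (0 dBFS calibrated to +18 dBu, a -6 dBFS tone, about 6 dB of boost, about 19 dB of remaining headroom); the function and variable names are invented for illustration.

```python
def dbfs_to_dbu(level_dbfs, dbu_at_full_scale=18.0):
    """Convert a digital level to the analog level at the converter output."""
    return level_dbfs + dbu_at_full_scale

TONE_DBFS = -6.0     # test tone level quoted in the text
EQ_BOOST_DB = 6.0    # approximate boost, "till just below A/D clipping"
HEADROOM_DB = 19.0   # "around 19 dB below its clip level"

eq_input_dbu = dbfs_to_dbu(TONE_DBFS)             # level leaving the D/A
eq_output_dbu = eq_input_dbu + EQ_BOOST_DB        # level leaving the equalizer
implied_clip_dbu = eq_output_dbu + HEADROOM_DB    # clip level these figures imply

print("EQ input :", eq_input_dbu, "dBu")      # 12.0
print("EQ output:", eq_output_dbu, "dBu")     # 18.0
print("Implied clip level:", implied_clip_dbu, "dBu")   # 37.0
```

Because the equalizer's own noise floor stays put while the signal rises, every extra dB of drive (up to the clip point, and only if the converters can accept it) buys roughly one dB of signal-to-noise ratio, which is the trade described in the paragraph above.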
Looking through other "windows" we find that harmonics are far from the only sonic differences between these pieces of gear. Tubes, power supplies and transformers can loosen the bass, which can sometimes be desirable; the digital equalizer retains the tightness of the bass; the digital and analog equalizers' curves are also different, though the ZQ-2 does a nice job of simulating the shapes of gentle analog filters. (Since digital equalizers don't soften the bass like some tube units, you may wish to "loosen" the bass with compression or some other tool.) Equalizer curve shape and phase shift probably make up other areas of delicate sonic difference between models of equalizers. The premium price of both the ZQ-2 and the NSEQ reinforces my point that high-quality analog or digital recording is expensive. As of this writing, I expect it will be a number of years before there's enough power in a typical computer plug-in to come up to the quality of the best outboard processors.

"Nasty" Digital Processors

Truncation distortion can be fairly "nasty." For example, in Figure C16-06 of the Color Plates section, we compare the analog Millennia NSEQ (orange trace) versus the digital Z Systems set to truncate at 20 bits, no dither (black trace). Don't try this at home! I think there are better ways to add grunge than turning off the dither. Much of the ambience, space, and warmth of the original source have been truncated, lost forever, converted to low level grunge (severe inharmonic distortion and noise). Even a small amount of non-harmonic distortion can be bothersome. Which sounds better, an analog processor with a smooth but higher noise floor, plus second and third harmonic distortion, or an undithered digital processor with a lower average noise floor plus inharmonic distortion?

Poorly-implemented digital compressors produce severe inharmonic distortion, which is without integer relationship to the fundamental.
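The difference between an undithered and a dithered wordlength reduction can be reproduced numerically. The following is a small Python sketch, not a model of any processor named in this chapter: it requantizes a very low-level tone by rounding (plain truncation behaves similarly apart from a DC offset), with and without TPDF dither, and measures the third-harmonic bin with a one-bin DFT. The tone level, length, and random seed are arbitrary assumptions.

```python
import cmath
import math
import random

N = 4096          # analysis length in samples
CYCLES = 64       # whole cycles in N samples, so harmonics fall on exact bins
AMPLITUDE = 0.6   # tone peak, in LSBs of the target wordlength (assumed)

def tone(n):
    return AMPLITUDE * math.sin(2 * math.pi * CYCLES * n / N)

def requantize(x, rng=None):
    """Round to the nearest LSB; add +/-1 LSB TPDF dither first if rng is given."""
    dither = (rng.random() + rng.random() - 1.0) if rng is not None else 0.0
    return math.floor(x + dither + 0.5)

def bin_amplitude(signal, k):
    """Amplitude (in LSBs) of the sinusoid at bin k, via a one-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / len(signal))
              for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

rng = random.Random(1)
plain = [requantize(tone(n)) for n in range(N)]
dithered = [requantize(tone(n), rng) for n in range(N)]

third = 3 * CYCLES
print("3rd harmonic, no dither:", round(bin_amplitude(plain, third), 3), "LSB")
print("3rd harmonic, dithered :", round(bin_amplitude(dithered, third), 3), "LSB")
print("fundamental, dithered  :", round(bin_amplitude(dithered, CYCLES), 3), "LSB")
```

Without dither the requantized tone becomes a crude three-level staircase whose error is locked to the signal, so a large distortion component appears; with dither the same bin should hold only a small share of benign random noise while the fundamental survives at close to its original level. With a tone frequency that does not divide evenly into the sample rate, the undithered error products would no longer sit on tidy harmonics, which is one way to picture the inharmonic grunge described above.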
Figure C16-07 in the Color Plates compares two digital compressors, both into 5 dB of compression with a 10 kHz signal. In orange is a single-precision, non-oversampling compressor, and in black a double-sampling compressor implemented in 40-bit floating point. Note the single-precision compressor produces many non-harmonic aliases of the 10 kHz signal, especially in the critical midband. Nasty sounding first-generation compressors are still common in low-cost digital consoles and DAW plug-ins. It takes a lot of processing power to double-sample. I'm convinced that the proliferation and misuse of cheap digital processing has degraded the sound quality of much recently-recorded music.

The Magic of Analog?

Static distortion measurements don't explain every reason why some compressors sound excellent and others hurt your ears. There are analog processors which are so magical that though they are not transparent, they add an interesting and exciting sonic character to music, or to put it another way, their subjective cure is better than their objective disease. Analog tape recording is a perfect example of this type of process; measured objectively it's noisy and distorted, but subjectively it can kick ass!

If psychoacoustic research had been a bit more advanced on the audible effects of masking distortion and noise, then perhaps we might not have pursued this expensive search for 144 dB extremes. For example, the noise floor of the Sony-Philips DSD system is not particularly special (about 120 dB in the audible band), but it sounds excellent, indicating that low noise must not be our only goal. We may even conclude that part of the good sound is due to masking; maybe -120 dB is just enough to cover the ugly parts of the distortion of even some of our best analog and digital gear.
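Returning to the compressor comparison earlier in this section: the reason the distortion of a 10 kHz tone can land in the midband comes down to folding arithmetic. The sketch below is a generic illustration of aliasing, not a model of either compressor; the function names and the choice of harmonics 2 through 7 are assumptions for the example.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency at which a component at f_hz appears after sampling at fs_hz."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

def harmonic_landing_spots(f0_hz, fs_hz, max_harmonic=7):
    """Where harmonics 2..max_harmonic of f0_hz end up at sample rate fs_hz."""
    return [(h, alias_frequency(h * f0_hz, fs_hz))
            for h in range(2, max_harmonic + 1)]

for fs in (44100, 88200):
    print(fs, harmonic_landing_spots(10000, fs))
# 44100 [(2, 20000), (3, 14100), (4, 4100), (5, 5900), (6, 15900), (7, 18200)]
# 88200 [(2, 20000), (3, 30000), (4, 40000), (5, 38200), (6, 28200), (7, 18200)]
```

At 44.1 kHz the third and higher harmonics generated by a nonlinear process fold back to frequencies such as 14.1, 4.1 and 5.9 kHz, none of which is an integer multiple of 10 kHz. At 88.2 kHz most of the same products stay above 20 kHz, where a final sample rate conversion can filter them out, although very high-order products can still fold back into the audible band.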
In addition, noise-free recording media can be very sterile sounding because all the nits and cracks and distortions caused by the musicians and their amplifiers are completely revealed by the quiet media. So, sometimes, adding extra noise can be more beneficial to the music than working noise-free. Perhaps this is one of the many reasons why analog tape sounds more musical to many people: noise can be very euphonic. We should certainly experiment with noise-masking and make our decisions on what is best for the music. [Please see sidebar, Clarity or Fuzz.]

I think that many classic analog compressors' warm, fat yet clear sound signatures come from a unique combination of attack and release characteristics, which may be emulated in a digital processor. There are some plug-ins which emulate classic analog compressors, but to my ears they do not come up to the job; I think they will get better over time when the cost of DSP goes down. Currently, plug-in designers are forced to minimize the DSP load of their processors or users complain they can't fit a plug-in on every channel strip (as if this is desirable). Certainly the Weiss digital compressor does not sound digital, so we know that it can be done with programming skill and expensive DSP.

An Analog Simulator: Pick Your Flavor of Grunge

Figure C16-08 in the Color Plates compares the NSEQ to the Cranesong HEDD-192, a digital analog simulator of excellent sound quality. The Cranesong (blue trace) has been adjusted to produce a remarkably similar harmonic structure to the NSEQ. For this graph, its levels have been purposely set to produce more distortion than the Millennia was producing. Amazingly, the ear thinks it's hearing an excellent analog processor without any imaging or resolution loss. But the low-level grunge at the bottom of the picture looks mighty suspicious; looking through this "window" you might think the Cranesong was truncating important information.
But two important factors ameliorate this. First, the Cranesong's grunge is about 12 dB lower than that of a truncated device and thus is likely masked by the noise and the euphonic harmonics. Secondly, the HEDD has a unique summing internal architecture that does not alter, truncate or recalculate the original source signal. The Cranesong clones the original source and sends that to its output, while mixing in the calculated distortion, thereby largely preserving the ambience and space of the original. The low level distortion in the figure is part of the additive distortion signal and not a result of recalculations to the source. In other words, only the distortion is distorted! We took this measurement first at 44.1 kHz, then at 88.2 and 96 kHz. As you can see in the two figures on the next page, at 96 kHz the low level grunge is virtually gone, and the Cranesong's distortion is even cleaner, if that's not a contradiction in terms!

Cooking Better Sound, Naturally

There are certain analog consoles whose character is highly prized because they add spice, dimension and even punch to a mix. One name that comes to mind is API, which to my ears has an excellent combination of desirable linearities (like headroom and bandwidth) and nonlinearities. I think the subtle "grit" in their discrete opamps could even be slight intermodulation distortion, which does just the right thing for rock and roll yet is subtle enough for jazz and classical depending on how you drive the stages (a matter of taste). I think the transformers add some punch or fattening via saturation and 2nd and 3rd harmonic distortion as well as some upper harmonics and a touch of phase shift (which could add some dimensionality). Our role as mastering engineer is like that of the master chef who knows just how much and what kind of spice is useful to add pizzazz without overcooking or spoiling the flavor. By the middle of our careers we have collected a sizable analog and digital spice rack!
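The idea of passing the source through untouched and mixing in a separately calculated distortion signal can be illustrated with a toy example. To be clear, this is not the HEDD's algorithm, which the text does not disclose; it is a generic polynomial sketch, with invented names and arbitrary amounts, showing how a squared term yields a second harmonic and a cubed term a third.

```python
import cmath
import math

N = 1024
CYCLES = 8   # whole cycles in N samples, so harmonics fall on exact bins

def bin_amplitude(signal, k):
    """Amplitude of the sinusoid at bin k, via a one-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / len(signal))
              for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

def add_harmonics(x, second=0.0, third=0.0):
    """Return the unaltered input plus small amounts of x**2 and x**3."""
    return x + second * x * x + third * x ** 3

def harmonic_db(signal, h):
    """Level of harmonic h relative to the fundamental, in dB."""
    return 20 * math.log10(bin_amplitude(signal, h * CYCLES) /
                           bin_amplitude(signal, CYCLES))

dry = [math.sin(2 * math.pi * CYCLES * n / N) for n in range(N)]
with_second = [add_harmonics(x, second=0.1) for x in dry]
with_third = [add_harmonics(x, third=0.1) for x in dry]

print("2nd harmonic:", round(harmonic_db(with_second, 2), 1), "dB")   # -26.0
print("3rd harmonic:", round(harmonic_db(with_third, 3), 1), "dB")    # -32.7
```

The figures follow from the identities sin²θ = (1 − cos 2θ)/2 and sin³θ = (3 sin θ − sin 3θ)/4. Note that the squared term also adds a small DC offset and the cubed term slightly raises the fundamental, details a real processor would have to manage.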
The Cranesong can mimic three types of naturally-occurring analog distortion, called Triode, Pentode and Tape. The triode control adds a pinch of salt, pure second harmonic, which, being the octave, is quite subtle, almost inaudible with some music. It can clear up the low end by adding some definition to a bass, but it can also thin out the sound too much. The pentode is extremely versatile; it provides both salt and pepper. At lower levels it adds third and fifth harmonics, which are dangerously seductive, producing a unique presence boost and brightness with little grunge or digititis, especially at 96 kHz SR (pictured). At higher levels, additional odd harmonics add grit and some fatness, like an overdriven pentode tube: a Marshall amplifier in a 1U rack-mount box! Past the fifth, subtle amounts of seventh and ninth harmonics add a sometimes desirable "edge."

The Cranesong's tape control is the sugar, which when mixed in, can sweeten the pentode pepper, yielding flavors from red to yellow, green or Jalapeno! The celebrated third harmonic (an octave plus a fifth) sweetens and fattens the sound, much like analog tape. Tape also produces the fat sound of analog tape, which helps to "glue" a mix together. Tape can help digitally-mixed sources that may be well-recorded but miss some of that "rock and roll fatness." The control produces largely second and third harmonic distortion, but as it's advanced, it adds some higher harmonics, emulating analog tape performance. Too much sugar gives slow, muddy molasses, a rarely desirable quantity, but available if you need it. But just a light amount can act as a sweet-sounding band-aid to ameliorate truncated or edgy recordings. Regardless, space and depth have been permanently lost if there was truncation prior to the use of the Cranesong. No one is sure why, but critical listeners have observed that adding delicate amounts of harmonic distortion in just the right proportion appears to enhance the depth and clarity in a recording. The trick is to know the exact amount.

Figure caption: Comparing the Cranesong HEDD-192 in Pentode mode at two different sample rates with a 10 kHz, -15 dBFS test tone. At top, 44.1 kHz SR; at bottom, 96 kHz. Note the different frequency scales, since the higher sample rate displays harmonic frequencies of the audio signal up to 48 kHz.

Sidebar: Clarity or Fuzz, which is best?

There's nothing wrong with using fuzz if it produces the right esthetic result. With high-resolution digital recording, tube equipment can add a nice flavor. Or, it can be used as a useful cover-up, a fuzzy band-aid. A client once told me, "Bob, your mastering is so much clearer than the mix, I'm starting to hear all the mistakes!" Yes, high-resolution processing revealed more and more of the source, but this came at a price: all the warts were revealed. I solved the problem by fuzzing up the sound slightly with some delicate tape-style harmonics. For if the performance is not the absolute best, or the mix is not wonderful, or the sound is just better when it's not perfectly clear, then fatness, masking fuzz, or analog distortion magic may be just the right approach for the music. In mastering I usually prefer to accomplish this by first passing the signal through the highest resolution electronics, which add little or no distortion, and then adding a touch of the fuzzy sauce with a selectively fuzzy component or a noisy dither. This approach is methodical, controllable, and reversible. Clearly, artful use of noise can mask and therefore ameliorate some low-level distortions. Ironically, digital recording's super low noise may be its greatest enemy.

Single Precision, Double Precision, or Floating Point?

First-generation digital processors gave digital processing a bad name.
But single precision 24-bit processors are going the way of the Dodo, at least in respectable audio equipment. All things being equal (and they never are), 32-bit floating point processors are generally regarded as inferior-sounding to 48-bit (double-precision fixed) and 40-bit float. Some newer floating-point devices, such as the software program ChannelStrip by Metric Halo, work in 64-bit and have impressively low measured distortion. However, one designer, Z-Systems, has produced a 32-bit floating point digital equalizer using proprietary distortion-reducing techniques that sounds very good and measures as well as some other equalizers using longer wordlengths. Ultimately the skill of the designer determines how nice the device sounds. The mathematics involved are not trivial, and the designer's choice of filter coefficients can make as much difference as his choice of wordlength.

Figure C16-09 in the Color Plates shows that with a single precision processor, even a simple gain boost can ruin your digital day. A dithered 24-bit 1 kHz tone at -11 dBFS is passed through two types of processors, each boosting gain by 10 dB. The distortion of the single precision processor (red trace) is the result of truncation of products below the 24th bit. Nevertheless, the highest distortion product, at -143 dBFS, is extremely low. I believe the sound of a single 24-bit truncation may not be audible, but cumulative truncation adds enough inharmonic distortion to become annoying to the sensitive ear. In blue we compare the perfectly clean output of a 40-bit floating point processor which dithers its output to 24 bits. I measured similar performance with a 48-bit (double precision) processor and a 32-bit floating point processor, which both dither to 24 bits.

Double Sampling?
The most advanced digital equalizers and dynamics processors use double sampling technology, which means that the internal sampling rate is doubled to reduce aliasing distortion. High-quality linear phase filters are used in the internal sample rate converters. I'm not certain this has audible meaning for equalizers,6 but dynamics processors benefit because non-linear processing generates severe aliases of the sampling rate, and the higher the sample rate, the less aliasing.

Figure C16-10 in the Color Plates compares two excellent-sounding digital dynamics processors, the oversampling Weiss DS1-MK2, which uses 40-bit floating-point calculations, and the standard-sampling Waves L2, which uses 48-bit fixed point. To compare apples to apples, both processors are limiting by 3 dB, with the Waves in red, and the Weiss in green, set to 1000:1 ratio. Note that the oversampling processor exhibits considerably lower quantization distortion. However, the switchable safety limiter of the Weiss, which is not oversampled, produces considerable alias distortion even at 1 dB limiting (orange trace). At 88.2 kHz and above (not shown), the Weiss safety limiter and the Waves perform measurably better, and double sampling may not be needed. Thus there is considerable advantage in doing all our processing at higher rates, which moves the distortion products into the inaudible spectrum above 20 kHz. Then, sample rate convert to 44.1 kHz during the last step, which filters out most of the high-frequency byproducts.

Despite the measured differences, the "window" we've chosen (steady-state sinewave performance) probably has little to do with the perceived performance of these two excellent-sounding limiters, because steady-state measurements have little or no relationship to the audible performance of limiters. I believe the key to the ear's reaction is the duration of the limiting action. In typical use, limiters go into gain reduction for a very short time.
At limiting ratios of 1000:1, with instantaneous attack and fast release, these processors produce only momentary distortion, shorter than the human ear's sensitivity to distortion (about 6 ms according to some authorities). But if a user overpushes a limiter so that it is working on the RMS levels of the material as well as the peaks, then its sinewave-measured distortion becomes audibly significant. Compressors, however, are different animals, and double sampling is critical for them, because a compressor may be into gain reduction for a good percentage of the time. I feel that double sampling contributes to the Weiss's robust and warm sound when used as a compressor. While Heavy Metal recordings employ considerable distortion for effect, classically they employ analog processors for this purpose to avoid the inharmonic aliases of typical digital processors.

Better Measurement Methods?

It should be clear by now that we can easily measure simple phenomena that are probably too subtle to hear (such as single tone harmonic distortion near the 24 bit level). But we can hear (perceive) very complex phenomena that are difficult to describe with measurements (such as the sound quality of one equalizer versus another). To better describe such complex audible phenomena, we will need psychoacoustically-based measurement instruments that have not yet been invented. Current research and development of coded audio such as MP3 (which benefits from the ear's masking) could lead to better noise and distortion analyzers that can discriminate between distortion we can and cannot hear.
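As a rough illustration of how simple single-tone distortion near the wordlength limit can be measured, here is a minimal Python sketch. It is not the measurement setup used for the color plates. It assumes a deliberately exaggerated case: a 1 kHz tone whose peak is only 0.4 LSB of the output wordlength, requantized once by plain truncation and once with triangular (TPDF) dither. The function names, the 0.4 LSB level, and the random seed are all hypothetical choices made for this example.

```python
import math
import random


def harmonic_amplitude(samples, freq_hz, sample_rate):
    """Amplitude of one sine component, found by correlation (a single-bin DFT)."""
    n = len(samples)
    w = 2 * math.pi * freq_hz / sample_rate
    re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n


def requantize(samples, dither, rng):
    """Reduce samples (expressed in LSBs of the output word) to whole LSBs."""
    out = []
    for s in samples:
        if dither:
            tpdf = rng.random() - rng.random()      # triangular PDF, +/-1 LSB peak
            out.append(math.floor(s + tpdf + 0.5))  # add dither, then round
        else:
            out.append(math.floor(s))               # plain truncation
    return out


SAMPLE_RATE = 44100
TONE_HZ = 1000
rng = random.Random(16)  # fixed seed so the run is repeatable

# One second of a 1 kHz tone peaking at only 0.4 LSB (assumed, exaggerated level).
tone = [0.4 * math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE)
        for i in range(SAMPLE_RATE)]

truncated = requantize(tone, dither=False, rng=rng)
dithered = requantize(tone, dither=True, rng=rng)

for name, signal in (("truncated", truncated), ("dithered", dithered)):
    h1 = harmonic_amplitude(signal, TONE_HZ, SAMPLE_RATE)
    h3 = harmonic_amplitude(signal, 3 * TONE_HZ, SAMPLE_RATE)
    print(f"{name}: fundamental {h1:.3f} LSB, third harmonic {h3:.3f} LSB")
```

In this sketch the truncated tone collapses into a square-like wave with a clearly measurable third harmonic and the wrong fundamental level, while the dithered version keeps the fundamental close to 0.4 LSB and leaves only noise at the third harmonic. The numbers are easy to read off a meter, yet they say nothing about whether a listener would hear the difference, which is the gap this section describes.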
The Bonger: A Listening Test

Since current steady-state sine-wave measurements are misleading when measuring nonlinear processors like compressors, a more effective measurement method is by listening, using the gonger, aka bonger, originally developed by the BBC's Chris Travis and available on a test CD from Checkpoint Audio (see Appendix 10). This test is a pure sine wave that modulates through various amplitudes, in order to exercise and reveal any amplitude non-linearities in the signal path. Just play the bonger through the device under test and listen to the output for noise modulation, buzz or distortion.

Identity Testing: Bit Transparency

Any workstation that cannot make a perfect clone should be junked. The simplest test is the identity test, or bit-transparency test. Set a digital equalizer to flat and unity gain, then test to see if it passes signal identical to its input. Some people scoff at this test, since analog equipment almost never produces identical output. But the test is important, since digital equipment can produce egregious distortion, as we have seen. The bitscope can aid in null testing: it is quite likely that a device is bit-transparent if you selectively put in 16 bits, then 20, then 24, and get out the same as you put in. You can also watch a 16 or 20-bit source expand to 24 bits when the gain changes, during crossfades, or if any equalizer is changed from the 0 dB position. A neutral console path is a good indication of data integrity in a DAW. After the bitscope, your next defense is to perform some basic tests: for linearity, for distortion with the FFT, and finally, for perfect clones (perfect digital copies). The null test confirms bit-for-bit identity: play two files at the same time, inverting the polarity of one and mixing the two together. There must be zero output or the two files are not identical.
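The null test just described can be sketched in a few lines of Python. This is a minimal sketch, assuming both files have already been decoded into equal-length lists of integer sample values. The function name `null_test` and the synthetic 24-bit test tone are hypothetical, chosen only for illustration.

```python
import math


def null_test(a, b):
    """Mix a with polarity-inverted b.

    Returns (identical, peak_residual, index_of_first_difference).
    """
    if len(a) != len(b):
        raise ValueError("both files must contain the same number of samples")
    residual = [x - y for x, y in zip(a, b)]  # a + (-b), sample by sample
    peak = max((abs(r) for r in residual), default=0)
    first = next((i for i, r in enumerate(residual) if r != 0), None)
    return peak == 0, peak, first


# Synthetic 24-bit source: 0.1 second of a 1 kHz tone at half of full scale.
FULL_SCALE_24 = 2**23 - 1
source = [round(0.5 * FULL_SCALE_24 * math.sin(2 * math.pi * 1000 * n / 44100))
          for n in range(4410)]

clone = list(source)     # a perfect digital copy
altered = list(source)
altered[100] += 1        # a single LSB changed in one sample

print(null_test(source, clone))    # (True, 0, None)
print(null_test(source, altered))  # (False, 1, 100)
```

Working on integer sample values rather than floating-point audio keeps the subtraction exact, so even a one-LSB difference in a single sample shows up as non-zero output.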
Since designers are fallible human beings, you should carry out basic tests on your DAW for each software revision.

Choose your Weapon

So, which to use, analog or digital processing? A few years ago, I didn't like the sound of cumulative digital processing. I could tolerate a couple of the best-designed single-precision units in series. After that, it was back to analog. If processing digitally, be aware of the weaknesses of the equipment. Until manufacturers adopt more powerful processors, and processing power catches up, limit the number of passes through any digital system. Each pass will sound a little bit colder, even using 24 bit storage. A mix made through a current-day digital console may or may not sound better than one made through a high-quality analog console, depending on several factors: the number of passes or bounces that have been made, the number of tracks which are mixed, the quality of the converters which were used, the outboard equipment, and the internal mixing and equalization algorithms in the digital console. While no console equalizer currently has the power of a $6000 Weiss, economically it's a lot simpler to replicate a good equalization algorithm for 144 channels than to perform the equivalent in analog hardware, so there is hope for the digital console's future, when silicon will be cheaper.
And there's no turning back; 24-bit recording and high sample rates are taking over, and they sound better. So for mastering we can choose from the best of several worlds, and we make our choices by balancing the benefits and the losses:
• (some) very transparent, low-noise, pure-sounding digital gear
• (some) good-sounding, reasonably transparent, low-noise analog gear that we can use to add a little sugar, salt, pepper, or spice, or simply to prevent the sound from getting colder
• a digital processor that simulates analog distortion or warmth.

Why Is Good DSP So Expensive?

Intellectual property is the most nebulous thing to a consumer. It's easy to see why a two-ton Mercedes Benz costs so much, but the amount of intellectual work that has gone into a one-gram IC is not so obvious. It can take five man-years to produce good audio software, created by individuals with ten or more years of schooling or experience. Similarly, when the doctor takes ten minutes to examine you, prescribes a 10-cent pill and then presents you with a $100 invoice, remember you're paying for all that knowledge and experience. This doesn't mean I'm against socialized medicine; I just want to re-emphasize the reasons why intellectual property and good DSP are so expensive.

The Source-Quality Rule

An important corollary of this discussion is the source-quality rule: Source recordings and masters should have higher resolution than the eventual release medium. Always start out with the highest resolution source and maintain that resolution for as long as possible into the processing. When mastering, one consequence of this rule is to reduce the number of generations and copies, and if possible, go back one or more generations when a new process must be added or applied. This rule even applies when you're making an MP3 or other data-reduced final result. Consider a lossy medium like the (rapidly obsolescing) analog cassette.
Dub to cassette from a high quality source, like a CD, and it sounds much better than a copy from an inferior source, like FM radio, because it avoids cumulative bandwidth losses, and wider bandwidth sounds better. In other words, the higher the audio quality you begin with, the better the final product, whether it's an audiophile CD, a multimedia CD-ROM, MP3, or a talking Barbie doll. It may seem funny, but you'll never go wrong starting at 96 kHz/24 bit if the product is to end up on 44.1 kHz/16 bit CD. Sample rate conversion should be the penultimate process, followed by dithering.

In Summary

Mastering engineers do not have to think about the meaning of life every time they perform their magic; many engineers simply plug in their processors, listen, and make music sound better. But I also like to consider just why things sound better, because it helps me avoid problems that are not obvious at first listen, and also dream up innovative solutions. I hope that this chapter has inspired you to dream up some innovations of your own!

1 See the Appendix for references on noise filters. Ironically, all the standard noise-weighting filters should be revised, because they have no relationship with human perception of very quiet devices such as A/D and D/A converters.

2 And even then, the F curve is an approximation, since the ear's perception of noise is much more than just a frequency response curve, as Jim Johnston explains. Noise should be measured separately in each critical band and compared to the ear's threshold for that critical band.

3 Most of the SpectraFoo™ screenshots were taken at an FFT resolution of 64k points (32000 "bins") with about 4 second average time and Hanning weighting. The actual amplitude of details on an FFT depends on its resolution, so FFTs are only directly comparable if the same methods are used.

4 The term resolving, when applied to the sound of tube circuits, is itself an unquantifiable audiophile subjective term.
It's fair to say that audiophile negative reactions to some ugly-sounding solid-state circuits use inexact terms such as resolution and transparency, which may be proved to be simply distribution of harmonics or differences in frequency response. And maybe not!

5 For the curious, K-Stereo and K-Surround do not use harmonic distortion to enhance depth. They use other psychoacoustic principles.

6 The makers of the double-sampling Weiss Equalizer, GML plugin, and the Audiocube do feel that double sampling is important for equalizers. Some engineers like the sound of high frequency curves that extend beyond 20 kHz, even if that is later cut off when the sample rate is halved at the output of the equalizer. And Jim Johnston (in correspondence) states that when a digital filter has response extending to half the sampling rate, it can produce some really odd and unexpected frequency responses, indicating that double sampling is important for such types of equalizers.