Positive Feedback ISSUE 66
March/April 2013

How to Make PCM Sound Like DSD
by Lynn Olson

This article is aimed at DAC designers, and is based on conversations over at

http://www.whatsbestforum.com/showthread.php?9598-DSD-comparison-to-PCM/page52

The proposal is called Big Ultrasonic Dither, or BUD for short. It's pretty simple: the incoming PCM modulation is reduced by 6dB (for ladder/R2R converters), or 12dB (for delta-sigma converters). Independent (uncorrelated) left and right dither is added at the -6dB level at the highest possible frequency, approximately an octave wide, with very steep attenuation skirts on either side. It's essentially triangular dither, but with much steeper slopes and a (much) higher modulation level.

If the converter is running at 16fs, or 705.6kHz, the center of the dither frequency should be half that, or 352.8kHz. The higher the converter speed and associated dither frequency, the better the technique works.
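
As a quick worked example of the numbers above, the dB figures can be turned into linear gain factors and the dither center frequency computed from the converter rate. This is only a sketch: the function names `db_to_gain` and `dither_center_hz` are made up for illustration, and the 44.1kHz base rate is assumed from the 16fs = 705.6kHz figure.

```python
def db_to_gain(level_db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def dither_center_hz(converter_rate_hz):
    """Center of the dither band: half the converter rate, as proposed above."""
    return converter_rate_hz / 2.0


BASE_RATE_HZ = 44100               # assumed base sample rate (fs)
converter_rate = 16 * BASE_RATE_HZ # 16fs = 705600 Hz

ladder_gain = db_to_gain(-6.0)       # PCM attenuation for ladder/R2R converters
delta_sigma_gain = db_to_gain(-12.0) # PCM attenuation for delta-sigma converters
center = dither_center_hz(converter_rate)

print(f"ladder PCM gain:      {ladder_gain:.3f}")
print(f"delta-sigma PCM gain: {delta_sigma_gain:.3f}")
print(f"dither center:        {center / 1000.0:.1f} kHz")
```

The gains come out at roughly one half and one quarter of full scale, which is why the text treats them as the loss of one and two bits respectively.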

What does this technique do for a PCM-based DAC?

1. If the converter is a ladder/R2R converter, all of the bits are toggled continuously, linearizing the entire resistor array. The problems with the 1000 0000 to 0111 1111 transition are minimized, since the effective dynamic range of all signals is no more than 0dB to -6dB. Yes, one bit of dynamic range has been discarded, but this isn't a problem for 20 and 24-bit converters, and all the other bits have been linearized.

2. If the converter is the more common delta-sigma (or sigma-delta) type, the entire dynamic range falls between -6dB and -9dB. The noise-shaping algorithm is now continuously busy, and will not fall into instabilities that occur at signal levels below -20 to -30dB. The incoming digital signal has had its level reduced by 12dB to prevent overshoots and clipping in the noise-shaping section of the converter. As before, with a 24-bit converter, there is dynamic range to spare, and losing two bits is not significant to the overall performance.

3. The analog section now sees ultrasonic spectra that look like DSD. This is called a spread spectrum, and is used in modern Class D amplifiers to improve performance. By randomizing the far-ultrasonic content, any slewing in the following opamps or slow transistor amplifier/filter/buffer sections is now randomized. Classical PCM, on the other hand, has what's called a comb spectrum in the far ultrasonic range, and this creates a large number of IM terms in the audio band. The DSD-like spectrum, by contrast, creates a uniform noise floor that is constant with incoming signal level.
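
The mixing step behind these three points can be sketched in a few lines. This is a rough illustration under several assumptions of mine, not a reference design: the dither is built as a sum of random-phase tones (one simple way to get a band with steep edges), the band is placed in the top octave below half the converter rate because a 705.6kHz sampled system cannot represent anything higher, the "-6dB level" is read as a peak level of one half, and the 6dB and 12dB reductions are treated as exact shifts of one and two bits. All function and parameter names are hypothetical.

```python
import math
import random


def band_limited_dither(n_samples, sample_rate_hz, lo_hz, hi_hz, peak, seed, n_tones=64):
    """Sum of random-phase tones confined to [lo_hz, hi_hz], scaled to `peak`."""
    rng = random.Random(seed)
    tones = [(rng.uniform(lo_hz, hi_hz), rng.uniform(0.0, 2.0 * math.pi))
             for _ in range(n_tones)]
    raw = [sum(math.sin(2.0 * math.pi * f * n / sample_rate_hz + p) for f, p in tones)
           for n in range(n_samples)]
    scale = peak / max(abs(x) for x in raw)
    return [x * scale for x in raw]


def apply_bud(pcm, dither, shift_bits):
    """Attenuate PCM by `shift_bits` bits (about 6dB each) and add the dither."""
    gain = 0.5 ** shift_bits
    return [gain * s + d for s, d in zip(pcm, dither)]


RATE = 16 * 44100   # assumed converter rate, 705.6kHz
N = 2048

# Full-scale 1kHz test tone standing in for the incoming PCM.
pcm = [math.sin(2.0 * math.pi * 1000.0 * n / RATE) for n in range(N)]

# Different seeds give independent left and right dither.
left_dither = band_limited_dither(N, RATE, RATE / 4, RATE / 2, peak=0.5, seed=1)
right_dither = band_limited_dither(N, RATE, RATE / 4, RATE / 2, peak=0.5, seed=2)

left = apply_bud(pcm, left_dither, shift_bits=1)     # ladder/R2R case
right = apply_bud(pcm, right_dither, shift_bits=1)
left_ds = apply_bud(pcm, left_dither, shift_bits=2)  # delta-sigma case

print("ladder peak:     ", max(abs(x) for x in left))
print("delta-sigma peak:", max(abs(x) for x in left_ds))
```

With a one-bit shift the worst-case sum of signal and dither just reaches full scale, and with a two-bit shift it stays at or below three quarters of full scale, which is the headroom argument made for delta-sigma converters.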

DAC designers are now probably throwing up their hands in horror. Nooo! Lynn, you've combined the worst features of DSD and PCM! All the ultrasonic noise of DSD, all of the transient and overshoot defects of PCM left intact, and you've thrown away 1 to 2 bits of the converter's dynamic range, too! Why?

Well, I'm pretty sure (that's not 100% sure, but it's close) that most of the analog-like smoothness of DSD is the effect of the ultrasonic noise spectra on the first and second analog stages. The vast majority of DACs, even at the $10,000 price point, use analog electronics that are too slow by a factor of 100. I'm not making this up. The content coming from the switch array in the converter goes out to 20 MHz or higher, and this is far higher than popular opamps like the 5532/5534 and 797 can handle. They have slew rates in the 13V/uSec to 20V/uSec range, and what's required to avoid slewing is 600V/uSec to 2000V/uSec. This isn't really the fault of the opamp; it's actually very difficult to design circuitry that is linear in the MHz range.
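
To get a feel for the scale of the problem, the textbook slew-rate requirement for a sine wave is SR = 2π · f · V_peak. The sketch below applies it with an assumed amplitude of 2.83V peak (2Vrms); that amplitude is my assumption, and real converter output is a stepped waveform rather than a sine, so treat the results as a rough indication only. It does not attempt to derive the 600 to 2000V/uSec figures quoted above. The function names are hypothetical.

```python
import math


def required_slew_rate_v_per_us(freq_hz, peak_volts):
    """Slew rate (V/uSec) needed to pass a sine of this frequency and peak level."""
    return 2.0 * math.pi * freq_hz * peak_volts / 1e6


def max_unslewed_freq_hz(slew_v_per_us, peak_volts):
    """Highest sine frequency an amplifier with this slew rate can follow."""
    return slew_v_per_us * 1e6 / (2.0 * math.pi * peak_volts)


PEAK_VOLTS = 2.83  # assumed: 2Vrms sine

needed_at_20mhz = required_slew_rate_v_per_us(20e6, PEAK_VOLTS)
limit_for_13v_opamp = max_unslewed_freq_hz(13.0, PEAK_VOLTS)

print(f"slew rate needed at 20 MHz: {needed_at_20mhz:.0f} V/uSec")
print(f"13 V/uSec opamp keeps up to: {limit_for_13v_opamp / 1000.0:.0f} kHz")
```

Under these assumptions a 13V/uSec opamp runs out of slew rate well below 1MHz, while content at 20MHz calls for a slew rate far above the 13 to 20V/uSec of the opamps named above.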

The different ultrasonic spectra of PCM and DSD have very different effects on slew-prone analog electronics. As mentioned above, the ultrasonic DSD spectrum is deliberately created noise (dither is the technical term) with a carefully shaped spectrum, peaking at a very high frequency, and steeply attenuated below that. The DSD transmission system requires substantial amounts of ultrasonic dither at both the encoding and decoding end, along with high-order noise shaping to linearize the converter (and move the noise out of the audio band into the ultrasonic). This dither is statistically independent for Left and Right channels (if it isn't, the listener will perceive a noticeable "monoing" effect as the music is faded down).

The classical PCM spectrum has a very different effect on slew-prone analog electronics. The extremely fast rise-times (in the nanoseconds) slew the analog electronics at the rising and falling edge of every sample, and the rise time of the switch array in the converter is pretty much the same whether the converter is operating at 44.1kHz or a much higher frequency like 705.6kHz. The slew events are so short that they don't appear on FFT-based distortion measurements, since the FFT measurement is averaged over a second or longer; obviously, a few nanoseconds occupy only a very tiny portion of a second.

But the slew events have consequences; the ideal Nyquist reconstruction by low-passing the signal doesn't happen, since the sample duration has been affected by the slewing, which is not the same as linear low-pass filtering. The departure from ideal Nyquist reconstruction becomes greater with increased high-frequency content in the incoming signal, since the transitions occur more frequently and have an overall greater height. (To illustrate, the step transitions for a 100 Hz signal are much smaller on a per-sample basis than those for a 10kHz signal.)
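
The per-sample step sizes in that illustration are easy to check numerically. The sketch below assumes full-scale sine waves sampled at 44.1kHz and looks for the largest jump between consecutive samples over one second; the function name is hypothetical.

```python
import math


def max_step_per_sample(freq_hz, sample_rate_hz, amplitude=1.0):
    """Largest jump between consecutive samples of a sine, over one second."""
    n_samples = int(sample_rate_hz)
    samples = [amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
               for n in range(n_samples + 1)]
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


step_100hz = max_step_per_sample(100.0, 44100.0)
step_10khz = max_step_per_sample(10000.0, 44100.0)
ratio = step_10khz / step_100hz

print(f"largest step, 100 Hz: {step_100hz:.4f} of full scale")
print(f"largest step, 10 kHz: {step_10khz:.4f} of full scale")
print(f"ratio: {ratio:.0f}x")
```

The 10kHz steps come out close to two orders of magnitude larger than the 100Hz steps, so any slewing at the sample edges has far more to work on when the signal is rich in high frequencies.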

The effect of the PCM slewing on the slow analog stage is to create a type of dynamic noise floor, filled with narrow spectral lines that follow the audio signal in a complex way. Since the narrow spectral lines are not harmonically related to the audio signal, the spectral lines are perceived as a gritty type of noise modulation that follows the audio signal. DSD, by contrast, randomizes the slewing, spreading it uniformly over the entire audio band, and the slewing is constant, instead of following the signal level as it does with PCM.

There are ways to design the analog section so it avoids slewing, but this is neither cheap nor easy. Nearly all consumer products, and the majority of high-end DACs and CD/SACD players, use audio-grade opamps in the signal path. A few expensive products use all-discrete transistor electronics, but slew rates in excess of 600V/uSec are very rare; the manufacturer usually brags about it when they achieve it, since combining low distortion and high slew rate is very, very difficult. With most of the players you can buy, yes, the analog section is slewing, and is not linear at all in the 1 to 10MHz region. The ultrasonics are folded down into the audio range, and intermodulate with the audio signal.

The BUD technique has special merit for delta-sigma converters, since many have trouble with stability of the internal noise-shaping algorithm. The internal noise-shaper is actually high-order digital feedback wrapped around a 5 to 6-bit converter, and can become unstable when the incoming digital signal falls below the physical dynamic range of the 5-to-6-bit converter. At signal levels that "fall between the bits" the converter uses a type of pulse-width modulation, and synthesizes the missing intermediate values by generating pulse-width patterns that average out to the desired analog value.
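
The idea of synthesizing a value that "falls between the bits" can be shown with a toy model. The sketch below is a first-order error-feedback modulator around a 5-bit (32-level) quantizer. It is a deliberately simplified stand-in, not the high-order noise shaper found in real converter chips, and every name in it is hypothetical.

```python
def first_order_modulator(target, n_samples, levels=32):
    """Toggle a coarse quantizer between levels so the output averages to `target`.

    `target` is a DC value in [-1.0, 1.0]; `levels` is the number of
    physical output levels (32 for a 5-bit converter).
    """
    step = 2.0 / (levels - 1)   # spacing between adjacent output levels
    error = 0.0                 # quantization error carried to the next sample
    out = []
    for _ in range(n_samples):
        wanted = target + error
        code = round((wanted + 1.0) / step)
        code = max(0, min(levels - 1, code))
        quantized = code * step - 1.0
        error = wanted - quantized
        out.append(quantized)
    return out


TARGET = 0.01   # a DC level that sits between two physical levels
pattern = first_order_modulator(TARGET, n_samples=1000)
average = sum(pattern) / len(pattern)

print("levels used:", sorted(set(pattern)))
print("average output:", average)
```

The output only ever sits on the two physical levels either side of the target, yet the pattern of switching between them averages out very close to the requested value. How the switching pattern changes as the target moves is where the noise-floor behavior described next comes from.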

This type of behavior is revealed by slowly modulating a DC level from full positive to full negative, and measuring the noise level coming out of the converter. Some of the most popular delta-sigma converters have sudden jumps in the noise floor of 20dB, as the converter shifts from one type of PWM pattern to another.

For more info about measuring noise-shaping instabilities, see page 20 of:

http://resonessencelabs.com/wp-content/uploads/2012/05/InvictaMeasNotes.pdf

If you have the time, watch the ESS video presentation at the 2011 RMAF show:

http://www.youtube.com/watch?v=1CkyrDIGzOE

Given these subtle problems, why are delta-sigma converters so popular? There are three reasons:

1. Cheaper. The cheapest are $1 to $3 each, and the most expensive are $40 each. That's for an 8-channel converter, by the way. By comparison, the last remaining ladder/R2R converter is now $75 each, for one mono channel.

2. Measures better than ladder converters. This gets most engineers right there. Specs dominate in the DAC world, and carry a lot of weight with magazine reviewers, too.

3. Easier to design with. The cheaper converters have built-in opamps (not very good ones) that deliver a buffered, ready-to-use 2Vrms output. By comparison, the care and feeding of a current-output ladder converter is a fine art, and not just a matter of simply following the application note from the chip vendor.

These three hit the trifecta for most DAC designers. Note that sound quality is orthogonal to the trifecta; maybe ladder converters sound better, maybe they don't, but price, specs, and ease-of-use trump sound most of the time. So the majority of the DACs out there, including the majority in the high-end business, use delta-sigma converters and opamps for active I/V conversion, active low-pass filtering, and current amplification for the final output stage. As mentioned above, audio-grade opamps are not linear in the 1 to 10MHz range, so they are not well suited for handling the ultrasonic noise coming from the converter.

Thorsten Loesch has an excellent discussion of these points at:

http://www.scribd.com/doc/105561243/Thermionic-Valve-Analogue-Stages-for-Digital-Audio-A-Short-Overview-of-the-Subject-by-Thorsten-Loesch

Returning to the BUD proposal, why add more noise, when the analog stage is already having trouble keeping up? Well, the incoming digital signal is reduced by 6dB for a ladder converter (which does not use noise-shaping), and 12dB for a delta-sigma converter (which does use noise-shaping). There's no risk of clipping the digital or analog signal, and now the noise is constant, with the same spectra as DSD.

The ladder converter is linearized, since all the bits are continually stimulated at a frequency range far above the audio range, and the delta-sigma converter avoids instabilities with the noise-shaping algorithm, since the wideband dynamic range has been reduced to 3dB. The following analog section thinks it's getting a DSD signal, since the spectrum is filled with a steady level of ultrasonic noise while the audio band is quiet.

Yes, BUD is funky. I don't think it's the best approach for a genuinely high-level DAC, which can handle ultrasonics with no trouble (thanks to a combination of passive pre-filtering combined with a very fast, non-feedback analog section) and converters that are free of noise-shaping instabilities (ladder converters and possibly the ESS Sabre 9018). For DACs at this level, user-selectable dither levels are probably a good idea, allowing a selection from the LSB to higher bits; different converters will probably have different preferred levels.

For the majority of DACs out there, with delta-sigma converters and slow opamps in the signal chain, BUD is most likely an improvement, and probably really will make PCM sound pretty close to DSD. This is a back-handed compliment, since DSD on most delta-sigma DACs is not all that great, and adding ultrasonic noise to a marginally-engineered DAC is not going to make it wonderfully better—but maybe more listenable, which is what it's all about.

P.S. Yes, this idea might be patentable. I'm not going to do that, but place it in the public domain instead. Maybe somebody in the audio world is already doing it and isn't talking. Well, the cat's out of the bag now, and the trade secrecy just evaporated, since I haven't signed any NDAs with any manufacturer in a long, long time.