Resolving for Resolution
As a little boy I thought it would be fun to earn some candy money as a target flag waver at the local gun shooting range. The targets were mounted on suspended frames that could be pulled down by the flag waver into a protected bunker right below them. After inspecting the location of the bullet hole after each shot, I had to raise the appropriate flag indicating to the shooter where the bullet hit the target and then raise the target again for the next round. The targets were 300m from the gun stands, and I was standing in a bunker with just a small opening straight up towards the targets; therefore the noise from the guns was not very loud where I was standing. What I didn't realize at the time when I applied for the job was the impact of the sonic boom from the bullet flying just inches above my head through the target. It wasn't that this boom hurt my ears with sheer decibels; it actually was not that loud. There was something else, something terrifying about it, something so stressful that some of my peers at the range developed a headache after a while.
It would take me years to understand why.
Later on, during a vacation in an idyllic village on the Mediterranean, I had my bedroom window open all night and the beach was right below it. In the evening the sea breeze would always die off completely, and so would the waves. It was dead quiet. Around 4:00 a.m., however, the breeze would pick up again, and so would the waves. But the slow rolling sound of the waves would not wake me up; it was the church bell at 7:00 a.m. that did that. The church was a few corners away and not very close, so the sound of the bells was at about the same level as the sound the waves produced right in front of my window.
These are just examples illustrating how sounds that slowly rise and decay in volume don't seem to get our attention as much as sounds that have a much more sudden character. Church bells have a sharp attack and a long decay, and the sonic boom of the flying bullet has such a sharp attack and decay that it can terrify us, or even cause headaches. Mother Nature taught us through evolution that sounds with short attacks mean more danger than sounds with slower attacks. Short attacks are created by objects traveling fast, and such objects can hurt us when they hit us. Since one of the functions of our ears and the associated processing in our brain is to protect us, they have been designed to wake us automatically when they hear sounds that mean danger, but not when they hear other types of sounds. The flying bullet creates such a sudden bang that our entire nervous system is activated to make sure we move out of harm's way as quickly as possible. Our ears never rest, and always work, even when we sleep.
Just try to program the alarm in your smartphone with a slowly rising sound that then slowly decays. Most likely you will not wake up. But as soon as you change the sound to a series of short bursts with short attacks, without changing the volume, you will wake up within seconds.
How does our hearing perceive such transient signals with rapid volume changes? It turns out that our ears are very sensitive to transients with short attacks from low volume to louder volumes, much more so than to changes from high volumes to lower volumes. Such transients create lots of harmonics way beyond our "standard" audio range of 0-20kHz. The bullet described above travels at Mach 2 and creates such a short rise time that our hearing system literally freaks out and pulls all triggers in its attempt to make us survive the imminent danger.
Several studies at the AIST institute in Osaka, Japan, have shown that bone-conducted hearing is ultrasonic (meaning above 20kHz) and provides inputs to the cochlea. These studies have shown that such ultrasonic hearing is not linear and is quite complex. Scientists speculate that the saccule, the sensor for gravity and acceleration, is involved. In other studies published in the Journal of Neurophysiology in 2000, brain activity was measured while listeners were subjected to audio limited to the classic range of 20kHz, audio containing ultrasonic components, audio containing very low frequency components and audio containing both low frequency and ultrasonic components. It was interesting that the combination of ultrasonic frequencies and very low frequencies caused the most brain activity. The ultrasonic frequencies extended to over 100kHz.
Without going into much scientific detail of the various studies that have been made, the point is that we DO hear beyond 20kHz, and also down to a few hertz. It may not involve the same process, the same resolution or the same sensors as our "regular" hearing, but we still perceive sounds in those "inaudible" spectra to some degree.
If we can believe these scientists and my own experiences at the shooting range, on the beach and in many other places, then chopping off the frequency range at 22kHz for the CD is probably not the greatest idea for the best sonic performance. If we wanted to match the frequency response of our "ideal" audio system to our hearing system, we would want it to be relatively flat up to at least 20kHz and then decay gradually and gracefully at higher frequencies. But wait, isn't that what we already had way back when vinyl and analog tape were kings? Both were designed for optimal performance in the 20kHz band, but still had some performance or resolution beyond that, with a gradual decay up to at least 50kHz. No wonder many of us still prefer analog over digital.
Most of us associate digital with PCM, because that is what CD is. In the quest for better performance the sample rate has been pushed up from 44.1kHz to twice, four times or even eight times that. But the fundamental issue is still the same: there is a flat band with a sharp cliff at the end. The cliff is generated with a brick wall filter that produces unnatural side effects in the form of pre-ringing and high frequency distortion. Many equipment manufacturers are also intrigued by measurement data and design for the highest possible signal-to-noise ratio in the audio band, which, of course, makes the whole cliff even higher and steeper.
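As a rough illustration of the pre-ringing mentioned above, the following Python sketch builds a truncated ideal low-pass filter (a plain sinc with no window) and inspects its impulse response. The 20kHz cutoff, the 101 taps and the name `sinc_lowpass` are assumptions chosen for illustration only; real converters use more refined filters, so this shows the general effect and not the behavior of any particular product.

```python
import math

def sinc_lowpass(cutoff_hz, sample_rate_hz, num_taps):
    """Truncated ideal (brick wall) low-pass FIR taps, symmetric around the center."""
    center = (num_taps - 1) / 2
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        x = n - center
        if x == 0:
            taps.append(2 * fc)  # limit of sin(2*pi*fc*x) / (pi*x) at x = 0
        else:
            taps.append(math.sin(2 * math.pi * fc * x) / (math.pi * x))
    return taps

# Hypothetical CD-like setup: 20 kHz cutoff at a 44.1 kHz sample rate.
taps = sinc_lowpass(cutoff_hz=20_000, sample_rate_hz=44_100, num_taps=101)

# The impulse response of an FIR filter is simply its list of taps.
peak_index = max(range(len(taps)), key=lambda i: abs(taps[i]))
pre_ringing = taps[:peak_index]  # everything the filter outputs BEFORE the main peak

pre_ring_energy = sum(t * t for t in pre_ringing)
total_energy = sum(t * t for t in taps)
pre_ring_fraction = pre_ring_energy / total_energy

print("main peak at tap", peak_index)
print("last taps before the peak:", [round(t, 4) for t in pre_ringing[-4:]])
print("fraction of energy arriving before the peak:", round(pre_ring_fraction, 4))
```

Because this kind of filter is symmetric around its main peak, the ringing that follows a transient is mirrored by ringing that precedes it. A natural sound source has no such "echo before the event", which is one way to picture why the side effect is described as unnatural.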
Too bad the engineers of Sony and Philips didn't talk to NASA when they designed the CD more than 30 years ago. At the time the NASA engineers were very familiar with the Gibbs phenomenon that occurs at the edge of their dish antennas designed to receive faint radio signals from outer space. This effect can be very detrimental and occurs, in general terms, wherever there is a sharp edge or a discontinuity in the transfer function of the system you are designing. The classic analog audio system doesn't have such a sharp edge, but every PCM system does. A sample rate of 192kHz, for instance, still produces a cliff or a sharp edge at around 96kHz. According to the above explanation, it is possible that this is still within the range where we can perceive audio to some degree.
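The Gibbs phenomenon can be sketched numerically with its textbook example, a square wave rebuilt from only a limited number of harmonics. This is a generic illustration and not a model of a CD player or a dish antenna; the function names and the harmonic counts (5, 25, 100) are assumptions picked for the example.

```python
import math

def square_partial_sum(t, num_harmonics):
    """Fourier series of a +/-1 square wave, cut off after num_harmonics odd harmonics."""
    return (4 / math.pi) * sum(
        math.sin((2 * k + 1) * t) / (2 * k + 1) for k in range(num_harmonics)
    )

def peak_value(num_harmonics, grid_points=4000):
    """Largest value of the truncated series on (0, pi), where the ideal wave equals +1."""
    return max(
        square_partial_sum(math.pi * i / grid_points, num_harmonics)
        for i in range(1, grid_points)
    )

# Abruptly cutting the series off leaves an overshoot next to the edge.
peaks = {n: peak_value(n) for n in (5, 25, 100)}
for n, peak in peaks.items():
    print(f"{n:3d} harmonics: peak = {peak:.4f} (ideal square wave: 1.0000)")
```

Adding more harmonics squeezes the overshoot closer to the edge, but its height does not shrink toward zero: it settles at roughly 9% of the size of the jump. In the same spirit, raising the sample rate moves the cliff up in frequency without removing it.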
It sounds like we should revisit the whole concept of digital audio and mandate that the new digital system emulate the frequency range of our classic analog gear as much as possible, and not do anything abrupt in the form of a cliff. Luckily, by the time the CD was about 15 years old, there was a new set of engineers at Sony and Philips who did just that. They took the already old concept of delta-sigma modulation, applied it to audio and called it DSD. At the time various factors forced them to use a sample rate of 2.8MHz, but technology has since advanced and packaged media is no longer the driving force in our industry, and therefore the limitation on the sample rate is disappearing. Recording and playback systems are now available that support 5.6MHz as a sample rate, and attempts are already being made at yet higher rates. As explained in an earlier article, the frequency range of DSD does not have any cliffs, has a flat area for the classic audio range, and then decays slowly at higher frequencies. Because of its very high sample rate, the usable frequency band goes up to 1.4MHz for the slowest sample rate of 2.8MHz.
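For readers who want to see the basic idea of delta-sigma modulation, here is a minimal first-order, 1-bit sketch in Python. It is only an illustration of the principle: practical DSD encoders use more elaborate, higher-order modulators, and the function names, the 1kHz test tone, the 0.5 amplitude and the 64-sample averaging window are all assumptions made for this example. The constant `DSD64_RATE_HZ` assumes that the "2.8MHz" rate is exactly 64 times the CD rate of 44.1kHz.

```python
import math

DSD64_RATE_HZ = 64 * 44_100         # 2,822,400 Hz, i.e. the "2.8MHz" rate (assumed exact value)
NYQUIST_HZ = DSD64_RATE_HZ / 2      # half the sample rate, about 1.4 MHz

def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: turns samples in [-1, 1] into +1/-1 bits."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback   # accumulate the error between input and previous output bit
        bit = 1.0 if integrator >= 0.0 else -1.0
        bits.append(bit)
        feedback = bit
    return bits

def moving_average(values, window):
    """Very crude low-pass filter: mean of each run of `window` consecutive values."""
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        out.append(running / window)
    return out

# One millisecond of a 1 kHz tone at the assumed DSD64 rate.
num_samples = DSD64_RATE_HZ // 1000
signal = [0.5 * math.sin(2 * math.pi * 1000 * n / DSD64_RATE_HZ) for n in range(num_samples)]

bits = delta_sigma_1bit(signal)
decoded = moving_average(bits, 64)       # gentle averaging of the bitstream
reference = moving_average(signal, 64)   # the same averaging applied to the input
max_error = max(abs(d - r) for d, r in zip(decoded, reference))

print("Nyquist frequency:", NYQUIST_HZ, "Hz")
print("largest deviation after averaging:", round(max_error, 4))
```

The bitstream contains nothing but +1 and -1, yet its local average follows the input signal, so a gentle low-pass filter is enough to recover the audio. No steep filter is needed at any point in this sketch.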
In my own observations with Alzheimer's patients subjected to CD Redbook music and then to DSD recordings, there are always patients who are bothered by Redbook audio and ask to turn the volume down. But there are never any such complaints with DSD recordings. On the contrary, seemingly lifeless faces light up, bodies start to move in rhythm, people who haven't spoken a word in months start to sing and many attempt to dance. It seems that Alzheimer's patients can be very sensitive audiophiles, more so than they were before they had the disease. How is this possible? For one, the area in the brain that processes music is the last to be affected by the disease, but maybe it is also that the conscious mind tends to distract the healthy listener from the pure listening experience. Alzheimer's patients may possibly have an advantage there. They just listen with very little other distracting brain activity and, therefore, may become more sensitive to distortion and the brick wall effects that PCM audio can cause. DSD may even have an application with people who have otherwise lost all means of communication with the world they are living in.
Over the years many listening and comparison tests were made with DSD vs. PCM, and DSD vs. analog. While it is very difficult to set up such tests in a true A/B comparison, the majority of these tests showed that DSD can be closer to analog than PCM. Maybe my non-scientific essay makes this somewhat plausible now.
Before you head out and try this at home, I suggest you leave the shooting range out and head straight to the idyllic place on the Mediterranean.
About the author: Andreas Koch was involved in the creation of SACD from the beginning while working at Sony. He led a team of engineers designing the world's first multi-channel DSD recorder and editor for professional recording (the Sonoma workstation), the world's first multi-channel DSD converters (ADC and DAC), and participated in various standardization committees world-wide for SACD. Later he went on as a consultant to design a number of proprietary DSD processing algorithms for converting PCM to DSD and DSD to PCM, and other technologies for D/A conversion and clock jitter control in DACs. In 2008 he co-founded Playback Designs to bring to market his exceptional experience and know-how in DSD in the form of D/A converters and CD/SACD players. Earlier, he was part of an engineering team at Studer in Switzerland designing one of the world's first digital tape recorders, then led a team of engineers working on a multi-channel hard disk recorder. He did a 3-year stint at Dolby as the company's first digital design engineer. All of this gave him a well-rounded foundation of audio know-how and experience. He can be reached at firstname.lastname@example.org.