Digital Recording White Paper
Most audiophiles do not understand exactly how digital recording works, so allow me to shed some light on the subject. Since I do not know your level of technical knowledge, I will start with the basics. I apologize if some of this is review, so please bear with me.
I will be discussing mainly linear PCM (Pulse Code Modulation) recording because this is the format used by CDs. There are other digital systems (like SACD and DSD), but they are not as good as linear PCM and are not generally commercially available, so I will only briefly mention them.
The linear PCM standard used by CDs is specified in detail in the "Red Book" developed by the Sony and Philips engineers when they invented the CD back in the 1980s. I will be referring to this standard often throughout this discussion.
The Red Book engineers wanted to produce a digital recording system that would reproduce music perfectly -- but use the minimum amount of data so that they could maximize the recording time on a CD. They wisely did not compromise performance -- but nor did they use more data than what was necessary.
Analog recordings of the day were compromised in many ways. Specifically, the frequency bandwidth of LPs and FM multiplex broadcasts was limited to 30 Hz to 15 KHz. The S/N (Signal to Noise ratio) was limited to around 40 dB. Even the best studio open-reel tape decks could barely achieve a bandwidth of 20 Hz - 20 KHz with a S/N of 68 dB. By the late 1970s, better tape oxide formulations got the S/N up to 72 dB. No analog system could capture the full dynamic range of a symphony orchestra. None had a silent background.
Analog tape decks had loads of flutter caused by imperfect capstan bearings, capstan shafts that weren't round, and tape scrape flutter (where the tape moves in tiny jerks across the tape head). LPs were plagued with "wow" due to eccentricities in the disks caused by the center hole not being perfectly concentric with the record grooves. Both wow and flutter are inaccuracies and variations in the frequency.
Tape decks also suffered from amplitude flutter caused by variations in the tape coating thickness. If you recorded a steady state tone from an audio generator, you would see +/- 2 dB fluctuations in the output from a good tape deck. This flutter is easily heard on music that has sustained tones. A good example is slow, sustained piano music.
The frequency response of these analog recording systems was not only limited to just a portion of the human hearing range, but the linearity of the frequency response was quite poor. It is typical to see frequency response variations greater than plus/minus 3 dB.
As if all this weren't bad enough, the THD (Total Harmonic Distortion) and IMD (InterModulation Distortion) of analog recording systems often exceeded several percent. As a result, analog recordings always sounded obviously different than the live microphone feed. Analog tape recorders simply could not capture, store and reproduce music with accuracy.
LPs had substantially worse performance than open reel tape. So by the time you made a recording on analog tape and then transferred it to a vinyl LP, the accumulated errors were severe. As a result, the LP playback on an audio system was quite different from the original, live performance. A vastly better way to record music was needed. The reason that digital recording was developed was to resolve the serious problems and limitations caused by analog recording.
The Sony and Philips engineers who invented digital recording decided to produce a digital system that would solve all these problems. This meant that the S/N of the new system must be greater than 86 dB. This is the minimum needed to produce the full dynamic range of a symphony orchestra, which is about 72 dB. The noise floor needs to be at least 10 dB below that to produce a silent background. So they settled on a S/N at a very conservative 96 dB, which was 10 dB better than the minimum required.
The engineers wanted to capture and reproduce the full frequency range of human hearing, so their CD was able to record 20 Hz to 20 KHz. Actually the nature of a CD is such that it will record right down to DC, which is zero Hz. But the highs are limited to the extremes of human hearing at 20 KHz.
Most adults cannot hear 20 KHz. But the Red Book engineers made no compromises, so they pushed the frequency response all the way to 20 KHz. By comparison, analog recordings were limited to 15 KHz.
They insisted on having extremely linear frequency response. Plus/minus 3 dB produces very obvious flaws in the reproduced sound, which simply was not acceptable. So the Red Book frequency response was specified to be better than 0.1% across the entire bandwidth.
The Red Book engineers would not accept any short-term frequency and amplitude variations. They found that the usual wow and flutter errors in analog systems were in excess of 2%. These flaws ruined the realism of the sound and could not be accepted. To solve this problem, the engineers used a quartz clock instead of mechanical devices to hold the frequency and amplitude variations to incredibly low levels -- less than 0.001%!
Finally, the THD and IM distortion of the system had to be reduced to a tiny fraction of a percent. A typical CD player will have digital distortion well under a thousandth of a percent. The usual limit on distortion will be the inherent distortion in the analog input and output buffer amplifiers, which will be far more than that produced by the digital system.
Now let's look at how a digital system works to solve all the above problems. You have surely noted the two key specifications on a PCM system. They are the sampling rate and the bit depth. Just what do these do and how do they work?
Most audiophiles completely misunderstand how they work. For example, they think that the sampling rate defines the "resolution" of the system. They imagine that the sampling rate defines how many times the wave form is sampled in one second and that the wave form is then reconstructed during playback as discrete points. They then further imagine that these points are connected by straight lines that form "stair steps" in the digital wave form.
Now based on this view of digital recording, it is completely understandable that these audiophiles would conclude that a digital wave form is missing information compared to the original analog wave form -- and that the digital wave form is not smooth. They would further assume that a higher sampling rate would provide more detail to the wave form, thereby increasing the "resolution" and accuracy of that wave form.
Their view is total fantasy. Digital systems simply do not work that way.
In particular, the whole purpose of a DAC (Digital to Analog Converter) is to produce a perfectly smooth, complete, and accurate wave form. There are absolutely no "stair steps" in a digital wave form. In fact, a digital recording system produces a far more accurate wave form of the original signal than any analog system can.
If you doubt this, let me point out that if there were "stair steps" in the wave form, the distortion would measure extremely high (higher than 50% THD). But the distortion in digital signals is vastly lower than any analog system, measuring only a couple thousandths of 1% at worst, while analog recording systems measure several percent.
In short, a digital system produces an essentially perfect and complete wave form. There are no steps in it. It is more accurate than the wave form produced by an analog recording system.
So if the sampling rate doesn't determine the "resolution", just what exactly does it do? Before explaining, let me point out that there is no such specification as "resolution" in audio engineering. This is another audiophile myth. The sampling rate does not define resolution; it defines the highest audio frequency that the system can capture, store, and reproduce.
In a linear PCM system, the sampling rate must be at least twice the highest frequency of interest. Put the other way around, the highest frequency the system can handle is half the sampling rate, which is known as the Nyquist frequency. Since the Red Book engineers wanted to reproduce 20 KHz music, they had to sample the music at a minimum of twice that -- 40 KHz.
You may note that Red Book CD does not sample at 40 KHz. It samples at 44.1 KHz. Why?
A bit of extra bandwidth is required because the digital system must not be fed any frequencies higher than its Nyquist frequency as these will confuse the system and produce a lot of distortion. So all frequencies above 20 KHz must be eliminated by a filter. This is called an "anti aliasing filter."
The anti-aliasing filter will require some additional bandwidth in which to operate. By using a digital filter, the Red Book engineers were able to roll off the high frequencies at 96 dB/octave, thereby needing only 4.1 KHz of additional sampling to accommodate it.
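The arithmetic behind that guard band can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above; the function names are made up for this example and do not come from the Red Book.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency a PCM system can represent: half the sampling rate."""
    return sample_rate_hz / 2


def guard_band_hz(sample_rate_hz, audio_top_hz):
    """Sampling rate left over after reserving twice the top audio frequency."""
    return sample_rate_hz - 2 * audio_top_hz


cd_rate = 44_100     # Red Book sampling rate, Hz
audio_top = 20_000   # highest audio frequency of interest, Hz

print(nyquist_frequency(cd_rate))         # 22050.0
print(guard_band_hz(cd_rate, audio_top))  # 4100
```

So the filter has the region between 20 KHz and the 22.05 KHz Nyquist frequency in which to do its work.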
Note that this means that in the worst case (20 KHz), the digital system will sample each cycle of the wave form only about twice. So if the audiophile belief that digital systems produce "stair steps" in the wave form were true, then a 20 KHz sine wave would actually be reproduced as a square wave.
But it is easy to see that this is not true. Feed a 20 KHz sine wave into a digital system like a digital signal processor or digital crossover and observe its output on an oscilloscope. The digital components just described will feed the analog signal through an A/D converter to digitize it, then back out through a DAC to convert it back to analog. So the signal will have gone through a pair of digital converters. If audiophiles were correct, you would see a square wave at the output of the DAC. But instead you will see that the output is a perfect sine wave of vanishingly low distortion -- it will not be a square wave.
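If you do not have an oscilloscope handy, the same point can be checked numerically. The sketch below samples a 20 KHz sine wave at 44.1 KHz and then rebuilds the values between the samples using sinc interpolation, which is the textbook reconstruction from sampling theory. It is not a model of any particular DAC, and the finite window of samples (the constant K) is an assumption that limits the accuracy, so expect a small residual error rather than zero.

```python
import math

FS = 44_100.0      # Red Book sampling rate, Hz
F_TONE = 20_000.0  # test tone, Hz
K = 3000           # samples kept on each side of t = 0 (assumed finite window)


def sinc(x):
    """Normalized sinc function, sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


# The stored samples: barely more than two per cycle of the tone.
samples = {n: math.sin(2 * math.pi * F_TONE * n / FS) for n in range(-K, K + 1)}


def reconstruct(t):
    """Rebuilt value at time t (in sample periods) from the samples alone."""
    return sum(x * sinc(t - n) for n, x in samples.items())


def worst_error(points=40):
    """Largest gap between the rebuilt wave and the true sine, between samples."""
    worst = 0.0
    for i in range(points):
        t = 0.05 + 0.1 * i  # instants that fall between the sample points
        ideal = math.sin(2 * math.pi * F_TONE * t / FS)
        worst = max(worst, abs(reconstruct(t) - ideal))
    return worst


print(worst_error())
```

The rebuilt values follow the original sine wave between the sample points instead of holding flat like a stair step. The small error that remains comes from cutting the sum off at K samples, and it shrinks as K grows.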
In short, linear PCM does not have any "missing pieces" or "stair steps" in the wave. The entire purpose of a DAC is to reconstruct the wave completely accurately and with virtually no distortion. DACs do so magnificently.
So where does upsampling fit into this picture? To begin, let me point out the obvious, which is that one cannot add information or replace "missing pieces" of a wave form after the fact. So upsampling cannot produce accurate musical information where none was originally recorded any more than one can reconstruct a drop-out on analog tape.
So what is the value of upsampling? Not much actually. But remember the extra sampling needed to produce a sliver of bandwidth for the anti-aliasing filter I mentioned earlier? It was only 4.1 KHz wide and required that a digital "brick wall" filter be used at 96 dB/octave. This conserved data space on the CD.
Some audiophiles believe that such a steep filter degrades the sound -- even though the filter operates in the supersonic range, which is well above those frequencies that humans can hear. They believe that a more gradual, analog filter will sound better. By upsampling the data stream, they can add all the bandwidth they want and by doing so, they can use analog filters.
Can the effect of analog anti-aliasing filters be heard? Obviously, anything that does not produce frequencies in the range of human hearing cannot be heard. But that doesn't keep some audiophiles from believing that they can hear the effects of supersonic filters. So some CD player manufacturers use upsampling thinking that will please audiophiles.
Now let's look at the word length. Red Book CDs operate using 16 bits. Why? What effect does the number of bits have on the sound?
Simply put, the word length defines the S/N of the system. Each bit is worth 6 dB of S/N.
As mentioned previously, one needs a S/N of at least 86 dB to produce a silent background. Sixteen bits will produce a S/N of about 96 dB, which is about 10 dB better than required.
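The 6 dB per bit rule is easy to verify. Here is a minimal sketch, assuming the digital S/N is taken as the ratio between full scale and a single quantization step; the function name is made up for this example.

```python
import math


def digital_snr_db(bits):
    """Ratio of full scale to one quantization step, expressed in dB."""
    return 20 * math.log10(2 ** bits)


for bits in (1, 16, 20, 24):
    print(bits, round(digital_snr_db(bits), 1))
```

Each added bit doubles the number of levels, which is where the 6 dB comes from. The formula usually quoted in textbooks for a full-scale sine wave adds about another 1.76 dB, but that does not change the picture.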
Actually, in the real world, for many technical reasons including the need for "dither" and the fact that very few analog electronics can produce a S/N of 96 dB, most CD players only produce a S/N of about 92 dB. But this still produces a silent background and full dynamic range and is far better than any analog recording system.
The Red Book engineers picked the number of bits required to achieve a silent background and record the full dynamic range of all music. They did not use any more bits than necessary, nor did they include extra bits that would waste data space. Simply put, 16 bits is the number of bits required to reproduce music with a perfectly silent background. I'm sure you would agree that all properly-recorded CDs have silent backgrounds. You do not hear hiss and noise like you do with analog recordings.
So why would anybody want to use more than 16 bits? What would this gain you?
Many audiophiles believe that using 24 bits will produce better recordings. But when pressed to explain why this would be, they can't tell you.
The truth is that 24 bits will produce a digital S/N of 144 dB. Note that I said "digital" S/N. In reality, we can't listen to a digital signal. We must convert it to analog to play it through a speaker. There is no analog system that can produce a 144 dB S/N. The quietest analog electronics manage a S/N of around 120 dB.
But the quietest microphones have a S/N of only 92 dB due to Brownian motion. This is the random movement of air molecules at room temperature, which strike the diaphragm of a microphone and cause it to produce a small amount of noise. So it is virtually impossible to record music with a S/N greater than 92 dB.
A digital S/N greater than that is of no practical use when playing back music. After all, a silent background is silent and it is impossible to make silence any quieter. So adding bits on playback is simply a waste of data space.
Although there is no point in using more than 16 bits on playback, there is a good reason to use more than 16 bits when making a live recording. Understand that to get the full dynamic range of a 16 bit system, you must accurately place the dynamic range of the music in the 16 bit "window." If the recording level is too low, you won't use the entire 16 bit range and you will hear background noise on quiet passages of music. If you have the level too high, you will exceed the maximum level defined by the 16 bits and massive distortion will result.
Now when playing back music, the recording engineer will always know the levels and it is a simple matter for him to place his recording correctly in a 16 bit window. But when recording -- especially when recording live concerts -- the maximum sound level is not exactly known. So the recording levels must be conservative as exceeding the maximum digital recording level will result in massive distortion that will ruin the recording.
So for recording, it is best to have some extra headroom. Therefore, recording studios use 20 or 24 bit recording systems.
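To put rough numbers on this, the sketch below uses the 6 dB per bit rule from this paper and a safety margin of 20 dB below full scale. The 20 dB figure is purely a hypothetical value chosen for illustration, not a recommendation.

```python
DB_PER_BIT = 6  # rule of thumb used in this paper


def snr_left_for_music_db(bits, safety_margin_db):
    """S/N remaining when peaks are kept safety_margin_db below full scale."""
    return bits * DB_PER_BIT - safety_margin_db


margin = 20  # hypothetical margin left for unexpected peaks, dB
for bits in (16, 20, 24):
    print(bits, snr_left_for_music_db(bits, margin))
```

With that margin, a 16 bit recording is left with 76 dB, which falls short of the 86 dB needed for a silent background, while a 24 bit recording still has 124 dB to spare.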
The extra headroom provided by more bits also is useful when the recording engineer needs to do mixing and processing where equalization may be desired. Boosting the energy at some frequencies using equalization requires more bits and might exceed the maximum digital limit. Once the recording is made, mixed, and processed, the final product can then be accurately placed in a 16 bit window so that it has a silent background.
So for recording, 20 bit or 24 bit systems make sense. But there simply is no point in using more than 16 bits for playback.
This brings up the topic of "High Resolution" (Hi-Rez) audio. Many audiophiles believe that higher sampling rates and more bits increase the "resolution" of the recording. This is utter nonsense for all the reasons that I have outlined above. The Red Book CD standard makes essentially perfectly accurate recordings and increasing the sampling rate and word length does not make the recording any more perfect.
The latest fad in sampling rates is to use 96 KHz or even 192 KHz sampling. 96 KHz will record sounds up to 40 KHz (80 KHz captures the 40 KHz sound and the remaining 16 KHz are used for the anti-aliasing filter). 192 KHz sampling will record 80 KHz sounds (160 KHz sampling captures 80 KHz sounds while the remaining 32 KHz are used for the anti-aliasing filter).
Now think about that. What good does it do to record 40 KHz sounds? No music microphone records above 20 KHz, so the additional 20 KHz available in a 40 KHz recording system simply captures supersonic noise and wastes 50% of the data space.
A 192 KHz sampling system is even worse. Fully 75% of the bandwidth is used to record supersonic noise, which wastes 75% of the data space.
Of course, these higher sampling rates are also combined with 24 bits. So the wasted data space is much greater than just described, as one needs 50% more data space for the extra bits (24 bits versus 16).
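The data cost is easy to work out. A minimal sketch, assuming two-channel stereo and uncompressed linear PCM (the function name is made up for this example):

```python
def pcm_bits_per_second(sample_rate_hz, bits, channels=2):
    """Raw data rate of uncompressed linear PCM."""
    return sample_rate_hz * bits * channels


cd = pcm_bits_per_second(44_100, 16)
for rate, bits in ((44_100, 16), (96_000, 24), (192_000, 24)):
    bps = pcm_bits_per_second(rate, bits)
    print(rate, bits, bps, round(bps / cd, 2))

print(24 / 16)  # data per sample relative to 16 bits
```

A CD works out to 1,411,200 bits per second. The 96 KHz / 24 bit format needs a little over three times that, and 192 KHz / 24 bit needs about six and a half times as much.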
Amazingly, these so called "Hi-Rez" recordings actually degrade the sound quality. This is because the supersonic noise will produce intermodulation products (beat frequencies) down lower in the audio range.
For example, noise frequencies at 37 KHz and 38 KHz will interact together to form intermodulation frequencies at 1 KHz, which is a frequency humans can hear. So hi-rez recordings will actually produce noise and distortion in the audio bandwidth, which degrades the sound while making no improvement in the sound in other ways.
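The mechanism can be demonstrated numerically. The sketch below assumes a mildly nonlinear stage somewhere in the playback chain, modeled as a small second-order term with a coefficient of 0.1. That coefficient is a hypothetical value chosen so the effect is easy to see, not a measurement of any real equipment.

```python
import cmath
import math

FS = 192_000             # sampling rate, Hz
N = 1_920                # 10 ms of signal, so every multiple of 100 Hz fits exactly
F1, F2 = 37_000, 38_000  # two supersonic tones, Hz


def tone_amplitude(signal, freq_hz):
    """Amplitude of one frequency component, measured with a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / FS)
              for n, x in enumerate(signal))
    return 2 * abs(acc) / len(signal)


clean = [math.sin(2 * math.pi * F1 * n / FS) + math.sin(2 * math.pi * F2 * n / FS)
         for n in range(N)]

# Hypothetical nonlinearity: the output contains a little of the input squared.
distorted = [x + 0.1 * x * x for x in clean]

print(tone_amplitude(clean, 1_000))      # essentially zero
print(tone_amplitude(distorted, 1_000))  # a 1 KHz product has appeared
```

The clean signal contains nothing at 1 KHz. After the nonlinear stage, a component appears at the 1 KHz difference frequency even though both original tones were far above the range of hearing.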
Fortunately the amount of distortion and noise these systems add is small enough that most humans can't hear it. However, sensitive instruments like a distortion analyzer easily reveal the flaws.
There is an interesting article on this subject that I think you will enjoy. Here is a direct link to it:
http://drewdaniels.com/audible.pdf
DSD and SACD digital formats work differently than PCM encoding. They actually do produce steps in the musical wave form because they do not use a DAC to smooth it out. As a result, they are extremely noisy and have high distortion levels.
To deal with these problems, engineers use "noise shaping" to move the noise and distortion up into the supersonic region so we cannot hear it. The result is that they sound as good as a CD. They don't measure as well as a CD, and their flaws are still present, but a human cannot hear their flaws.
Because DSD and SACD do not use a DAC to produce a smooth wave form, they must sample at extremely high rates in order to make the "stair steps" in their wave forms small enough to keep distortion at a reasonably low level. This is why the sampling rate of these formats must be several MHz. DSD samples at 2.8 MHz and some of the newer DSD formats sample as high as 8 MHz.
Sampling at such high frequencies requires a huge amount of data storage. Since data storage costs money, it is very unlikely that these formats will gain wide acceptance or that the major recording labels will release music on this format. It simply makes more sense to use a DAC and PCM encoding to make recordings that are technically better and use less data than DSD. So do not expect DSD to take over the market any more than the now defunct SACD did.
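For a sense of scale, here is a minimal sketch comparing raw data rates. It assumes standard DSD as used on SACD, which takes 1-bit samples at 64 times the CD sampling rate (the 2.8 MHz figure mentioned above), and two-channel stereo with no compression. The function name is made up for this example.

```python
CD_RATE_HZ = 44_100


def dsd_bits_per_second(multiple=64, channels=2):
    """Raw data rate of 1-bit DSD at a multiple of the CD sampling rate."""
    return CD_RATE_HZ * multiple * 1 * channels


cd_bps = CD_RATE_HZ * 16 * 2
dsd_bps = dsd_bits_per_second()

print(CD_RATE_HZ * 64)    # 2822400 samples per second
print(dsd_bps)            # 5644800 bits per second
print(dsd_bps / cd_bps)   # 4.0
```

Under these assumptions, standard DSD needs four times the data of a CD, and the faster DSD variants scale up from there in proportion to their sampling rates.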
Digital recording systems simply produce much more accurate recordings than analog can. For this reason, all modern recordings are made using digital equipment. Also, because digital recordings are essentially perfect, they can be copied repeatedly without any degradation.
By comparison, the serious flaws in analog recording mean that every copy is substantially worse than the one from which it was made. By the time an analog recording has been copied several times, its audio quality is so bad that it is unlistenable. It should now be clear why digital recordings sound better than analog ones.
If digital recording is more accurate, why do some audiophiles think that analog recordings sound better? Poor sound from digital recordings is caused by the poor quality of the recording itself. In other words, garbage in gets you garbage out, no matter how accurate the recording system is.
Most of today's recordings are "engineered" to sound good in cars. To do so, they have been highly compressed, their frequency response has been altered, and they have a lot of artificial reverberation. Such heavily processed recordings do not sound natural when reproduced through a high quality audio system.
Many of the old recordings you find on LPs were made before the availability of inexpensive mixing equipment. So some old recordings were recorded in a very simple and pure way in excellent acoustical environments instead of in sterile recording studios. As a result, these recordings sound very natural and realistic.
What this means is that many modern digital recordings sound awful, despite the perfection of digital recording. Many old recordings sound wonderful, despite the flaws of analog recording.
Because audiophiles fail to do valid testing, they do not understand the cause of the poor sound they hear. So they make false assumptions. They assume that it is the digital recording medium that is the cause of the poor sound quality, when in fact it is the nature of the recording that is at fault.
In summary, digital recording is vastly superior to analog recording. Only digital recording can accurately record music, which is why all serious recording engineers use digital equipment.
To this point, I have discussed the most pure and common type of digital recording -- linear PCM. But the cost of data storage and transmission brings us to what is becoming the most popular digital format -- MP3.
MP3 uses complex computer algorithms to reduce the amount of data required. The algorithms are highly detailed and complex and there is insufficient time to discuss them in detail in this opus. Suffice it to say that MP3 is able to record high quality sound while dramatically reducing data storage and transmission requirements compared to linear PCM (or DSD) recording. The use of MP3 is why iPods can store thousands of songs and why you can listen to music over the internet.
When discussing MP3, it is essential to understand that the recording quality is defined by MP3's data rate. The data rate defines the amount of data that are processed per second and greatly affects the sound quality. The data rate is measured in KBPS (kilobits per second). Basically what this means is that higher data rates improve the sound quality at the expense of higher data storage requirements.
The relationship between data rate and sound quality means that you cannot simply use the term "MP3" without also specifying its data rate. Low data rate MP3 sounds very different from high data rate MP3, so the two are vastly different. Therefore audiophiles can't simply make a blanket statement like "MP3 sounds bad." Faults can be heard in low data rate MP3, but high data rate MP3 can sound flawless.
Specifically, data rates of 64 KBPS and below significantly compromise sound quality. You clearly can hear a difference between the source and the recording at low data rates. The type of music and its inherent amount of compression have a big influence on whether you can hear the difference, but in general low data rates are only acceptable for speech -- not music.
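To see what these data rates mean in practice, here is a small sketch that takes KBPS to mean kilobits per second and compares a few common MP3 rates against the raw data rate of a CD. The four-minute song length is just an example, and the function name is made up for this illustration.

```python
CD_KBPS = 44_100 * 16 * 2 / 1000  # raw stereo CD data rate, 1411.2 kilobits/s


def mp3_size_megabytes(kbps, seconds):
    """File size of a constant-rate MP3 stream, in millions of bytes."""
    return kbps * 1000 * seconds / 8 / 1_000_000


song_seconds = 240  # example: a four-minute song
for kbps in (64, 128, 192, 320):
    print(kbps,
          round(CD_KBPS / kbps, 1),                       # reduction versus CD
          round(mp3_size_megabytes(kbps, song_seconds), 2))
```

At 128 KBPS the four-minute song occupies 3.84 megabytes, against roughly 42 megabytes for the same song as raw CD audio.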
However, when you get to 128 KBPS and higher data rates, it becomes quite difficult to hear any difference between the source and the recording. In my tests with groups of "golden ear" audiophiles, most could not detect a difference between the source and the recording when a 128 KBPS data rate was used.
However, 128 KBPS is not quite perfect. The ability to hear faults is highly dependent on the source material. Typical "pop" recordings with lots of percussive sounds generally sound perfect at 128 KBPS. The most critical material was quiet orchestral works. A sustained piano note was the most taxing test and most listeners could hear a slight difference at 128 KBPS data rates.
If you want MP3 recordings to sound exactly like the source, you must use 192 KBPS or higher. 192 KBPS is considered "CD quality." No human can hear any difference between an MP3 at that rate (or higher) and a CD.
Note that because most audiophiles fail to control the variables in their testing, they are easily deceived. For example, they often hear differences in the sound and blame those differences on the recording format, when in most cases, the format is not responsible for the differences.
If you have not already done so, it is essential that you read my "Testing White Paper" to understand this. When I say that no human can hear any difference between an MP3 at 192 KBPS and a CD, this is true -- but only when the listening test properly controls all the variables.
Some of the better on-line music sources like www.pandora.com give you the option of selecting MP3 at 192 KBPS. Other on-line music sources only use high data rates. For example, www.mog.com streams at 320 KBPS. So be sure you check out the data rate used by your favorite on-line music source. Rest assured that if you use a suitably high data rate, the sound quality will be essentially perfect.
Some audiophiles believe that MP3 sounds bad. But the truth is that it sounds exactly like the source as long as the data rate is 192 KBPS or higher. This is a good thing because the future of music is on-line and music services will continue to use MP3 to conserve data.
In summary, you do not need to buy hi-rez recordings. CD quality is so accurate that you can't hear any difference between a properly-recorded CD and the original microphone feed. Therefore it is impossible to get better sound from any other source. But since the future of music is on-line, you will be getting your music in MP3 format. As long as it has a high data rate, it too will sound flawless.
Balanced vs. Unbalanced Operation White Paper
There is a great deal of confusion and misunderstanding among audiophiles about balanced and unbalanced operation. Most audiophiles feel that balanced operation is better, but they can't tell you why. So let me take this opportunity to explain how the two systems work and describe their advantages and disadvantages.
Naturally it seems sensible to prefer a balanced system. After all, doesn't it seem like a "no-brainer?" Surely a balanced system would be superior to an unbalanced one. Who wants something that is unbalanced?
But the truth is that the term "balanced" is a misnomer. There is actually nothing balanced about a "balanced" system.
So what exactly is a balanced system and how does it compare to an unbalanced one? Simply put, a balanced system is where the signal and the chassis grounds are separated. In an unbalanced system, the two grounds are combined.
What this means is that a balanced system can have better shielding and therefore can be more effective at rejecting external noise fields than an unbalanced system. Balanced systems also can reject certain types of internal noise if this noise is in the form of "common mode."
Specifically, RFI (Radio Frequency Interference) and EMI (Electro Magnetic Interference) are external noise problems that often plague professional applications. Therefore balanced equipment will be required to eliminate the noise.
For example, when I do live recordings in a concert hall, I have to deal with very large amounts of RFI from the multi-thousand watt light dimmers in a concert hall. These light dimmers operate on the principle of pulse-width modulation that causes the dimmers to switch the power to the lights on and off very rapidly (typically 120 times per second). When you switch high power like this, radio frequencies are produced. This can produce "static" in the recording.
The microphones have low output and usually will have long cables connected to them. This makes them very susceptible to RFI. So to eliminate the RFI problem, balancing transformers are used to convert the unbalanced microphone signal to a balanced one for transmission through the cables to the microphone preamplifiers. At the microphone preamplifiers, another transformer will be used to convert the signal back to unbalanced for reasons that I will explain in a moment.
These balancing transformers not only separate the grounds to produce a balanced signal, but they also convert the high impedance of the microphone (typically 50,000 ohms or so) to low impedance (typically 50 ohms). Low impedance not only increases the ability of the system to eliminate the noise, but it also prevents frequency response errors. These would otherwise occur because the relatively high capacitance of long cables, working against a high source impedance, forms an electronic filter that rolls off the high frequencies. So using balanced, low-impedance operation makes it possible to have long cable runs (hundreds of feet) without problems with either noise or frequency response.
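The effect of impedance on the roll-off can be estimated with the standard first-order RC low-pass formula, f = 1 / (2 * pi * R * C). In the sketch below, the cable figure of 30 picofarads per foot is a hypothetical value picked for illustration, so check the specifications of your own cable before relying on the result.

```python
import math


def cutoff_hz(source_ohms, cable_farads):
    """-3 dB point of the filter formed by source impedance and cable capacitance."""
    return 1 / (2 * math.pi * source_ohms * cable_farads)


# Hypothetical cable: 100 feet at 30 pF per foot gives 3 nF in total.
cable_capacitance = 100 * 30e-12

print(round(cutoff_hz(50_000, cable_capacitance)))  # high-impedance source
print(round(cutoff_hz(50, cable_capacitance)))      # low-impedance source
```

With these assumed values, the 50,000 ohm source starts rolling off at about 1 KHz, right in the middle of the audio band, while the 50 ohm source does not roll off until about 1 MHz, far above anything audible.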
There is no free lunch in physics, so you would expect that balanced operation comes with a price. It does indeed. That price is that balanced operation is more complex than unbalanced operation and therefore there are more electronics in the signal path. The additional electronics produce more noise and distortion than you would get from an unbalanced system with similar electronic circuits.
In the case of the microphone preamplifiers in the above example, we want them to be as quiet as possible. If we ran the signal balanced through the preamplifier, we would have at least two input transistors or tubes, and more likely, double the entire preamp electronics.
As a result, the preamp would be twice as noisy as operating it unbalanced. So to eliminate the noise at this critical location in the signal path, it is standard practice to unbalance the signal where it enters the microphone preamplifiers so that there is only one input device used.
Another example of where balanced electronics are necessary is when running long cables from a recording booth to a mixing room. In this case, there are usually many cable channels running together inside a metal conduit. These cables will have audio signals running through them that can induce electrical currents in their neighboring cables. This causes crosstalk, where the signals and sound from nearby cables "bleed" into the ones next to them.
Once again, balanced operation offers superior shielding and this can prevent the crosstalk. Similar problems are present in most professional applications on stage, so balanced operation has become standard for professional use.
Yet another advantage of balanced operation is the ruggedness of the connectors. Professional XLR connectors have locks on them that prevent them from becoming disconnected accidentally. Can you imagine a performer on stage holding a microphone whose cable was held in place by an RCA connector? It wouldn't take much motion on the part of the performer before the cable would become disconnected. This won't happen with a locked XLR connector.
Another advantage of balanced operation is with internally produced hum. If your electronics have a significant amount of hum, this hum will appear on both phases of a balanced system. In other words, the hum will be "common" to both phases.
The hum is identical on both phases, while the music signal on one phase is inverted relative to the other. The receiving circuit takes the difference between the two phases, so the hum cancels out while the music signal is preserved, producing a quiet background. This is known as common mode rejection and does not occur in an unbalanced system because only one phase is used. So any internal hum will be passed through an unbalanced system while it will be rejected in a balanced one.
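Here is a minimal sketch of common mode rejection in action. It assumes an ideal receiver that simply subtracts one phase from the other, and the signal levels and frequencies are arbitrary values chosen for illustration. Real circuits are never perfectly matched, so they reject most of the hum rather than all of it.

```python
import math


def differential_receiver(hot, cold):
    """An ideal balanced input: the output is the difference between the two phases."""
    return [h - c for h, c in zip(hot, cold)]


n = 480
music = [math.sin(2 * math.pi * 1_000 * i / 48_000) for i in range(n)]
hum = [0.5 * math.sin(2 * math.pi * 60 * i / 48_000) for i in range(n)]

hot = [m + h for m, h in zip(music, hum)]    # music plus hum
cold = [-m + h for m, h in zip(music, hum)]  # inverted music plus the same hum

out = differential_receiver(hot, cold)

# Whatever is left after removing the doubled music signal is residual hum.
residual_hum = max(abs(o - 2 * m) for o, m in zip(out, music))
print(residual_hum)
```

The hum that is common to both phases drops out in the subtraction, while the music signal comes out at twice the level of a single phase.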
Now let's examine the use of balanced operation in audiophile systems. To begin, it should be noted that RFI and EMI are almost never a problem in the typical home environment. So there is rarely any need to operate the system balanced to eliminate external noise.
Secondly, audiophile equipment usually is very well designed, so there is no audible hum present in it. Therefore the common mode noise rejection feature of balanced operation will not be needed.
Thirdly, audiophiles generally do not like transformers in the signal path. So audiophile equipment does not use balancing transformers. Instead, the conversions back and forth between balanced and unbalanced operation are done using electronics. This adds noise, which does not occur if transformers are used. But both electronics and transformers produce distortion, which degrades the performance slightly compared to unbalanced operation.
Fourthly, because transformers are not used, the impedance in a balanced audiophile system is not changed to a lower value compared to an unbalanced system. So the cable lengths in an audiophile system will have approximately the same limits regardless of whether the cables are balanced or unbalanced.
In short, the advantages of balanced systems will not be needed or used in most audiophile applications. But balanced operation still causes higher noise and distortion than what you find in the same electronics when run in unbalanced mode.
Now it must be said that the slight increase in noise and distortion caused by balanced operation cannot be heard by the human ear. It requires test equipment like a spectrum analyzer to reveal the difference. But if you don't want to compromise, and you want the most pure sound with the least noise and distortion, unbalanced operation is superior to balanced operation.
XLR connectors are far more bulky and awkward to use than RCA connectors. While XLR connectors are more rugged than RCA connectors and are required in professional applications, RCA connectors are much more compact, easy to use, and more than adequate in home environments.
Despite the penalties and limitations of balanced operation, many audiophiles believe that balanced operation sounds best. How can this be so when it is easily proven that balanced operation has more noise and distortion than unbalanced operation?
The reason is that balanced operation has more gain. Since two phases are used in balanced operation instead of one as in unbalanced operation, balanced operation plays 6 dB louder than unbalanced operation.
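The 6 dB figure follows from doubling the voltage. A short check, assuming the balanced output voltage is exactly twice the unbalanced one (the function name is just for illustration):

```python
import math

def voltage_ratio_to_db(ratio):
    """Convert a voltage ratio to decibels (20 * log10 of the ratio)."""
    return 20 * math.log10(ratio)

# Two phases of equal amplitude give twice the voltage of one phase.
gain_db = voltage_ratio_to_db(2.0)
print(round(gain_db, 2))  # 6.02
```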
Understand that when music is louder, it sounds better to us. Humans can barely hear the loudness difference when music is played 6 dB louder, so most audiophiles will not recognize that when they switch to balanced operation the music is 6 dB louder. Instead they will recognize that it sounds "better" due to the fact that the music is slightly louder.
But the truth is that the music actually sounds the same. If the listener went to the trouble to accurately match levels, he would not hear any difference between balanced and unbalanced operation (even though balanced operation is actually slightly worse).
The most important point I have saved to last. That is that the balanced and unbalanced output of a component simply cannot sound different because the signal is the same regardless of its mode of operation.
Simply put, the signal produced by the component is doubled into two phases in balanced operation, while just one signal is used in unbalanced operation. But since the two signals are actually the same, they cannot sound anything but identical.
So in summary, from a practical standpoint, it really doesn't matter whether you use balanced or unbalanced operation in most home audio systems. Both will sound identical to human hearing even though balanced operation has slightly more distortion and noise.
Personally, I prefer the simplicity and purity of unbalanced operation even though I will freely admit that I cannot hear any difference between the two. I only use balanced operation when I need to eliminate external noise, such as when doing live recording.
Dispersion White Paper
I eventually solved the problem of beaming in 1979 when I invented a way of curving a free-standing, tensioned membrane. It was my design that Martin Logan called the "Curvilinear" ESL and produced starting in 1981.
After a lot of study and experimentation with the curved panel, I came to the clear conclusion that it was actually a poor design. I abandoned it in favor of a planar ESL. Why would I fail to use my own invention?
Here is where the physics get very interesting. I eventually came to understand that there are three serious problems caused by wide-dispersion in speakers. These are poor frequency response, poor transient response, and poor imaging.
Let's look at each of these issues in detail. For examples in this discussion, I will refer to two, imaginary speaker systems. For simplicity and to eliminate confusion, we will assume that both speaker systems are perfect in every way. The only difference between them is that one will have wide dispersion and one will have narrow dispersion.
Now let's examine what happens to frequency response in both of these speakers. We will hear the sound from the narrow dispersion speaker being beamed directly to us at the sweet spot. Therefore the frequency response from the narrow dispersion speaker will sound perfect (because we defined the speakers as perfect in every way).
But the vast majority of the sound from the wide-dispersion speaker will be sprayed all over the room rather than being beamed directly to our listening location. We will therefore hear most of the sound from this speaker after it has bounced off various surfaces in the room and is eventually reflected to us at the sweet spot.
Because the reflected sounds have to travel a greater distance to reach us than the direct sound from the speaker, they arrive later, delayed by the extra travel time at the speed of sound. Another way to say this is that reflected sound will be out-of-phase with the direct sound. The phase angle is determined by the length of the time delay relative to the period (or equivalently, the extra path length relative to the wave length) of the particular frequency of interest.
For simplicity, let's examine just one of these reflections and how it interacts with the direct sound from the speaker. Let's assume that the magnitude of both the direct and reflected sound is the same (in real life, the magnitudes can vary all over the map).
If a 1 KHz tone arrives directly from the speaker at 80 dB, it will mix with the reflected sound (at 80 dB) at some phase angle depending on how much the reflected sound was delayed. Let's assume that this particular wave arrives 180 degrees out-of-phase. If so, the reflected sound would completely cancel the direct sound and you would hear nothing.
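If the reflected sound arrived 90 degrees out-of-phase, the two would only partly reinforce each other, raising the level by about 3 dB. If it arrived 360 degrees out-of-phase (a full cycle late, and so back in step with the direct sound), it would increase the sound pressure by 100%, or 6 dB, etc.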
As you move up and down the frequency range, this particular delayed reflection would interact with each frequency differently (because the wave lengths are different so the phase angle would be different for each frequency). The result is that you would hear (and measure) the speaker's frequency response as consisting of severe, alternating peaks and troughs that look like the teeth of a comb -- hence we call such frequency response a "comb filter."
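The comb filter from a single reflection can be sketched numerically. This assumes one reflection of exactly the same strength as the direct sound, a speed of sound of 343 metres per second, and a hypothetical extra path length. The function names are made up for the illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed room-temperature value

def reflection_delay(extra_path_m):
    """Delay in seconds of a reflection that travels extra_path_m farther."""
    return extra_path_m / SPEED_OF_SOUND

def combined_level_db(freq_hz, delay_s):
    """Level of direct + equal-strength reflection, relative to direct alone.

    The reflection lags by a phase angle of 360 * freq * delay degrees.
    Returns -inf at a complete cancellation.
    """
    phase = 2 * math.pi * freq_hz * delay_s
    # Phasor sum of two unit-amplitude tones separated by 'phase'.
    magnitude = math.hypot(1 + math.cos(phase), math.sin(phase))
    if magnitude < 1e-12:
        return float("-inf")
    return 20 * math.log10(magnitude)

def null_frequencies(delay_s, max_hz):
    """Frequencies where the reflection arrives 180 degrees out of phase."""
    nulls = []
    k = 0
    while (2 * k + 1) / (2 * delay_s) <= max_hz:
        nulls.append((2 * k + 1) / (2 * delay_s))
        k += 1
    return nulls

delay = reflection_delay(0.343)  # hypothetical 0.343 m extra path, about 1 ms
for f in (250, 750, 1000, 2000):
    print(f, round(combined_level_db(f, delay), 1))
print(null_frequencies(delay, 3000))
```

With a delay of about one millisecond the cancellations fall near 500, 1500, and 2500 Hz, with peaks in between. A different delay moves every tooth of the comb.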
Now a comb filter sounds perfectly awful. Conventional, wide-dispersion speakers would be unlistenable if it weren't for the fact that there are thousands of delayed reflections in a room, and they are all random. As a result, there are thousands of comb filters formed and to a considerable degree, they can average themselves out so that the frequency response from a wide dispersion speaker is tolerable.
But this problem assures that the frequency response will never be perfect in a wide dispersion speaker. It is often necessary to move the speaker around to get satisfactory response, which is why speaker placement is so important.
The very short and intense reflected sounds from walls directly beside the speaker are particularly troublesome as they tend to dominate the sound. Hence, smart audiophiles have discovered the trick of using sound-absorbing "room treatment" near the sides of wide dispersion speakers to help achieve reasonable frequency response.
You can now see how the frequency response of a speaker is seriously degraded by room acoustics. This background information will make it obvious why transient response is degraded by room reflections as well.
For simplicity, let's examine a single, sharp transient (like a rimshot from a drum). The transient coming from the narrow dispersion speaker will be perfect (because we said the speaker was perfect). But what happens to the transient from the wide-dispersion speaker?
Once again, most of the transient sound is blasted all over the room where it bounces off various surfaces and eventually arrives at the sweet spot after being delayed by various amounts depending on the distances each reflection has to travel. So instead of hearing one, crisp transient, you will hear "popcorn." Like a pan of popcorn popping, we will hear a whole bunch of identical transient sounds separated by very short intervals.
These delayed transient sounds are actually echoes. But the typical listening room is too small for the delayed sounds to be separated by a sufficient period of time for our brains to recognize them as distinct echoes.
It is a fact that the delayed sounds are distinct. We can see that on an oscilloscope. But we don't hear them that way -- as a multitude of distinct transients. We hear them as one sound.
Our brains have learned to understand that having a bunch of rimshots close together means that there was only one rimshot with room acoustics following it. So our brains do one of their psychoacoustic tricks and sweep all the delayed sounds together to form one rimshot.
But there is a "catch." And that is that the transient time of the rimshot now includes all the delayed sounds following it. This extends the transient time we perceive with the result that the transients are now "smeared."
If you doubt this, just think of the last time you heard headphones. I'm sure you will agree that the transients you hear in headphones are crisper and cleaner than what you hear from wide-dispersion speakers. This isn't because headphone drivers are so good (actually most are pretty bad), it's simply because there are no room acoustics in headphones to mess up the sound.
Now let's look at imaging. A holographic image will be 3-dimensional. It will not only have left/right information, but it will have depth.
Left/right position in the image is determined by loudness differences between the two channels. But depth is defined by timing information (phase).
The reason you hear the violins in a symphony orchestra to be in front of the brass is because their sounds reach your ears slightly sooner than the sounds from the trumpets and trombones. This phase information has to be preserved in the recording and then reproduced accurately by the speakers for you to get depth in the image.
By now, you probably know what I am going to say. The sound from the narrow dispersion speaker will supply proper phase information to your ears. The sound from the wide dispersion speakers will be blasted all over the room and the delayed sounds will completely confuse the phase information. As a result, the sound from wide dispersion speakers (and omni speakers are the worst) will have very diffuse and ill-defined imaging.
In summary, the frequency response, transient response, and imaging of a loudspeaker are ruined by room acoustics. So to achieve outstanding performance, a loudspeaker must eliminate room acoustics as much as possible.
My curved electrostatic panel failed to eliminate room acoustics. I found that a planar panel was far superior because it beams the sound directly to you and eliminates the room acoustics. So I abandoned the curved panel. Once you hear highly directional panels, you will immediately understand.
Many audiophiles believe that a good speaker should have a wide sweet spot. But this is a physical oxymoron.
The laws of physics dictate that all stereo speakers will have an infinitely small sweet spot, regardless of their high frequency dispersion. That spot is when you are exactly equidistant from both speakers. Only when you are equidistant from the speakers can the phase information arrive at your ears simultaneously from both speakers. Obviously, there is no hope of imaging well if the sounds from both speakers do not arrive at the same time as the phasing will be destroyed.
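To put rough numbers on this, here is a small sketch of how the arrival times from the two speakers drift apart as you move off center. The room layout (speakers 2.4 m apart, listener 3 m back) and the 343 m/s speed of sound are assumptions chosen only for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed

def arrival_difference_ms(listener_x, listener_y, half_spacing):
    """Arrival-time difference (ms) between left and right speakers.

    Speakers sit at (-half_spacing, 0) and (+half_spacing, 0);
    the listener sits at (listener_x, listener_y), all in metres.
    """
    d_left = math.hypot(listener_x + half_spacing, listener_y)
    d_right = math.hypot(listener_x - half_spacing, listener_y)
    return (d_left - d_right) / SPEED_OF_SOUND * 1000.0

# Hypothetical layout: speakers 2.4 m apart, listener 3 m back.
for offset in (0.0, 0.1, 0.3, 0.6):
    print(offset, round(arrival_difference_ms(offset, 3.0, 1.2), 3))
```

Only the zero offset gives a zero time difference. Any sideways movement, however small, makes one speaker arrive ahead of the other.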
For a speaker to have a wide sweet spot simply means that the phase information from the room is confusing the sound so badly that you can't even tell when you are in the sweet spot and when you aren't. A wide sweet spot is a guarantee that a speaker has poor imaging and transient response.
I don't compromise. I want narrow dispersion in my speakers to minimize room acoustics so that I can get the best possible sound.
"Beaming" is not a fault. It is a huge advantage. It is the only way to achieve truly high performance in a loudspeaker.
Audiophiles sometimes say that narrow dispersion speakers require you to have your head in a vise. This is nonsense. You just sit in your listening chair and listen as you would to any speaker.
And what about the off-axis performance of a narrow dispersion speaker? Well, they sound just like wide dispersion speakers when you are off-axis.
That is to say that when you are off-axis, you hear the room acoustics, not the speaker. So my speakers sound just fine off-axis for casual listening. Of course, the image is diffuse and of poor quality, just like a wide dispersion speaker when you are off-axis. So for serious listening, you need to be at the sweet spot. This is true for all speakers.
As an aside, it is best to NOT absorb the rear wave from my speakers. Let it help energize the room for off-axis listening. That way the highs will be preserved off-axis.
Finally, the question arises, "Why doesn't the reflected sound from the dipole beams mess up the phase just like in a wide dispersion speaker?" The answer is that a dipole radiator produces only one reflected beam rather than thousands of reflections.
Also this beam has to bounce off many surfaces before it finally reaches the sweet spot. In so doing, it is greatly delayed and attenuated. It arrives at the sweet spot so late and attenuated that our brains simply ignore it.
Of course, directionality only applies to midrange and high frequencies. Bass is omnidirectional in all speakers due to the long wave lengths involved. So bass room resonances still need to be addressed.
There is one perilous pitfall you need to be aware of with regards to delayed reflections in all speakers, even narrow dispersion speakers. This involves sitting close to a wall directly behind you, which is very bad.
Typically this occurs if a couch is used for a listening chair. Most of the time a couch is pushed up against a wall. If you sit in it, your ears are only a few inches from a wall that is perpendicular to your head. As the sound beam from a planar speaker arrives at your ears, some of it passes by your head, bounces off the wall, and returns to your ears after a very short delay.
This one reflection is very powerful. It is almost as intense as the direct beam and it will seriously mess up the frequency response and transient response of the sound.
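A rough sketch of why this reflection is so damaging. It assumes a hard, flat wall that reflects without loss or polarity change and a speed of sound of 343 m/s. The distances are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # metres per second, assumed

def rear_wall_reflection(ear_to_wall_m):
    """Return (delay in ms, first cancellation frequency in Hz).

    The reflected sound travels to the wall and back, so the extra
    path is twice the ear-to-wall distance. The first cancellation
    occurs where that delay equals half a period.
    """
    delay_s = 2.0 * ear_to_wall_m / SPEED_OF_SOUND
    first_null_hz = 1.0 / (2.0 * delay_s)
    return delay_s * 1000.0, first_null_hz

# Illustrative distances: head near a couch back vs. a chair out in the room.
for distance in (0.1, 0.5, 1.5):
    delay_ms, null_hz = rear_wall_reflection(distance)
    print(distance, round(delay_ms, 2), round(null_hz))
```

With the ears 10 cm from the wall, the first cancellation lands in the midrange and the delay is well under a millisecond. Moving the chair out into the room pushes the first cancellation down in frequency and lengthens the delay.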
To avoid this problem, ideally you should sit in a chair out in the room away from the wall. Even better would be to have the system on a diagonal so that the wall behind you is angled and reflects the rear reflection away from you instead of back at you. Diagonal room placement is also best for controlling the bass resonances from a speaker.
If your room decor makes this impractical, then there are several options you have. First, you can put sound absorbing material on the wall behind your head. This could be as simple as a very soft pillow that you set on the back of the couch so that it is behind your ears. This is not as effective as I would like because it will only absorb high frequencies. The midrange will still have a reflection.
Probably the best compromise is to have a movable listening chair. When not being used, the chair can be placed in the room wherever it pleases your wife. When you want to listen seriously to music, you move the chair out into the room where you have identified the sweet spot (usually with tiny pieces of tape on the floor). Then the sound can be spectacular.
Tubes vs. Transistors White Paper
Many audiophiles question my recommendation for using solid state amplification for driving ESLs since tube equipment is often used for this purpose. There is also a lot of confusion and controversy about tubes and transistors in general. The purpose of this discussion is to examine the pros and cons between the two, present some solid engineering and scientific evidence used to evaluate them, and then draw some reasonable conclusions you can use in deciding what to use.
To do so, I must first give you some history and discuss some technical issues in a way that I hope will be understandable. I have designed and built tube amps going all the way back into the 60's. My most memorable design was a Class A, high voltage, transformerless output, direct-coupled tube amp for driving electrostats. I published the design in "The Audio Amateur" magazine back in 1976. You can still read it on my website at:
http://sanderssoundsystems.com/downloads/TheAudioAmateur0May1976.pdf
I have also used many other tube amps over the years and have also helped design the iTube, which was a conventional tube amp that was optimized to drive ESLs. The point I'm trying to make is that I don't have any bias towards one or the other type of device. I've used, designed, built, and marketed both types.
So what I am about to say does not come from any particular partisan point of view. It is simply what I have learned over the last 38 years of research into producing the best sound I could.
I have been in the unusual position (for an audiophile) of having a fully-equipped test bench, including a spectrum analyzer. This has made it possible for me to carefully do both measurements and listening tests to correlate the two and to find out the reasons we hear the things we do.
This research has been fascinating and very educational. It has also made it possible for me to develop truly high-performance electronics.
There is no doubt that we all hear differences between tube and transistor amplifiers. The big question is what is causing the differences we hear between them. After all, well-designed examples of both types measure well enough that we should not hear any differences between them. So what gives?
I spent a lot of time looking for the reasons. It was an extremely interesting and entertaining search. I don't have time to explain all the work I did over the years in these studies in this message, but will be happy to discuss them over the phone (303 838 8130) if any reader wants to know. I'll just have to summarize here.
To begin, you need to understand how much power is required to play musical peaks cleanly and without clipping an amplifier. It takes a surprising amount.
To see what is going on with an amp when playing music only requires an oscilloscope. These are very fast (the slowest ones will show 20 MHz) and will clearly show amplifier peak clipping when music is playing. A meter is too slow to do so. A 'scope is cheap (you can get them for $100 on eBay all day long). So you don't have to take my word for what I am about to explain. Feel free to get your own 'scope and examine your system's performance.
You simply connect the 'scope across your speaker or amplifier terminals (which are electrically the same), and adjust the horizontal sweep to be as slow as you can while still seeing a horizontal line on the screen. Don't go so slowly that you see a moving dot.
Now play dynamic music at the normally loud levels you enjoy. Adjust the vertical gain on the 'scope so that the trace stays on the screen.
As music plays, you will clearly see if clipping occurs. The trace (which will just be a jumble of squiggly lines) will appear to hit an invisible brick wall. It will appear as though somebody took a pair of scissors and clipped off the top of the trace. That's where the term "clipping" comes from.
If you see clipping at the levels at which you like to listen, then you are not using a sufficiently powerful amplifier to play your music cleanly. Your system is compromised because your amplifier will have compressed dynamics, sound strained, lose its detail, and have high levels of distortion.
The 'scope will be calibrated so that you will know the voltage at which clipping occurs by observing the grid lines. If you know the voltage and the impedance of your speakers, you can easily calculate the power.
Power is the voltage squared, divided by the impedance. So if the 'scope measures 40 volts at clipping, and you are driving 8 ohm speakers, you know that 200 watts are being produced at clipping -- and this is insufficient power for your particular system because it is clipping.
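The arithmetic can be checked in a few lines. One assumption is worth flagging: the formula gives average power when the voltage is an RMS value. If the 40 volts is read from the 'scope as the peak of a sine wave, the equivalent average power is half as much. The function names are illustrative.

```python
def power_watts(volts_rms, impedance_ohms):
    """Power into a resistive load: voltage squared divided by impedance."""
    return volts_rms ** 2 / impedance_ohms

def peak_to_rms(volts_peak):
    """RMS value of a sine wave with the given peak voltage."""
    return volts_peak / 2 ** 0.5

# The example from the text: 40 volts into 8 ohms.
print(power_watts(40.0, 8.0))  # 200.0
# Assumption: if the 40 volts is read as the peak of a sine wave,
# the average power is half as much.
print(round(power_watts(peak_to_rms(40.0), 8.0), 1))  # 100.0
```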
You will find that conventional, direct-radiator (not horn-loaded), magnetic speaker systems of around 90 dB sensitivity, require around 500 watts/channel to avoid clipping. More power is needed in larger rooms or if you like to play your music more loudly than most.
The key point I'm trying to make is that audiophiles usually are using underpowered amplifiers and are therefore listening to clipping amplifiers most of the time. When an amplifier is clipping, it is behaving (and sounding) grossly differently than its measured performance would suggest. This is because we always measure amplifiers when they are operating within their design parameters -- never when clipping. A clipping amp has horrible performance, so attempting to measure it is a waste of time.
In other words, we usually listen to an amplifier when it is clipping and we measure it when it is not. This is why amplifiers sound so different than their measurements would imply. It is not that measurements are wrong, it is simply that we are listening and measuring different conditions.
It is essential to understand that when an amp is clipping, it will sound quite different than when it is not clipping. It is also important to realize that different types of output devices (tubes vs. transistors) clip in very different ways, so they sound quite different when they are clipping.
Finally, it is important to realize that an amp does not instantly recover from clipping. It takes several milliseconds for its power supply voltage to recover, for it to recharge its power supply capacitors, and for its internal circuitry to settle down and operate properly again. Therefore, even though an amp may only be clipping on the musical peaks, it will not immediately operate properly at average music levels where it is not clipping.
It should now be obvious why objective measurements don't seem to give much insight into the performance of amplifiers. It is not that objective measurements aren't accurate (they are superb), but simply that we don't usually operate amplifiers within their design parameters. So we aren't listening to them at the power levels where they operate properly and where their measurements are meaningful.
Now let's analyze tube and transistor equipment with regards to clipping, since that is the condition to which we usually listen. There is "hard" and "soft" clipping. If you go back to the oscilloscope investigations, you will see that solid state amps clip "hard" in that there is an absolute, rock-solid, limit to how loudly they will play. As soon as you reach that point, they immediately clip. This point is their power supply rail voltage.
A tube amp clips "softly." This is because tubes produce a cloud of electrons around their cathodes. This cloud has surplus electrons available so that for sudden current surges (such as musical peaks), a tube can deliver more current (electrons) and voltage for a few milliseconds before it clips. So its clipping threshold is not rigidly fixed as it is in a transistor amp. It varies depending on the dynamics of the music played.
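The difference in shape between the two kinds of clipping can be sketched with two simple limiter functions. The hard limiter stops dead at the rail voltage. The soft limiter uses a tanh curve, which is only a convenient stand-in for a gradual limit and is not a model of the electron cloud or of any real tube. The voltages are hypothetical.

```python
import math

def hard_clip(sample, rail):
    """Limit the sample abruptly at the supply rail voltage."""
    return max(-rail, min(rail, sample))

def soft_clip(sample, rail):
    """Round the peaks off gradually; tanh is an illustrative stand-in only."""
    return rail * math.tanh(sample / rail)

rail = 40.0   # hypothetical clipping voltage
drive = 60.0  # hypothetical peak voltage the music is asking for
for degrees in range(0, 91, 15):
    wanted = drive * math.sin(math.radians(degrees))
    print(degrees, round(wanted, 1),
          round(hard_clip(wanted, rail), 1),
          round(soft_clip(wanted, rail), 1))
```

The hard-clipped column follows the wanted voltage exactly and then stops at the rail, which is the "brick wall" seen on the 'scope. The soft-clipped column bends over gradually and never quite reaches the rail.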
The age of the tube matters a lot in this situation. As a tube ages, its emissions decrease and it cannot develop as many electrons in the cloud. So old tubes will tend to hard clip while new tubes will tend to soft clip.
Transistor amps usually must employ protective circuitry. Tubes do not need any. Protective circuitry will trigger anytime a transistor amp "sees" an excessive or dangerous load. Generally, this means that most transistor amps will trigger their protective circuitry at or about the time of clipping. They will also go into their protection modes at very low power levels if they see difficult loads (like electrostatic speakers).
Protective circuitry works by switching off the power to the output transistors for very brief periods of time. Well-designed protective circuitry will trigger on and off hundreds or even thousands of times per second to limit the power that the output transistors must handle.
Protective circuitry sounds awful. It literally puts gaps in the music, which adds a type of grainy quality to the sound. But more importantly, anytime you flip a switch, whether it is a light switch or an output transistor, you will get a voltage spike. So protective circuitry will replace a smooth musical signal with a chopped up one that has voltage spikes on each side of the gaps in the music. Is it any wonder that transistor amps sound harsh when clipping?
In addition, when a tube amp clips, it produces a lot of lower harmonics in its distortion profile. Low harmonics are relatively benign and don't sound too bad. But distortion is still distortion and these harmonics don't belong there. Also, just because a tube amp makes a lot of lower harmonics, doesn't mean that it doesn't also make higher harmonics. It does. And high harmonics tend to sound dissonant and unpleasant.
This is easily seen on a spectrum analyzer, which shows each harmonic and the percentage of distortion it adds to the sound. It truly is an amazing tool.
Transistor amps tend to produce a lot of high harmonics. This is actually due more to the operation of their protective circuitry and all the spikes it produces. So generally, transistor amps will have more of the unpleasant higher harmonics than do tube amps.
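What a spectrum analyzer reports can be imitated in a small way by taking one cycle of a clipped sine wave and measuring each harmonic with a plain discrete Fourier transform. This sketch covers only an idealized hard limit. It does not model protective circuitry or tube behavior, and the drive level is a hypothetical value.

```python
import math

def harmonic_levels(waveform, count):
    """Amplitude of harmonics 1..count of one cycle of a waveform (plain DFT)."""
    n = len(waveform)
    levels = []
    for k in range(1, count + 1):
        re = sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(waveform))
        im = sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(waveform))
        levels.append(2 * math.hypot(re, im) / n)
    return levels

def clipped_sine(drive, rail, samples=512):
    """One cycle of a sine of peak 'drive', hard-limited at +/- 'rail'."""
    return [max(-rail, min(rail, drive * math.sin(2 * math.pi * i / samples)))
            for i in range(samples)]

# Hypothetical case: the signal asks for 1.5 times the available rail voltage.
levels = harmonic_levels(clipped_sine(1.5, 1.0), 9)
fundamental = levels[0]
for number, level in enumerate(levels, start=1):
    print(number, round(100 * level / fundamental, 2), "% of fundamental")
```

Because this limiter clips the top and bottom of the wave equally, the even harmonics come out essentially zero and the distortion appears in the odd harmonics.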
It is important to note that if a transistor amp does not have any protective circuitry, its distortion profile will be much more similar to a tube amp than to a transistor amp with protective circuitry. The effect of protective circuitry is a very critical issue in the sound of solid state amps and should be more widely recognized for the problems it introduces to the sound.
What all this boils down to is that clipping tube amps sound rather soft and smooth. Clipping solid state amps sound harsh and edgy. I think it is safe to say that we would all agree that if you must listen to a clipping amp, a clipping tube amp is more pleasant than a clipping solid state amp.
It should now be apparent where "tube sound" and "transistor sound" come from. They come from the sound of clipping amplifiers, which do indeed sound quite different.
Of course, when clipping, neither amplifier sounds good. They both lose their dynamics, sound "mushy", lose their detail, sound strained, tend to sound harsh (particularly transistor amps), and are somewhat distorted.
Note carefully that human hearing is rather insensitive to transient distortion, so even though both amps will produce several tens of percent distortion when clipping, we generally won't recognize the distortion for what it is, because it is too brief. Instead we will perceive and describe the sound as "harsh", "strained", "fatiguing", "muddy", etc.
To have a truly high-fidelity music system therefore requires very powerful amplifiers. Amplifier power is the single most important factor in choosing an amp. Without adequate power, all amplifiers sound bad. You can pick a clipping amplifier based on it not sounding as bad as another amplifier (tubes usually preferred over transistors), but if you really want clean, dynamic, effortless, and smooth sound, you simply must use adequate amplifier power.
In short, my take on amplifiers is to use a tube amp that clips gracefully if I must listen to a clipping amp. But I'd rather have an amplifier with so much power that it never clips! The sound from powerful amps is dramatically better than underpowered amps, even if they clip nicely.
There are three quality criteria that a good amp must meet. It must have inaudible noise, it must have flat frequency response, and it must have distortion of less than 1%.
Interestingly, tests conclusively show that humans cannot hear distortion of less than 1%. So even though one amp may have 1% distortion and another 0.001% distortion, they will both sound identical to us.
My spectrum analyzer will show distortion down to around one ten thousandth of one percent (0.0001%). It shows amazing differences between properly operating amplifiers. But as long as those distortion levels are below 1%, the amps will not sound any different to us.
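Distortion percentages are often quoted in decibels below the signal. A small conversion sketch, using the standard 20 * log10 relationship for amplitude ratios (the function name is illustrative):

```python
import math

def distortion_percent_to_db(percent):
    """Express a distortion percentage as decibels relative to the signal."""
    return 20 * math.log10(percent / 100.0)

for percent in (1.0, 0.001, 0.0001):
    print(percent, round(distortion_percent_to_db(percent)), "dB")
```

On this scale 1% is 40 dB below the signal, 0.001% is 100 dB below, and 0.0001% is 120 dB below.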
It should now be clear that tubes only sound significantly different than transistors when you are listening to clipping amps. If the amps aren't clipping, or if you are using a component that doesn't clip (like a preamp), you won't hear any significant difference between well-designed tube and transistor equipment.
So a hybrid amp (tube front end and transistor output stage) that is not clipping will not sound any different than a pure tube or transistor amp. If it is clipping, it will sound like a transistor amp, not a tube amp, because it is the type of output stage that determines the sound of a clipping amp.
With the historical and general information covered, I can now turn directly to your question. So let's examine tube and transistor amplifiers with respect to their performance with ESLs (because I am a manufacturer of ESLs).
Recall that the basic quality performance criteria require that an amplifier have flat frequency response. This is a huge problem for tube amps due to impedance variations in the load. Let me explain.
One of the laws of physics states that the source impedance must be lower than the load impedance or the load will be starved for current. What this translates to is that the amplifier's output impedance must be lower than the speaker's input impedance or the frequency response will be rolled off in those areas where there is this impedance mismatch.
Tubes are inherently high impedance devices. A large power tube like a 6550 or KT-88 has an output impedance of around 2,000 ohms. By comparison, a large power transistor has an output impedance of less than one ohm.
Tubes cannot drive loudspeakers directly due to their high impedance. To correct this problem, output transformers are used in most tube amps. These transformers have specific turns ratios that will convert the tube's impedance from several thousand ohms to typically 4, 8, or 16 ohms.
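An ideal transformer changes impedance by the square of its turns ratio, so the ratio needed is the square root of the impedance ratio. A quick sketch, assuming an ideal lossless transformer and using the roughly 2,000 ohm figure quoted above (real output transformers are chosen with other factors in mind as well):

```python
import math

def turns_ratio(primary_ohms, secondary_ohms):
    """Turns ratio needed to transform one impedance into another.

    An ideal transformer scales impedance by the square of its turns
    ratio, so the ratio is the square root of the impedance ratio.
    """
    return math.sqrt(primary_ohms / secondary_ohms)

# Using the roughly 2,000 ohm tube figure from the text.
for tap in (4.0, 8.0, 16.0):
    print(tap, round(turns_ratio(2000.0, tap), 1))
```

The same ratio that steps the impedance down also steps the voltage down, which is why the lower impedance taps deliver less voltage.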
Therefore, if you use the 8 ohm taps on the amplifier's output transformer with an 8 ohm loudspeaker, there should be no impedance mismatch, the frequency response should be linear, and the amp should deliver its maximum power. Unfortunately, this is never the case because loudspeakers do not have a constant impedance across their full frequency bandwidth.
Look at the impedance curve of any conventional loudspeaker and you will see that it varies from slightly below its "nominal" impedance to around 50 ohms. This will cause the frequency response from a tube amp to have errors. This is also another reason why tube amps sound different from transistor amps.
This impedance problem is relatively minor when dealing with conventional, magnetic speakers. But an electrostatic speaker is an entirely different animal. An ESL is a capacitor, not a resistor like a magnetic speaker. The impedance of a capacitor is inversely proportional to frequency. Therefore the impedance of an ESL typically varies from around 150 ohms in the midrange to about 1 ohm at 20 KHz.
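The inverse relationship between a capacitor's impedance and frequency is easy to tabulate. This sketch treats the speaker as a pure capacitor that measures 1 ohm at 20 kHz, which is a simplifying assumption. A real ESL with its step-up transformer will not follow this curve exactly.

```python
import math

def capacitor_impedance(freq_hz, capacitance_farads):
    """Magnitude of a capacitor's impedance: 1 / (2 * pi * f * C)."""
    return 1.0 / (2 * math.pi * freq_hz * capacitance_farads)

def capacitance_for(impedance_ohms, freq_hz):
    """Capacitance that has the given impedance at the given frequency."""
    return 1.0 / (2 * math.pi * freq_hz * impedance_ohms)

# Assumption: treat the speaker as a pure capacitor that measures
# 1 ohm at 20 kHz, then see how the impedance rises as frequency falls.
c = capacitance_for(1.0, 20000.0)
for f in (20000, 10000, 5000, 1000, 500):
    print(f, round(capacitor_impedance(f, c), 1))
```

Each halving of frequency doubles the impedance, so the load an amplifier sees changes enormously across the audio band.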
A tube amp will be able to drive the high impedance frequency bandwidth (the midrange and lower highs) of an ESL with linear frequency response. However, at higher frequencies, the impedance of the ESL will drop below the impedance of the amplifier and the amp will then roll off the highs to some degree depending on the exact impedance mismatch and the frequencies involved.
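The size of the roll-off can be estimated by treating the amplifier as an ideal voltage source in series with its output impedance, which forms a simple voltage divider with the speaker. This sketch makes two simplifying assumptions: it treats both impedances as plain resistances, and it uses hypothetical source impedances. The 150 ohm and 1 ohm load values are the ones quoted above.

```python
import math

def level_at_load_db(source_ohms, load_ohms):
    """Level delivered to the load, in dB relative to the unloaded output.

    Treats the amplifier as an ideal voltage source in series with a
    resistive output impedance (a simple voltage divider).
    """
    return 20 * math.log10(load_ohms / (source_ohms + load_ohms))

# Hypothetical source impedances (ohms) against the ESL impedances in the text.
for source in (8.0, 4.0, 0.1):
    mid = level_at_load_db(source, 150.0)
    top = level_at_load_db(source, 1.0)
    print(source, round(mid, 2), round(top, 2), round(top - mid, 2))
```

The last column is the drop from midrange to 20 kHz. It shrinks as the source impedance falls, which is the reason a low impedance source keeps the response flat into a load like this.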
This impedance mismatch problem can be minimized with both types of speakers by using a lower impedance tap on the tube amp's output transformer. For example if you use the 4 ohm tap with 8 ohm speakers, you will probably not encounter any impedance mismatch, so the system would then have linear frequency response.
Using the 4 ohm tap with ESLs will help, although it will still not eliminate all the high frequency impedance mismatch because the speaker's high frequency impedance will fall below 4 ohms. But probably only the top octave or two will be affected, which is hard to hear so the roll off may not be noticed subjectively.
But there is a problem when you use a lower impedance tap -- the drive voltage drops. Or to put it another way, the amplifier's output voltage rises and falls with the impedance of the tap (it is proportional to the square root of the tap impedance).
Understand that the power available from an amplifier is a function of its output voltage. Ohm's Law is very simple and states that, "One volt will drive one amp through one ohm." With this simple concept, you can calculate virtually anything having to do with electronics as you will soon see.
Voltage is the pressure used to push current through an electrical circuit. Current is the flow of electrons in the circuit -- like water flowing through a hose. Current is measured in amperes, commonly called "amps." Power is measured in watts and is the product of volts times amps.
Resistance is measured in ohms. The term "resistance" is used in DC (direct current) circuits. "Impedance" is the same thing as resistance. But it is used when discussing AC (alternating current) circuits because the impedance often varies with the frequency of the AC.
Since power is the product of volts times amps, you can see that you must get current to flow through the speaker's impedance. This requires volts.
For example, if you have an 8 ohm speaker, how many volts must the amplifier produce to push enough current through the speaker to produce 100 watts? How many amps of current will be flowing through the speaker at 100 watts?
There are simple calculations for determining this. The volts can be calculated by taking the square root of the power times the impedance. So for the example above, the watts are 100, multiplied by 8 ohms, gives you 800. The square root of 800 is about 28.3 volts (RMS).
The current can be calculated in several ways, but the most common is to take the square root of the power divided by the impedance. So in this case, the current flow would be about 3.5 amps.
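For readers who like to check the arithmetic, the two calculations above can be written as a few lines of Python. The function names are my own choice for this sketch.

```python
import math

def volts_for_power(watts, ohms):
    """RMS voltage needed to deliver the given power: sqrt(P * R)."""
    return math.sqrt(watts * ohms)

def amps_for_power(watts, ohms):
    """RMS current that flows at the given power: sqrt(P / R)."""
    return math.sqrt(watts / ohms)

v = volts_for_power(100, 8)  # about 28.3 volts RMS
i = amps_for_power(100, 8)   # about 3.5 amps
print(round(v, 1), round(i, 1))

# Cross-check: volts times amps should give back the 100 watts.
print(round(v * i))
```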
If you have a 100 watt amplifier, you can see that its output voltage will be limited to about 28 volts. If it could produce more voltage, it could produce more power, so you know that its voltage will be limited to 28 volts or it would have a higher power rating. Of course, all this assumes that the amplifier's power supply and output impedance are such that it can deliver the 3.5 amps needed to produce 100 watts of power.
You can also calculate that if an amp can produce 28 volts into 4 ohms (half the impedance of the above example), the current would double to 7 amps and the power would double to about 200 watts. Hence you see transistor amps with power ratings listed for both 8 and 4 ohms.
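The doubling of power into half the impedance follows directly from the same formulas. Here is a minimal sketch, assuming an amp that holds 28.28 volts RMS regardless of the load (the function names are illustrative):

```python
def current_amps(volts_rms, ohms):
    """Ohm's Law: current is voltage divided by impedance."""
    return volts_rms / ohms

def power_watts(volts_rms, ohms):
    """Power is voltage squared divided by impedance."""
    return volts_rms ** 2 / ohms

VOLTS = 28.28  # RMS output voltage of a 100 watt (into 8 ohms) amplifier

for load_ohms in (8, 4):
    amps = current_amps(VOLTS, load_ohms)
    watts = power_watts(VOLTS, load_ohms)
    print(f"{load_ohms} ohms: {amps:.2f} amps, {watts:.0f} watts")
```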
Tube amps are different in that if you reduce the impedance of the transformer from 8 ohms to 4 ohms to match the impedance of the speaker, the output voltage will drop as a function of the turns ratio of the transformer, and so will the power.
The turns ratio is the square root of the primary impedance divided by the square root of the secondary impedance. This always works out such that the voltage will drop to the point where the amplifier will put out the same power at either impedance when driving a matching load.
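The effect of changing taps can be worked out numerically. The sketch below assumes a hypothetical 100 watt tube amp whose output transformer has a 4,000 ohm primary. Both figures are assumptions chosen for illustration, not measurements of any particular amplifier.

```python
import math

def turns_ratio(primary_ohms, secondary_ohms):
    """Turns ratio: sqrt(primary impedance) / sqrt(secondary impedance)."""
    return math.sqrt(primary_ohms) / math.sqrt(secondary_ohms)

def tap_voltage(power_watts, tap_ohms):
    """RMS voltage at a tap delivering the given power into a matching load."""
    return math.sqrt(power_watts * tap_ohms)

PRIMARY_OHMS = 4000  # hypothetical primary impedance
POWER_WATTS = 100    # hypothetical amplifier power

for tap in (16, 8, 4):
    ratio = turns_ratio(PRIMARY_OHMS, tap)
    volts = tap_voltage(POWER_WATTS, tap)
    print(f"{tap:>2} ohm tap: turns ratio {ratio:5.1f}, {volts:4.1f} volts, "
          f"{volts ** 2 / tap:.0f} watts")
```

The power is the same at every tap, but each halving of the tap impedance reduces the available voltage by a factor of about 0.71 (one over the square root of two).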
When driving an ESL, voltage is everything. So when you drop the impedance of the output transformer, you reduce the output that the amplifier can produce from the ESL. In short, you have to trade output for more linear frequency response. This is a huge problem. It's a battle that you just can't win.
Note that OTL tube amps don't solve this problem. They have no transformer, so must rely on putting many output tubes in parallel to lower the impedance. This quickly results in having an absurd number of tubes with all their heat and power requirements. So OTL amps do not get down to very low impedances. As a result, they have severe impedance mismatch issues and are really quite a poor choice for driving ESLs.
By comparison, powerful solid state amps typically have output impedances of around 0.02 ohms. They therefore have no trouble driving any speaker impedance with perfectly linear frequency response.
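The effect of output impedance on frequency response can be estimated by treating the amplifier and speaker as a simple voltage divider. The sketch below makes two simplifying assumptions: both impedances are treated as pure resistances, and the 2 ohm figure for a tube amp's output is a hypothetical value chosen for illustration. The 0.02 ohm figure is the one quoted above.

```python
import math

def level_db(load_ohms, source_ohms):
    """Level at the speaker relative to the amp's unloaded output, in dB."""
    return 20.0 * math.log10(load_ohms / (load_ohms + source_ohms))

SOLID_STATE_OHMS = 0.02  # output impedance quoted in the text
TUBE_OHMS = 2.0          # hypothetical tube amp output impedance (assumption)

# Speaker impedances spanning the ESL range described earlier.
for load in (150, 8, 1):
    print(f"{load:>3} ohm load: solid state {level_db(load, SOLID_STATE_OHMS):6.2f} dB, "
          f"tube {level_db(load, TUBE_OHMS):6.2f} dB")
```

With the low source impedance, the level barely changes as the load falls from 150 ohms to 1 ohm. With the higher source impedance, the level drops by several dB at the 1 ohm end, which is the high frequency roll off.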
Solid state amps have high output voltages compared to tube amps. So they will drive ESLs quite loudly before clipping (unless they have protective circuitry that trips them up).
Well designed solid state amps have much lower distortion than tube amps. The best conventional tube amp I've ever measured was a McIntosh 275 with new tubes. It had only 0.3% distortion at an output level of 75 watts/channel (it clipped at 90 w/c). Most tube amps have distortion of somewhat more than 1%, even at levels well below clipping.
My specially designed iTube amp measured only 0.1% distortion in the midrange frequencies and went up to 1% by 20 KHz. It would do so at 150 w/c. But this was a special-built device and is not typical of conventional tube amps.
By comparison, most quality, solid state amps have distortion levels down around 0.002%. This is orders of magnitude better than tube amps. However, it is also true that humans cannot hear the reduced distortion levels in solid state amps, even though a spectrum analyzer will show dramatic differences between them.
Still, distortion is distortion. Why have any more than you must?
I think you can now see why I prefer very high power, solid state amps without any protective circuitry for driving ESLs. This is because they can drive ESLs with linear frequency response, while tube amps roll off the highs.
Solid state amps are much more powerful than tube amps and can supply vastly higher output voltages. As a result, a good solid state amp can drive my ESLs to ear-bleeding levels without clipping. And remember, it is clipping that produces "tube" or "transistor" sound. If a solid state amp does not clip, it does not sound harsh. It sounds just as clear and soft as a tube amp that is not clipping.
Transistor amps can run cool and efficient. My ESL amp runs only warm, yet can deliver the equivalent of about 1000 watts into an electrostatic speaker. No tube amp can do so, and even a relatively low power tube amp will run very hot and waste a lot of expensive electricity.
Tube amps are expensive compared to good solid state amps of similar power. Tube amps require expensive tube replacements, while a quality solid state amp is a no-maintenance, lifetime item.
Tube amps require biasing. Traditionally this had to be done by the audiophile on at least a monthly basis. This was a hassle and rarely was done, so most tube amps were always running far from their ideal performance levels.
Some tube amps tried to get around this biasing issue by using a "self-biasing" system. But this cut their power by about 30%. Some of today's latest tube amps use servo biasing systems, which are great if they work reliably. Often they don't.
Due to their high internal voltages and high temperatures, tube amps are unreliable. They often fail and have to be returned to the manufacturer for expensive repairs.
In short, tube amps can't drive ESLs linearly, cleanly, without clipping, to high output levels. So why put up with all their problems of heat, cost, maintenance, and unreliability when a properly-designed, solid state amp solves all these problems?
I therefore no longer design, manufacture, or recommend tube amps. I only build very powerful solid state amps that have no "transistor sound" because they do not clip or have any protective circuitry to ruin the sound.
Using my ESL amp on the panels of my speakers, and my Magtech amp on the woofers, results in approximately 1,400 watts of power for the speakers. This makes it possible to reproduce something like a grand piano or drum set at live levels in your listening room without clipping. You can reproduce a full symphony orchestra at Row A concert hall levels.
If you have a particularly large room or play your music at ear-bleeding levels, you can use the monoblock versions of my amps. The ESL amp will deliver more than 2,000 watts to the panels and the Magtech will deliver about 1,800 watts to the woofers. Clipping simply isn't an issue and the speaker can take the power.
This performance simply cannot be obtained using conventional tube equipment. So I really have no choice but to use and recommend excellent solid state amps.
The Magtech Regulated Power Supply White Paper
Many audiophiles have asked if the regulator in the Magtech is truly 100% efficient as claimed. The purpose of this paper is to describe how it works so that you can see that it is, in fact, super efficient. It actually does run cold and truly solves the heat problems of conventional regulators that prevent their use in power amplifiers.
So how does the Magtech regulator work? I'll explain, but readers will need to understand the basics in order to appreciate the problems and solutions involved. Since the technical expertise of readers varies, I will cover the basics. I apologize in advance if some of what I am about to say is review for some readers.
First, what exactly is "efficiency" as it applies to a voltage regulator? Efficiency is the amount of energy that you get out of a system compared to the amount of energy put into it. Since energy cannot be destroyed and must be accounted for, any losses in efficiency will be reflected as waste heat somewhere in the system.
Or to put it another way, any heat that is produced by the voltage regulator is a loss in efficiency and results in less power being fed to the electronics than would be the case if the regulator was not present. The exact efficiency percentage can be calculated based on watts in compared to watts out or watts of waste heat produced.
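As a minimal sketch of that calculation, here it is in Python. The 100 watt and 70 watt figures are hypothetical, chosen only to show the arithmetic.

```python
def efficiency_percent(watts_in, watts_out):
    """Efficiency as the percentage of input power that reaches the output."""
    return 100.0 * watts_out / watts_in

def waste_heat_watts(watts_in, watts_out):
    """Whatever does not come out as useful power is given off as heat."""
    return watts_in - watts_out

# Hypothetical regulator: 100 watts in, 70 watts passed on to the electronics.
print(efficiency_percent(100, 70), waste_heat_watts(100, 70))
```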
In the Magtech's voltage regulator, you will not find any waste heat. It will pass virtually all of the watts put into it on to the amplifiers.
To see why, it is necessary to understand exactly how a power supply operates. Only then will it be possible to see how the Magtech's voltage regulator works and how it can be so efficient.
The purpose of a power supply is to produce smooth DC (Direct Current) at specific voltages to drive the downstream electronics. A basic, linear power supply consists of three sections, each having different types of output characteristics.
The first section is the power transformer. This converts the mains voltage to the voltage(s) required by the downstream electronics. The output is AC (Alternating Current) in the form of a sine wave.
A sine wave is a smooth wave form without any harmonic structure with alternating positive and negative polarity. There is one positive and one negative wave per mains cycle (60 Hz in North America, 50 Hz in most of the rest of the world).
The second section is a bridge rectifier. This consists of four diodes. Diodes are electric check valves that flow current in only one direction. These diodes flip the phase (polarity) of alternating waves by 180 degrees so that all the waves are in phase. Therefore the output from the bridge rectifier will be pulsating DC sines at twice the mains frequency.
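The doubling of the pulse frequency can be demonstrated with a short simulation. This sketch models an ideal bridge rectifier as the absolute value of the sine wave and ignores diode voltage drops. The sampling rate is an arbitrary choice for the example.

```python
import math

MAINS_HZ = 60                # North American mains frequency
SAMPLES_PER_SECOND = 48_000  # arbitrary resolution for this sketch

def mains_volts(t):
    """Transformer output: a sine wave, normalized to a peak of 1."""
    return math.sin(2.0 * math.pi * MAINS_HZ * t)

def rectified_volts(t):
    """Ideal bridge rectifier: negative half-waves are flipped positive."""
    return abs(mains_volts(t))

def count_pulses(duration_s=1.0):
    """Count the peaks of the rectified wave over the given duration."""
    n = int(duration_s * SAMPLES_PER_SECOND)
    y = [rectified_volts(i / SAMPLES_PER_SECOND) for i in range(n)]
    return sum(1 for i in range(1, n - 1) if y[i - 1] < y[i] >= y[i + 1])

print(count_pulses())  # pulses in one second of rectified 60 Hz mains
```

Because every negative half-wave becomes a positive pulse, the count comes out at twice the mains frequency.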
The third section is capacitance. This usually takes the form of a bank of large, storage capacitors. A capacitor bank acts like a rechargeable battery in that it can store a lot of electrons and release them when needed.
The practical difference between a rechargeable battery and a bank of capacitors is speed. A capacitor bank can be charged and discharged virtually instantly, which is necessary to meet the sudden large current demands of an amplifier.
The main purpose of the storage capacitors is to smooth out the current flow from pulsating DC to continuous DC. The storage capacitors are often called filter capacitors since they "filter out" the DC pulses and keep them from reaching the downstream electronics. If there were no filter capacitors, a very loud hum would come from the speakers.
The storage capacitors are also needed to help the power transformer deliver enough peak current to reproduce dynamic peaks that require more current than the transformer can deliver. Think of the current that is required to drive the woofer at that moment when a bass drum is struck . . .
The current required by a Class B amplifier is directly proportional to the energy in the music. So at idle (no music), no current is needed or used. Very loud music will require an equally large amount of current to drive the speakers loudly.
It is this huge difference in current that causes the large voltage changes in the rails (the power supply output voltage) you find in most amplifiers. The difference in the rail voltage between idle and full power in most amplifiers is around 30%. This massive voltage drop causes the distortion, the bias, and the output capability of an amplifier to be modulated by the music.
An electronic circuit's distortion can only be optimized at a specific voltage. Any variation of voltage will result in increased distortion.
Class AB amplifiers are a bit more complicated than Class B amplifiers as they require a constant bias current that requires some power. The bias will be optimized at a specific rail voltage. Therefore, the bias will change directly with changes in the rail voltage.
But the biggest issue is that an amplifier's power will fall as its rail voltages fall. So unregulated amplifiers suffer significant performance degradation as the music modulates their power supply voltage.
The rail voltage fluctuations caused by amplifier load are only part of the problem. The mains voltage is not stable either.
The mains voltage will vary depending on the load on the power grid and the load on the house wiring. High load conditions can cause the mains voltage to vary by 10% or more.
For example, compare the electrical load and usage in the middle of the night to early evening on a hot summer day. At night people are sleeping so they are not using electrical equipment and the temperature is cool so air conditioners are not running much.
In the early evening, everybody is home from work, dinner is being cooked, electric washers and clothes dryers are operating, air conditioners are maxed out, people are using power-hungry electronics like big TVs, the lights are on, the water heater is running, etc. So the load on both the grid and home wiring is great.
And when do you listen to your music system? Of course, when power demand is the highest and voltage is the lowest. Murphy is hard at work here.
And there is even more bad news. The amplifier itself can severely tax the capacity of your house wiring to which it is attached. A powerful amplifier can draw all the power that is available from your wall receptacle, which is limited to about 2,400 watts on a 20 amp circuit. This will drop the voltage on that line by several percent -- this is in addition to the losses on the grid and in your home from other power uses.
Furthermore, the mains frequency has a big effect on the output of a power supply. This is because a transformer's power is determined by the current it can deliver in its power pulses multiplied by the frequency of those pulses.
This means that a transformer can deliver about 20% more power when operated on a 60 Hz mains than it can when operated on a 50 Hz mains. Put the other way around, an amplifier with an unregulated power supply will lose about 17% of its power supply power (it falls to 50/60 of its 60 Hz value) when operated on a 50 Hz mains.
All this is further complicated by the fact that the relationship between voltage and power in an amplifier is not linear. Power varies by the square of the voltage.
Power is the product of volts times amps. Ohm's Law says that one volt will drive one amp through one Ohm of resistance. If you do the math, you will come to realize that the power of an amplifier is determined by the voltage that it can drive into the loudspeaker (assuming it can also deliver the current required).
The formula for calculating amplifier power is the amplifier's RMS output voltage squared and then divided by the speaker's impedance. As an aside, impedance and resistance are the same thing. Resistance applies to DC circuits while impedance is used for AC circuits. This is because the impedance often varies with frequency in AC circuits but there is no frequency in DC circuits. For calculations, you may use impedance and resistance the same way.
To determine the voltage, the formula is the square root of the product of watts times Ohms.
Using these formulas, you can see that for an amplifier to drive 100 watts into an 8 Ohm speaker, it will have to produce 28.28 volts and deliver about 3.5 amps. Now what happens if we drop the power supply voltage by half? The voltage will then be 14.14 volts and the current will drop to about 1.77 amps.
How much power will the amplifier now drive into the speakers? It will be just 25 watts. This is a huge loss.
So you can see that the typical 30% loss of rail voltage in an amplifier results in a very large loss of power -- about 50%. If you add an additional loss of mains voltage due to heavy house wire loading, you will lose another big chunk of power.
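Because power varies by the square of the voltage, the loss is easy to tabulate. This is a small sketch of that relationship, and the function name is my own.

```python
def remaining_power_fraction(voltage_drop_fraction):
    """Fraction of full power left after the rail voltage sags.

    Power varies by the square of the voltage, so a sag of 30% (0.30)
    leaves 0.70 squared, or 49%, of the original power.
    """
    return (1.0 - voltage_drop_fraction) ** 2

for drop in (0.10, 0.30, 0.50):
    left = remaining_power_fraction(drop)
    print(f"{drop:.0%} voltage loss: {left:.0%} of power remains")
```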
When you add all the above factors together, you can see that an amplifier's performance is severely degraded by power supply voltage fluctuations and that eliminating them will produce substantially better amplifier performance in terms of power, distortion, and optimum bias levels. So why don't amplifiers have voltage regulated power supplies?
The problem is that the poor efficiency of conventional voltage regulators results in vast amounts of waste heat. Most amplifiers run very hot and adding large amounts of waste heat to an already hot amplifier is intolerable. It is also expensive in terms of both hardware and electricity usage. So it is very rare indeed to find any amplifier that is fully voltage regulated.
So exactly how does a voltage regulator work and what makes it so wasteful and hot that using it is impractical? The most common type of voltage regulator is called a "down" regulator. This means that it pulls down the power supply's voltage so that it remains stable under the worst case conditions.
For example, all quality preamps are voltage regulated so that their power supply voltages will remain stable all the way down to a mains voltage of around 90 volts (using a 120 volt mains). Only if the mains falls below 90 volts ("brown-out" conditions) will the regulation be insufficient and the power supply voltage will start to fall.
The power supply will be driven by a 120 volt mains most of the time, although it might be up to perhaps 125 on occasion. The difference between 120 and 90 volts is about 30%.
Let's assume that the preamplifier's electronics operate on 12 volts. The electronic engineer will design the power supply to deliver at least 30% more voltage than that (typically about 18 volts). He will then add a "down" regulator to pull the power supply voltage down to 12 volts, which is about the voltage that the power supply would produce using a 90 volt mains. So for any mains voltage between about 90 and 125, the preamp's power supply voltage will be stable at 12 volts.
The regulator actually works by placing a variable load across the power supply in the form of a power transistor that is shunted across the output. A power transistor can be thought of as a very fast-acting, variable resistor whose resistance can be changed electronically. By monitoring the rail voltage, the electronics can adjust the resistance of the transistor to alter the voltage.
As the mains voltage rises, the electronics will reduce the resistance of the loading transistor, which will draw more power and drop the power supply voltage. As the mains voltage falls, the electronics will increase the resistance of the load transistor, which will reduce the power used by the transistor and allow the voltage to rise.
Of course, the action of the electronics is nearly instantaneous, so there is no significant rise and fall of the rail voltages with changes in the mains. The voltage will remain rock stable to within a tiny fraction of a percent.
A down regulator is very inefficient. This is because it operates by feeding a voltage through a resistance. This causes a voltage drop by converting some of the power supply's current into waste heat.
Remember the above concept because it is extremely important. To repeat -- anytime you apply voltage across a resistance, there will be a voltage drop. The loss of current causing the voltage drop will result in waste heat.
The circuitry in a preamp uses only a tiny fraction of an amp (typically just a few milliamps). So the power involved will only be a fraction of a watt or so.
If you waste 30% of a watt in a voltage regulator, the heat produced and the electricity wasted is insignificant. So nobody cares about the efficiency of down regulators when used in small-signal devices.
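To put rough numbers on the preamp case, here is a small sketch. It assumes the simplest possible model, in which the regulator drops the excess voltage at the load current, and it uses a hypothetical load of 5 milliamps. The 18 volt and 12 volt figures are the ones from the example above.

```python
def regulator_heat_watts(supply_volts, regulated_volts, load_amps):
    """Heat produced by dropping the excess voltage at the load current."""
    return (supply_volts - regulated_volts) * load_amps

def load_power_watts(regulated_volts, load_amps):
    """Power actually delivered to the preamp's circuitry."""
    return regulated_volts * load_amps

LOAD_AMPS = 0.005  # hypothetical 5 milliamp load (assumption)

heat = regulator_heat_watts(18, 12, LOAD_AMPS)
used = load_power_watts(12, LOAD_AMPS)
print(f"heat: {heat:.3f} watts, delivered: {used:.3f} watts")
```

Both figures are small fractions of a watt, which is why nobody worries about regulator efficiency at these power levels.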
But now let's look at power amplifiers. Just how much power do we need to regulate?
The typical Class AB amplifier is about 50% efficient. Why? Because it applies its power supply voltage to its output transistors and these act as variable resistors that control the voltage being applied to the speaker. So once again, we have the issue of producing waste heat because we applied voltage across a resistance.
This means that for every watt that the amplifier feeds to the speaker, a watt will be injected as heat into its heat sinks, and two watts will be drawn from the mains. A powerful amplifier like the Magtech will produce 500 watts per channel into 8 Ohms. With both channels operating at full power, 1,000 watts will be fed to the speakers. It also means that about 1,000 watts of waste heat will be fed into the heat sinks, and 2,000 watts will be drawn from the mains.
The Magtech's power supply will produce 2,000 watts continuously, so a regulator must be able to control a minimum of 2,000 watts of power (and more to be conservative). The regulator must be able to regulate at least 30% of the rail voltage in order to eliminate fluctuations in voltage due to the variable music demands. In addition, it must be able to handle more than that to account for voltage variations in the mains and 50 Hz operation.
All together, we are looking at regulating about half of the power supply's voltage. This is a daunting task for a down regulator because it means that under worst-case conditions (maximum mains voltage, 60 Hz mains, and with the amp at idle), the regulator will have to dissipate half the power supply's voltage (and hence half its power) as waste heat.
That means that the regulator would produce 1,000 watts of waste heat. This would turn the amp into a room heater and require truly massive heat sinks. It would waste enormous amounts of electricity, be very large, and the heat would cause failures of parts over time. You should now be developing an appreciation of why amplifier power supplies are not regulated!
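The power budget described above can be laid out in a few lines of Python. This is only a sketch of the arithmetic in the text: it assumes 50% amplifier efficiency and uses the approximation that a down regulator dropping a given fraction of the voltage wastes the same fraction of the power.

```python
def amplifier_power_budget(watts_per_channel, channels=2, efficiency=0.5):
    """Return (watts to speakers, watts from mains, watts of amplifier heat)."""
    to_speakers = watts_per_channel * channels
    from_mains = to_speakers / efficiency
    return to_speakers, from_mains, from_mains - to_speakers

def down_regulator_heat_watts(supply_watts, fraction_of_voltage_dropped):
    """Worst-case heat in a down regulator under the text's approximation."""
    return supply_watts * fraction_of_voltage_dropped

speakers, mains, amp_heat = amplifier_power_budget(500)
print(speakers, mains, amp_heat)

# Regulating about half of the supply voltage with a down regulator:
print(down_regulator_heat_watts(mains, 0.5))
```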
Although down regulators are very simple and easy to add to circuits, they are just not practical for use in high power circuits due to their inefficiency. But there are other types of regulators, which are more efficient. These are the "up" regulators.
An up regulator requires two power supplies. These have different voltages where one is set for the worst case voltage and the other is set for the best case voltage (say 120 volts and 90 volts for example).
The two power supplies are connected together by a power transistor whose resistance can be varied to allow more or less of the high voltage power supply to be added to the low voltage one. This allows the high voltage supply to bring the voltage "up" and prevent it from falling based on load or mains voltage. The rail voltage can therefore be kept constant between the two extremes by electronically controlling the coupling transistor.
The big advantage of an up regulator is that it only has to handle a percentage of the total power supply voltage (in this example, 30%) instead of all of it. Therefore the losses and waste heat are only a fraction of those produced by a down regulator. But it requires two power supplies, is more complex, and more expensive than a down regulator.
An up regulator still wastes far too much power and produces too much waste heat. So it is still impractical for use in all but very low-power amplifiers.
The next general type of regulator is not a linear regulator like the types I have been describing. It is the switching regulator.
A switching regulator is rather complex, but I'll simplify its operation for clarity. A switcher fundamentally places a transistor in series with the output from the power supply. This transistor is then switched on and off at a high frequency to feed power to the electronics.
The transistor oscillates at a fixed frequency and its "on" time is varied so that it feeds a percentage of the power supply's current to the electronics based on their need. By feeding a capacitor bank, a switcher can adjust the current flow to produce a stable voltage.
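The idea of varying the "on" time at a fixed frequency can be sketched as follows. The relationship used here (average output equals the supply voltage times the fraction of time the transistor is on) is the textbook idealization for a simple step-down switcher and is my assumption, not a description of any particular product. The 100 KHz switching frequency and 100 volt supply are hypothetical.

```python
def duty_cycle(on_time_s, switching_hz):
    """Fraction of each switching period during which the transistor is on."""
    period_s = 1.0 / switching_hz
    return on_time_s / period_s

def average_output_volts(supply_volts, duty):
    """Idealized step-down switcher: output is the supply scaled by the duty cycle."""
    return supply_volts * duty

SWITCHING_HZ = 100_000  # hypothetical switching frequency
SUPPLY_VOLTS = 100.0    # hypothetical unregulated supply voltage

for on_time_us in (2, 6, 9):
    d = duty_cycle(on_time_us * 1e-6, SWITCHING_HZ)
    print(f"on for {on_time_us} us: duty {d:.0%}, "
          f"about {average_output_volts(SUPPLY_VOLTS, d):.0f} volts")
```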
Switching power supplies are very efficient (although not 100%) because their transistors are used in only the on or off state. They are not partially turned on like the transistors in linear supplies where a significant resistance is presented to the voltage that produces waste heat.
However, a transistor does not change state instantly. There is still a small percentage of switching time during which the transistors are changing state and resistance is present. So they still produce some waste heat, although this is relatively small, can be tolerated, and therefore switching power supplies can successfully be used in power amplifiers.
But there are big problems when using switching power supplies in high power applications. The main one is noise -- both electrical and mechanical. When switching high power and voltages at high frequencies, radio frequencies are produced. These emissions can adversely affect associated audio electronics and cause instability, oscillation, noise, and general misbehavior.
Powerful switchers also make mechanical noise because there is physical vibration of the switching transistors due to the high currents involved. Switching power supplies are vastly more complex than a simple, 3-part, linear supply and therefore the reliability of switching supplies can be a problem.
There are also many technical problems when designing switching power supplies that make them quite difficult to make work satisfactorily. I won't get into any more detail about this, but rather simply point out that because of all the problems, it is extremely rare to find a linear amplifier with a switching power supply. They do exist, but are not 100% efficient and are not a practical solution for the voltage regulator problem in power amplifiers.
Of course, I have just outlined the basics at this point. There are many variations on the theme that are beyond the scope of this paper. But you should now have enough information to appreciate the solutions that follow.
So how can the efficiency problem of high power, regulated power supplies be solved? Well, the answer came from thinking outside the box. Specifically, since the heat is produced by applying a voltage to a resistance, the solution had to come from figuring out some way to eliminate doing so.
There is a way. But it could not be done by regulating the continuous DC from the output of a power supply's capacitors because voltage is always present there. The solution had to be done by figuring out a way of regulating without having voltage present. That sounds crazy and impossible, but it can be done.
What about the output from the rectifiers? This is pulsating DC. While the peak of each pulse is at high voltage and power, the voltage at the end and beginning of each pulse is at -- ZERO!
If the regulating power transistor operated only when the voltage was at zero, then there would be no current present, none would be wasted, and no waste heat would be generated. But how can that regulate the voltage? Here's how:
The Magtech uses two power supplies as you would in an up regulator. I call the low voltage one the "ride" supply. It is exactly like the power supply in a conventional, unregulated amplifier.
The second power supply is the "boost" power supply. It has a higher voltage and current rating than the ride supply and can add massively more power to the ride supply when needed.
The ride supply voltage is set for "easy" operation under optimum conditions, i.e., when the mains voltage is at maximum and the amp is at idle. Under these conditions, only the ride supply drives the amplifier circuitry and the boost supply is just on standby.
Note that for the Magtech amp, this is the "easy" condition when the regulator does nothing. By comparison, this is the toughest condition for a down regulator because it has to drag down the power to the worst case level and dissipate massive amounts of power and heat when doing so.
But in the Magtech, this is the voltage that is desired and that the regulator will maintain. Under these easy conditions, the boost supply is not needed.
When significant power is required, the rail voltages will start to fall. This is detected by the power supply's monitoring circuitry, which then switches on the coupling transistors to connect the boost supply to the ride supply. The additional power provided by the boost supply prevents the rail voltage from falling, thereby regulating it.
In addition, digital control circuitry is used to monitor the rectifiers' wave form and cause the transistors to switch states (either on or off) at the exact point where the DC pulses cross the zero voltage point. This is important because even though transistors change states very quickly, they do not do so instantaneously. So there is some resistance during the change of state. This is the same problem that causes switching power supplies to be less than perfectly efficient.
If the transistors changed state while the power supply voltage was applied to them, there would be waste heat generated. By only allowing state changes at the zero voltage points, there is no waste heat.
Now if you are observant and thoughtful, you might comment that this does not sound like a very good regulation scheme because the two power supplies are either at maximum voltage or minimum voltage because the regulator operates as an all-or-nothing affair. Your thinking is good, but you are overlooking an important feature in the Magtech's power supply.
The digital control circuitry constantly monitors the pulsating waves from the rectifiers and the rail voltages. It will then make a decision to turn the coupling transistors on or off at each zero point to add as many or few pulses as required to hold the voltage constant.
Under heavy load, the coupling transistors would remain on (possibly even continually), letting most or all of the pulses through. Under light load, they would only be switched on occasionally to let a few pulses through.
While it is true that the regulator has a maximum resolution of 120 pulses per second, each pulse has to charge up a very large bank of capacitors (80,000 uF). Doing so takes time and much current. Therefore, even though each pulse has a lot of current and energy, it can only make a very small change in the capacitor bank's voltage.
The electricity and voltage in the capacitors are analogous to the water in a swimming pool. You can dump a large, 55 gallon drum of water into the pool (a pulse from the boost power supply), but it won't change the level of the water in the pool (the voltage in the capacitors) very much.
By adding more or fewer pulses as needed, the regulator can maintain a stable voltage to within 0.2 volts. By comparison, without the regulator, the power supply's voltage would vary by more than 50 volts. Which would you prefer?
You can now see why the Magtech regulator produces no heat and is virtually 100% efficient. Technically, I can't claim that the regulator is absolutely 100% efficient because nothing is perfect and there is a very tiny amount of resistance in everything, including the coupling transistors when they are "on."
But the resistance of the transistors is less than one Ohm, so they do not even get warm when operating. Furthermore, they only operate when the amplifier is working fairly hard, so the regulator isn't even active when the amplifier is at idle or at very low power.
In some ways, the Magtech's power supply is like a switcher in that its transistors are either on or off. But there is no specific oscillation involved as in a switcher. Also, it is relatively simple, operates only occasionally, and runs at low frequencies, so its reliability is outstanding (no failures have ever occurred). And because it never switches under power, there are no noise or radio frequency problems with it.
In short, the Magtech's power supply is unique and solves all the problems of other regulators that have prevented power amplifiers from being regulated -- something they need even more badly than other types of electronics. The Magtech regulator's circuit, and particularly the digital control technology involved, is the subject of a patent that is currently pending.
The Magtech amplifier modules are the same sophisticated ones used in the ESL amp. They are capable of very high power, can drive 1/3 Ohm loads, can handle the most difficult loads (as presented by electrostatic speakers), and need none of the protective circuitry that ruins the sound of many solid state amps.
When the ESL amplifier modules are combined with a practical voltage regulator, the result is an amplifier with seemingly unlimited power, virtually unmeasurable distortion, and the ability to drive even the most difficult loudspeakers with ease. The Magtech offers a truly new level of performance in amplifiers.