Information: Part 6 - Loudspeakers

Information theory can also be applied to loudspeakers, which are among the most difficult of transducers to design. Measuring the information capacity of loudspeakers is a useful tool.


Except for special cases, the loudspeaker is intended to excite the human auditory system (HAS). If we assume an audio signal that is essentially blameless, we can concentrate on the rest of the information path that must include not only the loudspeaker but also the HAS. Given that the information capacity of the HAS is finite, there is little point in having loudspeakers that exceed the capacity of the HAS. In practice, primarily for economic reasons, loudspeakers usually fail to reach the information capacity of the HAS, often by a depressing amount.

Fig.1 shows the effect of raising the quality of an audio system on the listener. Initially, as the quality increases, so does the subjective effect. But there must come a point where the shortcomings of the HAS exceed the shortcomings of the audio system, and then the subjective quality must level off. The level must be subject to some variation, as some listeners are more critical than others.

In some cases the audio system is merely acting as a reminder or a souvenir of the original sound. The quality can be remarkably low but the reminding function still works. A die-cast miniature Eiffel Tower does not represent the original very well, yet it still serves to remind the owner of that trip to Paris.

Fig.1 - As the information content of reproduced sound increases, initially the improvement is heard, but when the information capacity of the human auditory system is reached, further improvement is pointless as the auditory sensation levels off.

It is only possible to make progress with any endeavor if the results can be measured and compared with some goal. In recent years the amount of knowledge regarding the operation of the HAS has burgeoned and the diligent reader can readily establish the criteria which a good loudspeaker should meet.

Unfortunately, the effect of this increased psychoacoustic knowledge on loudspeaker design has been minimal. Design techniques and architectures that are now known to be deficient continue in service for economic reasons or out of pure inertia. Such knowledge also appears not to have reached the average hi-fi enthusiast, who can still be gulled into parting with huge sums in return for very little. The sale of overpriced audio equipment appears to have taken the place of the potions sold by quacks in a travelling circus.

In loudspeakers, the topic of frequency response is still considered to be about the only parameter of importance, when there are other, equally important, characteristics that are largely ignored. The equivalent in television is the obsession with pixel count.

The Cinderella subjects in loudspeakers are the time or phase response and the directivity. Most modern loudspeakers have acceptable frequency response, yet different designs having the same frequency response can sound completely different. If things measure the same but act differently, the only possible conclusion is that the measurement is inadequate.

Most loudspeakers retain primitive crossover architectures that divide the incoming audio spectrum into two or more parts with no real hope of joining them together in any way that retains the original waveform. Such techniques must lose information.

In a corollary of Fig.1, consider Fig.2, which is the result of the subjective test of a loudspeaker in which the information content of the test signal is reduced from some arbitrarily high level. At first, the listener hears no reduction in quality, because the limiting factor is the loudspeaker, but eventually the limiting factor becomes the input signal. The point at which that happens is the point where the degradation due to the loudspeaker and the degradation due to the information reduction are the same. In other words the information capacity of the loudspeaker has been measured.

Although there are other possibilities, one way in which a test signal of variable information capacity can be created is to use an audio codec intended for bit rate reduction. The compression factor imposed on the codec defines the quality.
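As a rough sketch of how the result of such a test might be reduced to a single figure, the Python fragment below takes listening scores obtained at a series of codec bit rates and reports the lowest bit rate at which the score has not yet fallen away from its plateau. The function name, the tolerance and the scores are all hypothetical, invented purely for illustration, and the figure obtained is only meaningful relative to the particular codec used.

```python
def capacity_knee(results, tolerance=0.5):
    """Return the lowest bit rate (kbit/s) whose score is still on the plateau.

    results:   list of (bit_rate_kbps, mean_score) pairs, in any order.
    tolerance: assumed value; how far a score may fall below the plateau
               before the drop is counted as audible.
    """
    ordered = sorted(results, reverse=True)  # highest bit rate first
    plateau = ordered[0][1]                  # score with the least compression
    knee = ordered[0][0]
    for bit_rate, score in ordered:
        if plateau - score > tolerance:
            break                            # the signal is now the limit
        knee = bit_rate                      # the loudspeaker is still the limit
    return knee


# Invented illustration data, not measurements.
scores = [(320, 4.6), (256, 4.6), (192, 4.5), (128, 4.4), (96, 3.6), (64, 2.7)]
knee_kbps = capacity_knee(scores)
print(knee_kbps)
```

With these made-up scores, the listener first notices a loss between 128 and 96 kbit/s. A loudspeaker with greater information capacity would be expected to move that knee to a higher bit rate on the same codec.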

Fig.2 - When the signal source quality is steadily reduced from a high standard, at some point the sound quality from the loudspeakers will be heard to deteriorate. The information capacity of the speakers has been measured.

Fig.3 shows an audio codec in series with a loudspeaker. The listener can only assess the overall result. Is the speaker testing the codec, or is the codec testing the loudspeaker? It's a good question, and there is no doubt that inferior codecs and/or inadequate bit rates have entered service because they were auditioned on mediocre legacy loudspeakers that masked the compression artifacts with their own shortcomings.

The difficulty that all audio engineers have is that the signals they handle cannot be heard without loudspeakers. If the information capacity of the loudspeakers is limited, the ability to assess the quality of signals is impaired. Poor monitoring loudspeakers place a bound on the quality that can be achieved, because defects that cannot be heard cannot be rectified.

Another difficulty is that people grow accustomed to the loudspeakers they normally use and tend to treat them as correct, so that a more accurate loudspeaker will then be judged different and by implication incorrect.

Fig.3 - An audio codec in series with a loudspeaker. Both are capable of reducing the information in the signal. Which one is dominant? Is the speaker testing the codec or vice versa?

Like video, stereophonic sound is intended to convey an image to the listener. Unlike video, there are no agreed ways of testing the quality of that image and as a result progress in that area has been faltering. The two loudspeakers concerned need to act as point sources. If they do not, and instead act as sources of finite size, the sound image will be smeared just as television pictures are smeared by motion.

A large number of legacy loudspeakers are made, for simplicity, from six pieces of wood. Inevitably this prismatic construction leads to sharp corners, at which the acoustic impedance changes. As with transmission lines, impedance changes cause reflections and the result is that significant amounts of sound are radiated from the corners of the enclosure. The direct sound and the radiated sound interfere with one another and serve to put ripples in the directivity function, which are a clue that the speaker is not a point source and will suffer image smear.
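The mechanism can be illustrated with a deliberately simplified model in which the sound re-radiated from a cabinet corner is treated as a single delayed and attenuated copy of the direct sound. Real diffraction is more complicated than this, and the path difference and relative level used below are assumed values rather than measurements.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def ripple_db(freq_hz, extra_path_m, relative_level):
    """Level of direct sound plus one delayed copy, in dB re the direct sound.

    extra_path_m:   assumed extra distance travelled via the cabinet corner.
    relative_level: assumed amplitude of the re-radiated sound (direct = 1).
    """
    delay_s = extra_path_m / SPEED_OF_SOUND
    total = 1 + relative_level * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return 20 * math.log10(abs(total))


# Assumed example: 0.1715 m of extra path (0.5 ms) at 30 percent amplitude.
for freq in (500, 1000, 1500, 2000):
    print(freq, round(ripple_db(freq, 0.1715, 0.3), 2))
```

In this example the two contributions cancel most strongly at 1 kHz and reinforce at 2 kHz. Because the extra path length depends on the direction from which the enclosure is heard, the peaks and dips fall at different frequencies in different directions, which is one way of picturing the ripples in the directivity function.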

One of the earliest consequences of poor loudspeaker imaging was the adoption of spaced microphone techniques for stereo. The reason was that they were not covered by Blumlein's patents and so could be used without payment of royalties. If the imaging of the associated loudspeakers was poor enough, the technique did not seem much inferior to the coincident microphone.

Real live sound consists of direct sound, reflections and reverberation, also called ambience or air. In a stereophonic image, the location of sound sources can be established, along with reflections. Reverberation fills in the spaces. As an aside, if the sound sources are positioned using pan pots, there will be no reverberation and it will have to be added artificially.

Fig.4a) shows an original sound stage, whereas Fig.4b) shows the sound stage from a legacy loudspeaker that suffers image smear. The size of the sound sources and reflections has been increased, so that the amount of reverberation that can be heard is reduced.

Fig.4 - At a) an accurate sound stage shows sound sources having their true size, reflections and reverberation. At b) a badly reproduced sound stage suffering from image smear tends to lack reverberation because the sound sources have enlarged and caused it to be masked.

By coincidence, audio codecs also tend to reduce reverberation on the grounds that it is masked by the dominant signal. In mono that is true, but in stereo it is not. This is hardly surprising, since information theory predicts it. A mono audio signal does not contain a sound stage image and so it cannot be degraded. A stereophonic audio signal contains more information than two monophonic channels, since it also carries an image. The two channels of a stereo system must have higher resolution than mono channels because some aspects of the image are carried in very small differences between the two channels. If the channels are not accurate, the differences between them are impaired.
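The point about small inter-channel differences can be shown with a toy calculation. In the sketch below, coarse quantizing stands in for any inaccuracy in the channels. It is not a model of how a real codec works, and the sample values and step sizes are assumptions chosen only to make the effect visible.

```python
def quantize(value, step):
    """Round a sample to the nearest multiple of step."""
    return round(value / step) * step


def channel_difference(left, right, step):
    """Difference between the two channels after each has been quantized."""
    return quantize(left, step) - quantize(right, step)


# Assumed sample values: two channels that differ by only 0.002.
left, right = 0.500, 0.498

coarse = channel_difference(left, right, 0.01)    # low-resolution channels
fine = channel_difference(left, right, 0.0005)    # high-resolution channels
print(coarse, fine)
```

With the coarse step both samples land on the same level and the difference vanishes, even though each channel taken alone is in error by no more than 0.002. With the fine step the difference survives.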

If Fig.4 is considered in that context, the original sound stage at a) has space for the reverberation and it would be obvious if the reverberation were missing or inadequate. However, the sound stage at b) conceals the reverberation so it is harder to tell if it is reduced or missing. It follows that compression codecs cannot be properly assessed on mediocre loudspeakers. Nevertheless, that is exactly what has happened on a number of occasions. Systems were put into service that sounded fine on the loudspeakers used for testing. Unfortunately it was not realized that the codec was testing the speakers, and when the systems were heard on better monitoring the problems were audible.

A lot of early audio compression codecs were simply not very good, but possibly the most dangerous feature of audio codecs is that there is complete freedom in the choice of output bit rate and economic pressure to reduce it. The combination of a crude codec and an inadequate bit rate led to depressing results. A striking example was the UK implementation of digital audio broadcasting (DAB), which was claimed by the advertising to offer CD quality. Those claims had to be withdrawn when the digital channels were found to be obviously inferior to the same program carried on FM.
