Microphones: Part 9 - The Science Of Stereo Capture & Reproduction

Here we look at the science of using a matched pair of microphones positioned as a coincident pair to capture stereo sound images.


This series of articles was originally published in 2021. It was very well read at the time and has continued to draw visitors, so we are re-publishing it for those who may have missed it first time around.

There are 11 articles in the series.


As the reproduction of a point image from a pair of loudspeakers requires the left and right signals to have the same waveform, but to differ in amplitude as a function of direction, it follows that the two microphones needed must be in the same place. The technical term used is coincident.

Such signals also have reasonably good mono compatibility. When the left and right signals are added together, the fact that the two waveforms due to a given source differ only in amplitude means that they add coherently. Mono compatibility has traditionally been significant to broadcasters who have no control over the listener's equipment.
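The coherent-addition point can be sketched numerically. The gain values below are arbitrary illustrations, not taken from the article; the key property is that both channels carry the same waveform:

```python
import numpy as np

# A source panned part-way left: with a coincident pair, both channels
# carry the SAME waveform and differ only in amplitude.
t = np.linspace(0, 1, 1000, endpoint=False)
waveform = np.sin(2 * np.pi * 5 * t)

left = 0.8 * waveform   # illustrative gains, not from the article
right = 0.4 * waveform

# Mono fold-down: the channels add coherently because the waveforms match.
mono = left + right

# The mono signal is just the original waveform scaled by the gain sum;
# nothing cancels, so level is preserved (good mono compatibility).
assert np.allclose(mono, (0.8 + 0.4) * waveform)

# Contrast with anti-phase channels, which cancel completely in mono.
assert np.allclose(left + (-left), 0.0)
```

The same arithmetic shows why anti-phase components are lost in a mono sum, as the article notes later.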

It is important to appreciate that when a stereo audio signal is converted to mono by adding the channels, all of the directional information is lost and all sounds in the mono signal will be reproduced from the same place. This means that the listener can no longer use attentional selectivity (also known as the cocktail party effect) to differentiate between the sound sources. A rich stereophonic image that sounds great may become incomprehensible in mono.

Fig.1 - The classic crossed figure-of-eight or Blumlein microphone configuration, which produces a pair of signals that will directly deliver a stereophonic image between a pair of loudspeakers.

A smaller effect is that conversion to mono also makes the sound drier as the addition of the channels effectively narrows the acceptance angle as well as cancelling out any anti-phase components.

If the microphones are required to be coincident, then the only way to obtain different signals is if they are a) directional and b) pointing in different directions.

Fig.1 shows one possible arrangement, in which a pair of figure-of-eight microphones is mounted at 90 degrees to one another. This configuration is often called a Blumlein pair after Alan Blumlein.

It is important to grasp how the sound received by the microphones is mapped into the sound image presented to the listener, and Fig.1 shows what happens. Sound sources can only be reproduced between the speakers. A central sound source is equally off axis to both microphones, so the left and right signals will be identical and a virtual sound source will appear halfway between the speakers.

If the sound source moves to the left, it will be closer to the axis of the left microphone and further from the axis of the right microphone. The left signal will be bigger than the right signal and the virtual sound source will move to the left. This mechanism takes place over the whole of the front acceptance angle.

One of the fundamental requirements of audio systems is that they should be linear. In mono, this means that any number of different sounds can be conveyed simultaneously. In stereo it means that an unlimited number of different sound sources in different places can be reproduced simultaneously.

When the original sound source is at 45 degrees, one of the microphones will produce no output (in anechoic conditions at least) and all the sound due to that source will be emitted by one speaker. The edge of the acceptance angle has been reached.

If the sound source is more than 45 degrees off axis, the two microphones will work in the same way as for a forward source, except that the left and right signals will be out of phase. Fig. 2 shows that the sounds from the anti-phase regions are mapped on to the frontal display between the speakers.

Fig.2 - The sound captured by the crossed figure-of-eight microphone is mapped onto the stereo image between the speakers as shown here. The circle surrounding the microphone shown in a) is folded along the A and B axes. The four quadrants of the circle are superimposed in the image between the speakers as b) shows.

The fact that the sound in these regions is anti-phase appears catastrophic, but things are not as bad as they seem. Polarity only has meaning in the context of sustained sine waves, and these are rare in audio. Real sound consists of events that occur at specific times and a phase reversal does not change the time at which an event takes place in the slightest.

All that is necessary is to arrange things so that no important sound source, such as an instrument or a person, is in the anti-phase region. That region will then capture only ambience and reverberation, and the fact that it is out of phase is practically impossible to hear, although an audio vectorscope would reveal it as a widening of the trace.

The rear region of the microphone is once more in-phase but the image is mirrored as Fig.2 shows. Rear-left sounds are mapped to the front right. In a typical concert hall, rear sound predominantly consists of reverberation and it is simply not possible to tell if it has been mirrored.
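The whole mapping described above can be sketched numerically. An ideal figure-of-eight capsule's gain follows the cosine of the angle off its own axis, so the Blumlein behaviour falls out directly; the angle convention below (source angle from centre-front, positive to the left, capsules at ±45 degrees) is an assumption for illustration:

```python
import numpy as np

# Blumlein pair: two figure-of-eight microphones at +/-45 degrees, each
# with the cosine response of an ideal pressure-gradient capsule.
def blumlein_gains(theta_deg):
    theta = np.radians(theta_deg)
    left = np.cos(theta - np.radians(45))   # left mic axis at +45 degrees
    right = np.cos(theta + np.radians(45))  # right mic axis at -45 degrees
    return left, right

# Centre-front source: equal gains, image halfway between the speakers.
l, r = blumlein_gains(0)
assert np.isclose(l, r)

# Source moved left: left gain exceeds right, image moves left.
l, r = blumlein_gains(30)
assert l > r > 0

# 45 degrees left: the right mic is side-on and gives no output -
# the edge of the front acceptance angle.
l, r = blumlein_gains(45)
assert np.isclose(l, 1) and np.isclose(r, 0)

# 90 degrees (side): the channels are anti-phase (opposite signs).
l, r = blumlein_gains(90)
assert l > 0 and r < 0

# 135 degrees (rear-left): both mics respond via their rear lobes; the
# larger magnitude is now in the RIGHT channel, so the image is mirrored.
l, r = blumlein_gains(135)
assert abs(r) > abs(l)
```

Each assertion corresponds to one of the cases in the text: centre, panned, edge of acceptance, anti-phase quadrant, and mirrored rear.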

The important factor for stereo imaging is that the microphones should be coincident. That means it is possible to use a wide variety of polar diagrams provided the left and right microphones are the same. The optimal angle between the microphones is dependent on the directivity. Fig. 3 shows the crossed cardioid configuration. The broad front lobe of the cardioid results in the microphone having a wider front acceptance angle.

The gentle curvature of the cardioid directivity pattern means that there is some latitude available in selecting the angle between the microphones. 

The anti-phase regions subtend about the same angle as in the crossed-8, whereas the rear region is physically much smaller as well as being reproduced at lower level.

In mono, the cardioid microphone produces a somewhat drier sound than the omni or figure-8, whereas when used as a coincident pair the wide acceptance angle of crossed cardioids somewhat compensates for that characteristic.
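The wider acceptance angle of crossed cardioids can be shown with the same kind of sketch. An ideal cardioid's gain is 0.5 × (1 + cos of the angle off its own axis); the 90-degree included angle (capsules at ±45 degrees) used below is a common choice but is our assumption, as the article does not fix one:

```python
import numpy as np

# Crossed cardioids at an assumed +/-45 degree mounting.
def cardioid_gains(theta_deg, mic_axis_deg=45.0):
    theta = np.radians(theta_deg)
    axis = np.radians(mic_axis_deg)
    left = 0.5 * (1 + np.cos(theta - axis))
    right = 0.5 * (1 + np.cos(theta + axis))
    return left, right

# Centre-front source: equal, healthy gains in both channels.
l, r = cardioid_gains(0)
assert np.isclose(l, r) and l > 0.8

# The right channel only reaches its null at 135 degrees left, so the
# front acceptance angle is far wider than the crossed figure-of-eight's
# 45 degrees.
l, r = cardioid_gains(135)
assert np.isclose(r, 0, atol=1e-12)

# A source directly behind is picked up at a much lower level than one
# in front: the rear region is reproduced at reduced level.
front = sum(cardioid_gains(0))
rear = sum(cardioid_gains(180))
assert rear < 0.3 * front
```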

Fig.3 - The crossed cardioid configuration simultaneously broadens the forward acceptance angle and narrows the rear region.

Directional microphones such as hypercardioids need to be mounted with a smaller included angle to prevent central sound sources appearing to fall in level. Such configurations need to be used with care as they have relatively small acceptance angles and if brought too close to a sound source will exaggerate any movement of the sound image. Under difficult conditions it may be better to resort to a mono directional microphone steered with a pan pot.
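A pan pot of the kind mentioned can be sketched as follows. The constant-power sine/cosine law is a common choice for positioning a mono source in a stereo image, but it is our assumption here, as the article does not specify a particular law:

```python
import math

# Constant-power pan pot: steers a mono signal between two loudspeakers
# while keeping the summed power of the two channels constant.
def pan(position):
    """position: 0.0 = hard left, 0.5 = centre, 1.0 = hard right."""
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)  # (left gain, right gain)

# Centre: equal gains of 1/sqrt(2), so the summed power stays at unity.
l, r = pan(0.5)
assert math.isclose(l, r)
assert math.isclose(l * l + r * r, 1.0)

# Hard left: all signal to the left loudspeaker.
l, r = pan(0.0)
assert math.isclose(l, 1.0) and math.isclose(r, 0.0)
```

Because the two channel gains scale one identical waveform, the result sums to mono coherently, just like a true coincident pair.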

The directional characteristics of the microphones that are used in stereo alter the mapping between the original sound and the stereo signals. However, the type of microphone in use does not alter the fact that all of the direct sound reaching the listener in the virtual sound image must arrive within the angle subtended by the speakers. Sound reaching the listener from other directions can only be due to reflections in the listening space.

Such reflections appear to have a split personality: some people claim that they disturb the main image and need to be absorbed, whereas others claim that reflections make the sound more natural. In fact both can be right, depending on the circumstances.

As is well known, the imaging information in sound is carried in the time domain, and the human auditory system (HAS) works in the time domain to deal with it. A typical sound source such as a musical instrument emits direct sound to the listener and off-axis sound that results in a reflection. On account of the greater path length, the reflected sound arrives later.

The Haas effect recognizes that the later sound strongly resembles the direct sound, identifies it as a reflection, and discounts it as a source of direction. The first version of the sound determines the perceived direction and the reflection does not. By compensating for the delay, the HAS can add the direct and reflected sounds so that they can better be heard.
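A toy model makes the delay-compensation idea concrete: treat the reflection as a delayed, attenuated copy of the direct sound, estimate the delay, and realign before adding. The cross-correlation method below is our illustration of the principle, not a claim about the ear's actual mechanism:

```python
import numpy as np

# Direct sound modelled as a random waveform; the reflection is the same
# waveform delayed by the extra path length and attenuated.
rng = np.random.default_rng(0)
direct = rng.standard_normal(1000)

delay = 37  # illustrative delay in samples
reflection = np.concatenate([np.zeros(delay), 0.6 * direct])[:1000]

# Estimate the delay by cross-correlating the reflection against the
# direct sound: the correlation peaks at the true lag.
corr = np.correlate(reflection, direct, mode="full")
lag = np.argmax(corr) - (len(direct) - 1)
assert lag == delay

# Shift the reflection back by the estimated lag and add: the two
# versions now reinforce instead of smearing in time.
aligned = np.concatenate([reflection[lag:], np.zeros(lag)])
combined = direct + aligned
assert np.allclose(combined[: 1000 - delay], 1.6 * direct[: 1000 - delay])
```

The final assertion shows the aligned sum is simply a louder copy of the direct sound, which is the benefit the text attributes to the HAS.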

If a loudspeaker replaces the real sound source, the same thing should happen, but in most cases it doesn't. The reason is that most legacy loudspeakers are designed so that the sound they emit is only accurate over a very small angle at the front. The reflected sound may then be so poor in quality that the HAS does not recognize it as a reflection but instead believes it to be a different source. Then, of course, the true stereo image will be disturbed by false images.

That is why a lot of stereo monitoring is performed in near-anechoic conditions, so that the poor-quality off-axis sound will be absorbed. Such monitoring is actually an admission that the loudspeakers are poor.

On the other hand, if the loudspeakers are designed to more modern criteria, which hold that the quality of the off-axis sound should be the same as that of the on-axis sound, then the Haas effect recognizes the reflections for what they are and there is no damage to the sound image. Anechoic conditions are then neither necessary nor desirable, and the cost of acoustic treatment is reduced.
