Audio Levels - Part 4
There are two basic reasons to know the level of an audio signal: one is more technical and the other more subjective.
Considering first the technical reason, knowledge of the signal level helps to make sure the sound is not distorted by excessive level or prone to noise due to inadequate level. In radio, excess level can also cause interference. The more subjective reason that is of interest to broadcasters is to know how loud program material appears to the listener.
Obviously the second of these requirements cannot be objective, but in fact the first one can’t either. That is because the audibility of problems due to high levels is not a simple matter. And if it’s not audible, it’s not a problem. It follows that no level meter can ignore the characteristics of the human auditory system (HAS), or to put it another way, the ones that do ignore it may not be a lot of use.
It will be necessary to take a small digression into human hearing before level metering makes sense. Whilst we may use sine waves for lining up audio systems or checking frequency response, it is important to understand that a sine wave is used because it is simple to generate. It is, however, totally unrepresentative of an audio signal, and to make any progress in audio quality it is necessary to consider what waveforms actually exist in audio and why. A sine wave has constant amplitude, whereas real audio is characterized by constantly varying amplitude and a high peak-to-mean ratio.
Hearing in all forms of life is a survival mechanism: it allows threats to be recognised, located and avoided, and it allows sources of food to be found. These mechanisms all operate in the time domain. Everyday sounds such as footfalls, standing on a twig and breaking it, or knocking on a door are impulsive and cannot be placed on a musical scale. Being acoustic events, however, they can be placed on a time scale.
With two ears, spaced apart, living things can determine direction in sound because the time of arrival of events will be different at each ear. Events contain information, whereas sine waves do not. A pure sine wave has no bandwidth and carries no information. Play a line-up tone through loudspeakers and there is no clue where it is coming from because the sound jumps to the nearest standing wave mode of the room. All cycles of a sine wave look the same so there is no unambiguous time of arrival difference.
The ear when working in the time domain can also estimate the size of a sound source from the time constants of the events. Record the sound of a howitzer firing and speed it up and it sounds like a handgun.
Much later, the HAS evolved the ability to discriminate pitch as well as time, but it can only do one or the other, not both at the same time. It switches mode.
Consider harmonic distortion, which takes place in the frequency domain. Non-linearity, perhaps due to clipping, causes new frequencies to appear in the spectrum that are multiples of the fundamental. In order to hear harmonic distortion there has to be a fundamental, and an acoustic event may not have one; a single transient doesn't have a pitch. Spectrum analysis of a transient shows all frequencies to be present, so any new ones due to distortion will be masked.
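To make that concrete, here is a small sketch (the sample rate, test frequency and clipping level are arbitrary illustrative choices, not values from the text) that hard-clips a sine wave and inspects the resulting spectrum: the new components land only at multiples of the fundamental.

```python
import numpy as np

fs = 48000                               # sample rate, Hz
f0 = 1000                                # fundamental, Hz (fs/f0 is an integer, so FFT bins line up)
t = np.arange(fs) / fs                   # one second of signal
clean = np.sin(2 * np.pi * f0 * t)
clipped = np.clip(clean, -0.5, 0.5)      # symmetric hard clipping: a gross non-linearity

spectrum = np.abs(np.fft.rfft(clipped)) / len(clipped)
freqs = np.fft.rfftfreq(len(clipped), 1 / fs)

# The strongest components sit at odd multiples of f0 (1 kHz, 3 kHz, 5 kHz, ...);
# everything else is down at numerical-noise level.
strongest = np.argsort(spectrum)[::-1][:5]
for i in sorted(strongest):
    print(f"{freqs[i]:7.0f} Hz   {20 * np.log10(spectrum[i] / spectrum.max()):6.1f} dB")
```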
Fig.1 - The ear is resonant and takes time to build up its response to a new sound. As shown here, sounds of less than 100 milliseconds duration appear quieter than if they were continuous. Under-reading ears need under-reading meters.
Another consideration is that when the ear works in the frequency domain to estimate pitch, the mechanism is resonant and so takes time to build up and to decay. The response of the HAS to tones is therefore a function of their duration as Fig.1 shows.
Where this is leading is that distortion has to be sustained and suffered by a pitched sound before it can be heard. This is the mechanism that allowed transients to be recorded in the headroom of analog magnetic tape, giving a useful increase in dynamic range. In order to do that, the level meters have to mimic the HAS and not react to a transient too quickly.
This was just as well, because, barring the oscilloscope, the fastest type of meter at that time was a moving coil meter of the d’Arsonval type and even that had a certain amount of inertia. For some reason the dynamics of the audio level meter came to be called ballistics. As that word is generally considered to describe the dynamics of projectiles, primarily used as weapons, it seems a strange choice.
The peak program meter (PPM) in its original form had a finite rise time; in the case of the BBC PPM it was 10 milliseconds. As transients can be positive or negative, a full-wave rectifier preceded the meter. The finite rise time causes the PPM to under-read short impulses. Fig.2 shows the characteristics of the BBC PPM where, for example, the meter under-reads by 4dB for a 5 millisecond tone burst.
The under-reading of the PPM is not a failing; it is deliberate. Fig.1 shows that the ear under-responds by about 17dB to a 5 millisecond tone burst, so the meter is well ahead of the HAS. In a correctly calibrated system having some headroom, typically 8 or 9dB, if the meter doesn't show an excessive level, the ear won't hear it.
The other defining characteristic of the PPM is the artificially slow decay time, the BBC PPM taking 2.8 seconds to fall 24dB, which means that the existence of a peak will be visible for some time afterwards. There was no acoustic input to that time constant; rather, it was an ergonomic consideration.
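As a rough illustration of these ballistics, rather than a broadcast-grade implementation, the behaviour can be sketched as an envelope follower with a fast but finite attack and a slow, constant dB-per-second fall. The 10 millisecond rise and the 24dB-in-2.8-seconds fall quoted above set the constants; the exact under-read figures of a real BBC PPM depend on its precise law.

```python
import numpy as np

def ppm_ballistics(x, fs, rise_ms=10.0, fall_db=24.0, fall_s=2.8):
    """Sketch of PPM-style ballistics: fast but finite attack, slow decay.

    rise_ms approximates the meter's integration (attack) time constant;
    the decay is set so a reading falls by fall_db over fall_s seconds.
    """
    attack = 1.0 - np.exp(-1.0 / (fs * rise_ms * 1e-3))   # one-pole attack coefficient
    release = 10.0 ** (-fall_db / (20.0 * fall_s * fs))   # per-sample decay gain

    y = np.zeros_like(x)
    level = 0.0
    for n, sample in enumerate(np.abs(x)):                # full-wave rectification
        if sample > level:
            level += attack * (sample - level)            # finite rise time
        else:
            level *= release                              # constant dB-per-second fall
        y[n] = level
    return y

# A 5 ms full-scale tone burst under-reads on such a meter, in the spirit of Fig.2.
fs = 48000
t = np.arange(int(0.005 * fs)) / fs
burst = np.sin(2 * np.pi * 1000 * t)
reading = ppm_ballistics(np.pad(burst, (0, fs // 10)), fs).max()
print(f"indicated peak: {20 * np.log10(reading):.1f} dB relative to full scale")
```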
The PPM worked well but was expensive and a cheaper alternative was sought. In the standard volume indicator, the rapid rise time of the PPM was abandoned so that an ordinary meter movement could be used. The rise time was specified at 300 milliseconds so that the significant under-reading of peaks would at least be consistent. Again for economy there was no peak holding action and the fall time was the same as the rise time.
Fig.2 - The characteristic of a BBC Peak Program Meter which under-reads on short duration sounds, but not as much as human hearing does.
The full-wave rectifier was a copper oxide device, an early semiconductor, having a forward voltage of about 0.2 volts, or twice that when used in a bridge rectifier. The characteristic curve of the rectifier was used to give the meter a non-linear response, which, along with a non-linear scale, allowed it to have a somewhat logarithmic characteristic.
The scale was calibrated in volume units and the device became known as a VU meter. Once the cut-price approach had been decided on, efforts were made to justify it from psycho-acoustic principles. None of these explanations is very convincing. It works beautifully on sine waves, but on real audio it under-reads and needs to be interpreted rather than believed, a bit like a politician.
Two key factors changed the level metering landscape permanently. The march of optoelectronics made the d'Arsonval meter obsolete, and today level meters consist of bargraphs that have no inertia: a row of light-emitting diodes driven by circuitry that can have any desired dynamic characteristic for very little cost, or some IT-based graphic display. Once digital audio equipment of adequate word length became available, the need to keep levels up to stay above noise largely went away. These factors eliminated any need for the finite rise time of the PPM as well as removing the economic advantage of the VU.
With the advent of digital audio and the dBFS scale it was relatively easy to construct a level meter that had an attack time of one sample period. The two's complement PCM samples are rectified by examination of the sign bit: if the sample is negative, all the bits are inverted and one is added. The rectified values can be converted to any desired scale and the fall time of the PPM can be retained. When the attack time is one sample, the meter cannot under-read the sample values and there is then no theoretical need for headroom. However, if a finite rise time is retained, the meter will under-read and headroom will still be necessary.
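Assuming 16-bit two's complement samples, the rectification really is as simple as the sign test just described; a minimal sketch of a sample-accurate peak reading might look like this (the helper names are illustrative, not from any particular standard):

```python
import math

def rectify_twos_complement(sample: int, bits: int = 16) -> int:
    """Full-wave rectify a two's complement PCM word: if the sign bit is set,
    invert all the bits and add one to obtain the magnitude."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    word = sample & mask
    if word & sign_bit:                 # negative sample
        word = (~word + 1) & mask       # invert and add one
    return word

def sample_peak_dbfs(block, bits: int = 16) -> float:
    """Peak of a block with a one-sample attack time, expressed in dBFS."""
    full_scale = (1 << (bits - 1)) - 1
    peak = max(rectify_twos_complement(s, bits) for s in block)
    return -math.inf if peak == 0 else 20 * math.log10(peak / full_scale)

# A block whose largest excursion is a negative sample at half of full scale:
print(round(sample_peak_dbfs([0, -16384, 1200, -3]), 1))   # about -6.0
```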
When PCM adaptors such as the PCM-1610 were in use it was easy to obtain a superb level meter. One simply connected a video monitor to the pseudo-video signal carrying the audio samples and looked at the amount of sign extension in the bit patterns on the screen. Decca's PCM recorders simply displayed the state of the sign extension on a series of LEDs, giving an accurate level display in 6dB steps. In both cases there was neither a scale nor any calibration marks in sight.
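The idea behind such a display can be sketched in a few lines: count how many leading bits merely repeat the sign bit, and each redundant bit places the sample roughly another 6dB below full scale. The 16-bit word length and the helper below are illustrative assumptions, not Decca's actual circuit.

```python
def sign_extension_level_db(sample: int, bits: int = 16) -> float:
    """Return the top of the ~6dB-wide band the sample falls in (0, -6, -12, ...),
    judged by how many leading bits merely repeat the sign bit."""
    word = sample & ((1 << bits) - 1)
    sign = (word >> (bits - 1)) & 1
    redundant = 0
    for i in range(bits - 2, -1, -1):        # walk down from just below the sign bit
        if ((word >> i) & 1) == sign:
            redundant += 1                   # this bit is pure sign extension
        else:
            break
    return -6.0 * redundant

print(sign_extension_level_db(20000))    # -0.0  (top band: 0 to -6 dB)
print(sign_extension_level_db(10000))    # -6.0  (second band: -6 to -12 dB)
print(sign_extension_level_db(-10000))   # -6.0  (negative excursions read the same)
```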
Whilst that was fine for making digital recordings, it was not enough for the broadcaster whose output had to have balanced level from one program to the next. What was actually needed was a loudness meter.