Digital Audio: Part 10 - Adjusting Levels

Gain control in digital audio is essentially a numerical model of the same process in the analog domain.

Audio gain control operates with respect to zero in the middle of the signal gamut. In the analog domain, gain requires the input waveform to be multiplied by a constant factor using, for example, an operational amplifier. In the digital domain there is also a multiplication, but it is between the input sample values and a factor known as a coefficient.

As will be seen, the use of two's complement encoding also allows accurate multiplication of sample values, both positive and negative, by coefficients. In principle there is not much difference between the operation of a digital filter and a digital audio mixer as both require multiplication by various coefficients. In filters, the coefficients are usually expressed in a linear format, whereas in volume controls, the logarithmic level-sensing characteristic of human hearing requires a similar relationship between the control and the amount of gain.

In a traditional analog implementation, a potentiometer or fader intended for use as a volume control would have a logarithmic taper: the resistance of the track would be non-linear so that the output level would be a logarithmic function of the position of the knob. In practice there is a departure from the logarithmic law at the lower end of the track, where the attenuation in dB becomes minus infinity, corresponding to a gain of zero, so the sound can be faded out completely. In the digital domain that mechanism must be replicated for use in audio faders.

The knobs and faders of an analog mixing console are replaced by encoders: devices that convert movement into digital codes. The coefficients supplied to the multiplier must have a logarithmic relationship to the position of the control. It may be useful to remember that a doubling of audio level corresponds to a gain of very nearly 6dB. A binary coefficient is doubled by shifting it one place left.
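The logarithmic relationship between control position and coefficient can be sketched in a few lines of Python. This is only an illustration: the 256-step encoder, the 0dB top of travel, the -90dB bottom of the logarithmic range and the function names are all assumptions, not values taken from any particular console.

```python
def db_to_coefficient(gain_db: float) -> float:
    """Convert a gain in dB to a linear multiplier (20*log10 convention)."""
    return 10.0 ** (gain_db / 20.0)


def fader_to_coefficient(position: int, steps: int = 256,
                         top_db: float = 0.0,
                         bottom_db: float = -90.0) -> float:
    """Map an encoder position (0 .. steps-1) to a linear gain coefficient.

    The step count and dB range are assumed values for illustration.
    Position 0 departs from the logarithmic law and returns a gain of
    exactly zero, so the sound can be faded out completely.
    """
    if position <= 0:
        return 0.0
    # Equal steps of the control give equal steps in dB.
    gain_db = bottom_db + (top_db - bottom_db) * position / (steps - 1)
    return db_to_coefficient(gain_db)
```

With these assumed figures each step of the control changes the level by about 0.35dB, and a gain of 6.02dB gives a coefficient very close to two.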

Fig.1 An optical absolute encoder interrupts light beams to produce a coded output of the fader position.

There are two basic types of encoder, absolute and relative. In the first, typically used for linear faders, the data output corresponds to the actual position of the control. In the second, the output is in the form of pulses that occur only when the control is moved. The absolute position of the control is not known and such encoders are typically rotary and have no end stops.

In automated mixing consoles, an absolute encoder has to be motorized so that the position can be recalled and that makes it an expensive item. The relative encoder does not need to be motorized. Instead a small display adjacent to the control shows the setting that can be increased or decreased using the encoder. A desk with recall simply updates the display with no need to move the control.

More than one parameter can be adjusted with a single encoder. The encoder is simply assigned to the parameter to be changed using a menu or by touching a button.

Encoders often use optical sensing, as they then have no wear mechanism. Fig.1 shows that in an absolute encoder the fader moves a patterned track through a series of light beams, one for each bit, creating a different bit pattern for every location. Such encoders cannot use pure binary, because as Fig.2a) shows, they would then be prone to spurious incorrect codes at certain transitions where two or more bits would ideally change at the same time but in practice do not because of tolerances.

Fig.2 - a) Tolerances cause false codes to be generated between valid codes, meaning pure binary cannot be used. Instead, a Gray code shown at b) is used, in which only one bit (arrowed) changes between successive codes, so no dynamic hazards can occur.

The solution is to use a Gray code, named after Frank Gray, in which adjacent codes always have a Hamming distance of one, which means that only one bit ever changes between codes and dynamic hazards cannot occur. An example of a Gray code is shown in Fig.2b). The Gray code then needs to be converted to binary using, for example, a look-up table.
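As a minimal sketch, the conversion can be shown in Python using the common reflected binary Gray code. It is an assumption that this is the same code as the one drawn in Fig.2b), and the four-bit table size and the names are chosen only for illustration.

```python
def binary_to_gray(n: int) -> int:
    """Reflected binary Gray code of n: each bit XORed with the bit above it."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Invert binary_to_gray by XORing together all right-shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


# Look-up table for an assumed four-bit encoder track: Gray code in,
# pure binary position out.
GRAY_TO_BINARY = {binary_to_gray(i): i for i in range(16)}


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")
```

Checking hamming_distance on every pair of adjacent codes confirms that only one bit changes per step, which is the property that removes the false codes.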

In the relative, or incremental, encoder there are gratings carrying radial patterns. However, the markings on the fixed and moving gratings are not perfectly parallel and the result is that when the control is turned moiré fringes move radially with respect to the shaft. These can be detected with optical sensors. The angle through which the control has been turned can be determined by counting the pulses. Two light paths suitably arranged produce a pair of signals in quadrature, as shown in Fig.3, and the phase relationship between them allows the direction to be established.
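The way the phase relationship yields direction can be sketched as a small state machine in Python. The two sensor outputs are called a and b here, and which rotation counts as positive is an arbitrary assumption; a real console would also debounce and scale the count.

```python
# The two-bit state is (a << 1) | b. In one direction of rotation the
# states follow the cycle 00 -> 01 -> 11 -> 10 -> 00, in which only one
# signal changes at a time. The choice of this direction as "forward"
# is an assumption made for the example.
_FORWARD = {0b00: 0b01, 0b01: 0b11, 0b11: 0b10, 0b10: 0b00}


def count_quadrature(samples) -> int:
    """Return the net step count from an iterable of (a, b) sensor pairs."""
    count = 0
    prev = None
    for a, b in samples:
        state = (a << 1) | b
        if prev is not None and state != prev:
            if _FORWARD[prev] == state:
                count += 1          # one step forward
            elif _FORWARD[state] == prev:
                count -= 1          # one step backward
            # Otherwise both signals changed at once: an invalid
            # transition, which this sketch simply ignores.
        prev = state
    return count
```

Feeding the same sequence of states in reverse order gives the same count with the opposite sign.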

The two's complement coding scheme of digital audio allows positive and negative numbers to be added correctly. Multiplication is only repeated addition, so two's complement must also allow accurate multiplication to be performed. This is subject to some cautions. Multiplication causes word length extension. When two integers are multiplied together, the product can need as many bits as the two word lengths combined.

Fig. 3 - In a rotary encoder there are two outputs in quadrature so the direction of rotation can be determined.

However, word length extension still takes place even if samples are attenuated, because multiplying by less than one causes low order bits to come into being. As a result the internal word length of a digital audio process will have to be significantly greater than the word length of the inputs. If the ultimate output must have the same word length as the inputs, then provision must be made for reducing word length in an optimal manner.

Fig.4 shows an example of a two's complement multiplication. In binary, the bits of the coefficient, which is the multiplier, each represent different powers of two. The value of 101, which is 5 in binary, reveals that multiplying any number by five is equivalent to multiplying that number by four and then adding the number. The multiplication by four is simply a matter of shifting the number two places left.

As integer multiplication increases the magnitude of numbers, it is necessary to make the number scale larger. In the case of two's complement samples, this is done by adding high order bits to the left of the MSB using a process called sign extension. For a positive number that means adding leading zeros. For a negative number leading ones are added. Sign extension doesn't change the values of samples at all; it just puts them in the center of a larger number range.
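Sign extension is easy to demonstrate on raw bit patterns. The following Python sketch uses hypothetical helper names and four-bit to eight-bit words purely as an example.

```python
def sign_extend(word: int, from_bits: int, to_bits: int) -> int:
    """Widen a two's complement bit pattern from from_bits to to_bits."""
    sign = (word >> (from_bits - 1)) & 1
    if sign:
        # Negative number: fill the new high-order bits with ones.
        extension = ((1 << (to_bits - from_bits)) - 1) << from_bits
        return word | extension
    # Positive number: the new high-order bits are leading zeros.
    return word


def to_signed(word: int, bits: int) -> int:
    """Interpret a bits-wide pattern as a two's complement value."""
    return word - (1 << bits) if word & (1 << (bits - 1)) else word
```

For instance, the four-bit pattern 1011 and its eight-bit extension 11111011 both represent minus five, so the value is unchanged while the available range has grown.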

The gain coefficient is typically a pure binary, or positive only, number. In the example of Fig.4, two four-bit numbers are to be multiplied. This requires the multiplicand to be sign-extended to eight bits. Conceptually, the multiplicand is then shifted one, two and three places, giving multiplications by two, four and eight. The products to be added together are selected by the bits of the multiplier. As shown, if the bit is a one, the product is added. If the bit is a zero it is not added. As the products are all in two's complement notation, the multiplication works equally well for positive and negative numbers.
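The shift-and-add procedure described above can be sketched in Python as follows. The operand values in the usage note are chosen for this illustration and are not necessarily those of Fig.4; the function names are likewise hypothetical.

```python
def to_signed(word: int, bits: int) -> int:
    """Interpret a bits-wide pattern as a two's complement value."""
    return word - (1 << bits) if word & (1 << (bits - 1)) else word


def multiply_shift_add(sample: int, coefficient: int, bits: int = 4) -> int:
    """Multiply a two's complement sample by an unsigned coefficient.

    Both inputs are bit patterns that are bits wide. The result is the
    two's complement product as a pattern that is 2 * bits wide.
    """
    wide = 2 * bits
    mask = (1 << wide) - 1
    # Sign-extend the multiplicand to the full product width.
    if sample & (1 << (bits - 1)):
        sample |= mask & ~((1 << bits) - 1)
    product = 0
    for i in range(bits):
        # Each one in the multiplier selects the multiplicand shifted
        # i places left, which is the multiplicand times 2**i.
        if (coefficient >> i) & 1:
            product = (product + (sample << i)) & mask
    return product
```

Multiplying 1101 (minus three) by 0101 (five) gives the eight-bit pattern 11110001, which is minus fifteen; the same code gives plus fifteen for 0011.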

In the case of attenuation, the multiplier is less than one, which means it has fractional bits to the right of the radix point, representing one half, one quarter and so on. In the case of 3dB attenuation, the coefficient would be the reciprocal of the square root of two, or 0.707. An eight-bit coefficient would have 256 combinations and 0.707 would be expressed as 181/256. Fig.5 shows that the fractional coefficient of 0.707 contains bits below the radix point.
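The figure of 181/256 can be checked with a short Python sketch. The eight fractional bits follow the text; the function names are assumptions made for the example.

```python
import math


def quantize_coefficient(gain: float, fraction_bits: int = 8) -> int:
    """Round a gain below one to an integer count of 1/2**fraction_bits."""
    return round(gain * (1 << fraction_bits))


def coefficient_to_db(coefficient: int, fraction_bits: int = 8) -> float:
    """Gain in dB actually delivered by the quantized coefficient."""
    return 20.0 * math.log10(coefficient / (1 << fraction_bits))


coefficient = quantize_coefficient(1.0 / math.sqrt(2.0))
```

Here coefficient comes out as 181, which is 10110101 in binary, and the gain it actually delivers is within a hundredth of a dB of the intended value.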

Fig.4 - An example of multiplication in two’s complement.  Following sign extension, one of the numbers is shifted one, two and three places to multiply it by two, four and eight. The other number determines which of these products is added.

The multiplication proceeds as before, except that the shifting is to the right to provide multiplications by one half, one quarter and so on. Note the significant increase in word length that results. Within a mixer, the word length that is supported in the addition that performs the mix must be greater than that of the inputs to allow for word length extension in the level setting processes.
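A sketch of the fractional case in Python, using exact fractions so that none of the low order bits are lost. The eight-bit coefficient follows the text, while the sample values and names are assumptions for illustration.

```python
from fractions import Fraction


def multiply_fractional(sample: int, coefficient: int,
                        fraction_bits: int = 8) -> Fraction:
    """Multiply a sample by a coefficient of the form c / 2**fraction_bits.

    Bit k below the radix point has a weight of 1 / 2**k, so each one in
    the coefficient adds the sample shifted k places to the right.
    """
    total = Fraction(0)
    for k in range(1, fraction_bits + 1):
        if (coefficient >> (fraction_bits - k)) & 1:
            total += Fraction(sample, 1 << k)
    return total


result = multiply_fractional(1000, 181)   # exactly 1000 * 181 / 256
```

The exact result here is 707.03125, which needs five binary places below the radix point even though the input was an integer. In the worst case a 16-bit sample multiplied by an eight-bit coefficient needs 24 bits to hold the full product.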

Ultimately the long internal word length of the mixer may need to be shortened in order to record the output on a delivery medium. Shortening the word length of digital audio samples in a sonically optimal manner is a non-trivial process that will be discussed in due course, but for the moment it is important to point out that even an optimum reduction in word length causes a reduction in quality, even if it is small.

Fig. 5 - An attenuation of 3dB requires multiplication by 0.707 (decimal). This requires bits below the radix point in binary, representing fractions as shown.

It is only digital recording and transmission media that can be totally transparent, because they output sample values that are totally unchanged from what was input. Any device that performs a process by manipulating sample values cannot be transparent and a small quality loss occurs on every process.

Although the generation loss in a digital process is very small, the losses can build up if inappropriate procedures are followed. For example, if the final level required in an audio signal is hard to establish, there might be several small gain changes made until the level is satisfactory. If the audio waveform is saved after each change, there will be a build-up of generation loss. However, if the final gain is applied to the original samples, there will be only one stage of generation loss.
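The effect can be illustrated with a deliberately simple Python sketch. Plain rounding to the nearest integer stands in for the word length reduction that happens each time the waveform is saved; that is an assumption, and a real system would use a better method. The sample values and gains are also chosen only for the example.

```python
import math


def apply_gain(samples, gain):
    """Multiply each sample by gain and round back to an integer.

    The rounding models saving the result at the original word length.
    """
    return [int(math.floor(s * gain + 0.5)) for s in samples]


original = [1001, -333, 7]

# Two saved stages: halve the level, then double it again.
staged = apply_gain(apply_gain(original, 0.5), 2.0)

# One stage: the net gain (here unity) applied to the original samples.
single = apply_gain(original, 0.5 * 2.0)
```

In this example single is identical to original, whereas every sample in staged is in error by one, because the low order bit was discarded when the halved waveform was saved.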
