Digital Audio: Part 16 - Filters, Direct Implementation

There are two approaches to digital filtering. One is to implement the impulse response directly. The other is to use recursion. Here we look at the direct implementation.

Fig.1 - In convolution, the impulse response is slid across the input waveform and the area of overlap is measured. The impulse must be mirrored so the tail arrives last.

The theoretical digital audio impulse has zero width and therefore requires infinite bandwidth to reproduce it. That ideal impulse only exists as a concept. All real digital audio systems are bandwidth limited according to sampling theory. Samples can never be heard unless they have passed through a reconstruction filter that forms part of every DAC.

Filters can only reduce the bandwidth of a signal. They can't increase it. Information theory states that the amount of information a signal could carry increases with bandwidth, but a filter has no source of additional information. In other words passing 20kHz of audio bandwidth through a 40kHz low-pass filter still leaves you with 20kHz of audio. Re-mastering old Rolling Stones albums on to SACD (Super Audio CD) can't add to the information limit of the original tapes. That was one of the reasons SACD failed.

However, if we are trying to build an oversampling DAC, we take advantage of that theory. Raising the sampling rate cannot and does not put any more information into the signal, but it does ease the design of the reconstruction filter. If we have a 48kHz audio file and we want to oversample it, the impulse response of the interpolator should be that of a 24kHz low-pass filter, since that is the bandwidth of a 48kHz audio file.

In the previous part the concept of convolution was mentioned. Fig.1 shows how it works. The input waveform (in this example a rectangular pulse) is considered fixed and the impulse response waveform is slid across it. The impulse response needs to be time reversed or mirrored so that the tail of the response arrives last. One exception is if the impulse response is perfectly symmetrical, in which case mirroring can be omitted.

The output is proportional to the area by which the input waveform and the impulse overlap. That area is shown shaded in the figure along with the way the output waveform evolves during the overlapping process.

Fig.2 - In sampled convolution the impulse and the input are moved in steps of one sample period.

In the digital domain the samples are a constant distance apart, which makes calculating the area simple: because every sample has the same width, the area each one contributes is proportional to its value. Fig.2 shows an example of sampled convolution. Here the impulse is stepped across the input one sample period at a time. At each step, or point, the overlap is found by multiplying each input sample by the impulse response sample that coincides with it and adding the products together. Mathematicians would call it summation of coincident cross products.
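As a rough illustration of sampled convolution, the Python sketch below steps an impulse response across a rectangular pulse and sums the coincident cross products at each point. The function name `convolve` and the sample values are invented for this example and are not taken from the figures.

```python
def convolve(signal, impulse_response):
    """Direct sampled convolution: sum of coincident cross products."""
    n_out = len(signal) + len(impulse_response) - 1
    output = [0.0] * n_out
    for n in range(n_out):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            i = n - k  # indexing the input with n - k is what mirrors the impulse
            if 0 <= i < len(signal):
                acc += h * signal[i]
        output[n] = acc
    return output


pulse = [1.0, 1.0, 1.0, 1.0]   # rectangular input pulse (illustrative values)
response = [0.5, 0.25, 0.25]   # hypothetical asymmetrical impulse response
result = convolve(pulse, response)
print(result)  # [0.5, 0.75, 1.0, 1.0, 0.5, 0.25]
```

The output rises as the overlap grows, holds steady while the impulse lies wholly inside the pulse, then falls away as the tail of the response leaves last.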

There are many ways of implementing a filter of the type shown, including the use of suitable software in a processor, but the implementation that best illustrates what is actually happening is hardware. Stepping samples one sample period at a time is an obvious application for a shift register.

Fig.3 shows a shift register in which the sample train representing the input waveform can be stepped across. The length of the shift register forms a window in which a fixed number of points are simultaneously available. Any impulse response we choose to use has to fit inside the window, hence this topology is known as a finite impulse response (FIR) filter.

Each point, also known as a tap, of the shift register feeds one input of a multiplier. The other input is a value known as a coefficient, which is effectively a sample of the required impulse response. The products from the multipliers are summed to produce the filter output.

If this filter is supplied with a test impulse, namely a single non-zero sample surrounded by samples of zero value, then as the input shifts across the window, the output of every multiplier except one will be zero. That single multiplier outputs a cross product. As the impulse moves from tap to tap, the adder outputs each coefficient in turn, so the impulse response emerges from the filter.
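The shift register, multipliers and adder can be modelled in a few lines of software. The sketch below is only a model of the hardware idea, not an efficient implementation; the class name `FirFilter` and the coefficient values are assumptions made for the example.

```python
from collections import deque


class FirFilter:
    """Software model of a shift-register FIR filter (illustrative only)."""

    def __init__(self, coefficients):
        self.coefficients = list(coefficients)
        n = len(self.coefficients)
        # The shift register: one stage per tap, initially holding zeros.
        self.register = deque([0.0] * n, maxlen=n)

    def step(self, sample):
        # The newest sample enters at tap 0; the oldest falls off the far end.
        self.register.appendleft(sample)
        # One multiplier per tap, with all the products summed.
        return sum(c * x for c, x in zip(self.coefficients, self.register))


fir = FirFilter([0.5, 0.25, 0.125])       # hypothetical coefficients
test_impulse = [1.0, 0.0, 0.0, 0.0, 0.0]  # single non-zero sample
output = [fir.step(s) for s in test_impulse]
print(output)  # [0.5, 0.25, 0.125, 0.0, 0.0]
```

Feeding in the test impulse returns the coefficients in order, after which the output falls to zero once the impulse has left the window.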

With a real audio input there will be samples in every stage and cross products from every stage that will be summed. The FIR filter performs convolution of the impulse response and the input waveform. The FIR filter is always causal, in that any output must occur after the corresponding input has been supplied. Where symmetrical impulse responses are used to obtain phase linearity, the filter causes a delay of about half the length of the window: (N - 1)/2 sample periods for a filter with N taps.

Fig.3 - Stepping the samples in a FIR filter is often done using a shift register.

The number of multiplications available per output sample determines the finite length of the impulse. This means that practically all impulses will have to be shortened to fit within the available window. The most brutal way of doing that is simply to truncate the impulse, meaning that the parts outside the window are literally chopped off to leave sharp ends.

Truncation amounts to applying a rectangular window, which causes a sudden transition from samples that matter to samples that don't. One might expect some effects on account of that discontinuity and the result is known as the Gibbs phenomenon. Discontinuities contain high frequencies and one explanation of the Gibbs phenomenon is that it is the filter ringing due to those high frequencies.

Another way of looking at the issue is to consider that windowing restricts the number of samples the filter can see and transform theory suggests that also reduces the frequency resolution of the filter, in the same way that putting a small block into a Discrete Fourier Transform produces a small number of coefficients.

Fig.4 shows the effect of different numbers of points on the Gibbs phenomenon. Increasing the number of points has made the rate of cut-off steeper because the frequency resolution of the filter has gone up in proportion. For the same reason the frequency difference between successive ripples becomes smaller.

Not surprisingly it was found that tapering the coefficients towards the ends of the window gave better results as this has the effect of softening the discontinuities and reducing the ringing. Whilst FIR filters can ring, they only contain forward signal paths and have no feedback. Accordingly they are unconditionally stable, cannot oscillate and must return to zero output within one window width of the input being muted. 

Fig.4 - Showing the effect of increasing the number of points in the filter.

The filter design process consists of selecting an impulse response and a window function and multiplying the two together to obtain coefficients for the multipliers. In practice the coefficients will need to be normalized so that the passband of the filter has unity gain.
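A minimal sketch of that design process follows, assuming a sinc-shaped ideal low-pass impulse response and a Hamming window. The window choice, the 31-tap length, the 96kHz rate (2x oversampling of a 48kHz file) and the function name `design_lowpass` are all assumptions made for illustration, not recommendations.

```python
import math


def design_lowpass(num_taps, cutoff_hz, sample_rate_hz):
    """Windowed-sinc low-pass coefficients, normalized to unity gain at DC."""
    fc = cutoff_hz / sample_rate_hz   # cutoff in cycles per sample
    centre = (num_taps - 1) / 2       # middle of the window
    coeffs = []
    for n in range(num_taps):
        t = n - centre
        # Sample of the ideal (infinitely long) low-pass impulse response.
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        # Hamming window: tapers the coefficients towards the ends.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        coeffs.append(ideal * window)
    gain = sum(coeffs)                # response at zero frequency
    return [c / gain for c in coeffs]


taps = design_lowpass(num_taps=31, cutoff_hz=24_000, sample_rate_hz=96_000)
print(round(sum(taps), 6))  # 1.0
```

Dividing by the sum of the coefficients sets the gain at zero frequency to one, which is a simple way of normalizing a low-pass filter. A different window, or more taps, changes the trade-off between steepness, stop band attenuation and pass band ripple.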

There are many such window functions, most of which are variations on a curve that is somewhat Gaussian or bell-shaped and it would be tedious to describe them all here. There is no one ideal window and the one chosen should reflect the best compromise between good stop band attenuation, which may be important to avoid aliasing, the steepness of the filter cut-off and the amount of pass band ripple, which could impair sound quality.

In the stop band, attenuation is obtained when positive and negative products are summed and cancel out. If the coefficients are not sufficiently accurate, the products will be inaccurate and the attenuation will be poor. FIR filter economics is two dimensional, as the window size, or number of points, determines the number of multiplications per sample and the stop band performance determines the coefficient word length. The filter will always be a compromise between performance and cost.

Optimization of filters has a long history and began way before the digital era. The Remez Exchange Algorithm was published in 1934 and was an iterative approach that converged on the optimal design. The delays needed between the taps of an FIR filter are difficult to implement in the analog domain, especially in the case of audio, which covers a wide range of octaves. Delay is trivially easy to implement in the digital domain with no loss of quality so it was inevitable that the FIR filter would increase in importance. That led to the Parks-McClellan algorithm of 1972, which was optimized to design FIR filters.

Fig.5 - A folded filter uses calculated products more than once.

The falling cost of processing power famously described by Moore's Law applies to digital filters, making complex filters more economical to implement as time goes by.

When the impulse response is perfectly symmetrical, coefficients at equal distances left and right of the center will be the same. Instead of repeating the same multiplication twice, the filter can be folded so that the product is calculated once but used twice. This approximately halves the number of multiplications per sample. Fig.5 shows a folded filter in which three of the products are used twice. This filter has been simplified for explanatory purposes. A real audio filter would need many more points.
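The saving can be sketched in software. In the arrangement assumed below, the two samples that share a coefficient are added first and the sum is multiplied once, which gives the same result as forming the two products separately. The function names and the seven-tap coefficient set are invented for the example; seven taps gives three shared multiplications plus one for the center tap.

```python
def fir_direct(window, coefficients):
    """One multiplication per tap."""
    return sum(c * x for c, x in zip(coefficients, window))


def fir_folded(window, coefficients):
    """Symmetrical coefficients: one multiplication per pair of taps."""
    n = len(coefficients)
    half = n // 2
    acc = 0.0
    for i in range(half):
        # Samples equidistant from the center share the same coefficient.
        acc += coefficients[i] * (window[i] + window[n - 1 - i])
    if n % 2:
        # An odd-length filter has one unpaired tap in the center.
        acc += coefficients[half] * window[half]
    return acc


coeffs = [0.0625, 0.125, 0.1875, 0.25, 0.1875, 0.125, 0.0625]  # hypothetical, symmetrical
samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]                  # contents of the window
print(fir_direct(samples, coeffs), fir_folded(samples, coeffs))  # 4.0 4.0
```

Both versions give the same output, but the folded one needs four multiplications per sample here instead of seven.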
