The Changing Face of Audio Processing for the Human Voice

Basic audio processing for narration is so mature that it is now free or costs very little. Though it’s easily accessible to anyone, how many people recording audio know how to use it?

With the advent of the first audio streaming media about twenty years ago, media production began to shift away from professional studios to homes, garages, offices and just about any place else.

Out of this shift came a revolution in new, lower cost gear and software meant to keep audio quality at a professional level. One of the major innovations of the time was the development of processing software that can do just about any kind of manipulation of the human voice.

Voice processing began long ago with expensive, dedicated “black boxes” used in recording studios. Later, different functions were combined into multi-function channel strips. Today, it has evolved into plugins for digital editing applications, as well as DSP firmware embedded in the simplest audio devices. The price of the technology continues to fall dramatically.

Much of the development of voice processing software was driven by musicians who moved into home recording as a result of the collapse of the music industry. From auto-tune software to basic processing tools packaged with computer interfaces, music was initially the main target of the software.

However, voice over artists, professional announcers, news reporters, video producers and podcasters now do an expanding amount of narration from various locations. This growth has led to an abundance of easy-to-use, low-cost voice processing software targeted to non-professional users.

Symetrix Audio Processor

Voice processing products run the gamut in features and price. For professional broadcasters, companies like Symetrix, Wheatstone, Omnia, dbx, Aphex, Manley and Yellowtec continue to build hardware-based devices ranging in price from under $1,000 to $6,500. These devices pack everything needed to create and manage the broadcaster’s “sound” from a single microphone to an entire facility.

In the age of low-cost digital audio workstations (DAWs), most editing apps now come with a built-in suite of plugins for a range of applications. For voice narrators, these include the basics: a compressor/limiter, noise gate, de-esser, equalizer and expander.
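
As a rough illustration of what a compressor/limiter does to levels, here is a minimal sketch of the static gain curve. The function names, the -18 dB threshold, the 4:1 ratio and the -1 dB ceiling are illustrative assumptions, not settings taken from any plugin named in this article, and real compressors add attack and release smoothing on top of this curve.

```python
def compress_db(level_db, threshold_db=-18.0, ratio=4.0):
    """Static compressor curve (hypothetical defaults).

    Levels at or below the threshold pass unchanged. Above it, every
    `ratio` dB of input yields only 1 dB of additional output.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def limit_db(level_db, ceiling_db=-1.0):
    """Limiter: never let the level rise above the ceiling."""
    return min(level_db, ceiling_db)


# A peak at -6 dB is 12 dB over the threshold, so 4:1 leaves 3 dB over it.
loud_peak = compress_db(-6.0)    # -15.0
quiet_word = compress_db(-30.0)  # unchanged, below the threshold
```

The practical effect for narration is that loud syllables are pulled closer to quiet ones, which is why a compressor is usually followed by make-up gain.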

Also available is simple, easy-to-understand software that enables users to build a visual audio processing chain to automate and repeat functions. Rogue Amoeba’s Audio Hijack ($50) for the Macintosh allows users to align little blocks, each with a function, to record any audio on a personal computer from any source and process it with a range of plug-ins. The software can handle everything, including the number of channels, metering and output devices.
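
The idea of a chain of blocks, each with one function, can be sketched in a few lines of code. This is only a conceptual model under stated assumptions: the block names, thresholds and sample values are hypothetical, it is not how Audio Hijack is implemented, and the per-sample gate here is far cruder than a real one.

```python
from functools import reduce


def gain(db):
    """Block that scales every sample by a gain given in decibels."""
    factor = 10 ** (db / 20)
    return lambda samples: [s * factor for s in samples]


def noise_gate(threshold):
    """Block that mutes samples whose magnitude is below the threshold."""
    return lambda samples: [s if abs(s) >= threshold else 0.0 for s in samples]


def hard_limiter(ceiling):
    """Block that clips samples to the range [-ceiling, +ceiling]."""
    return lambda samples: [max(-ceiling, min(ceiling, s)) for s in samples]


def run_chain(blocks, samples):
    """Feed the audio through each block in order, left to right."""
    return reduce(lambda audio, block: block(audio), blocks, samples)


# A hypothetical voice chain: gate first, then 6 dB of gain, then a limiter.
voice_chain = [noise_gate(0.01), gain(6.0), hard_limiter(0.9)]
processed = run_chain(voice_chain, [0.005, 0.2, -0.6, 0.02])
```

Because each block takes samples in and returns samples out, blocks can be reordered or reused, which is the property that makes a saved chain repeatable.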

Most people who work with audio narration use some form of DAW on their personal computer. These range from Audacity, a free application, all the way up to Avid’s Pro Tools system. Plugins that work in most DAWs are either VST or AU types, and they come from a huge range of audio companies.

Popular plugins for voice processing include the VOS SlickEQ ($56) from Tokyo Dawn Records, a mixing/mastering EQ. Others include the Waves De-Esser ($99) and the FabFilter Pro-DS ($179). These de-essers selectively attenuate the high frequencies of the input signal when sibilant sounds are present and exceed the threshold level.
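
The threshold behavior of a de-esser can be sketched with a simplified split-band design. This is an assumption-heavy illustration, not the algorithm used by the Waves or FabFilter products: the one-pole crossover, the 5 kHz split point, the 0.1 threshold and the 6 dB of reduction are all hypothetical choices.

```python
import math


def de_ess(samples, sample_rate=48000, crossover_hz=5000.0,
           threshold=0.1, reduction_db=-6.0, release_ms=10.0):
    """Simplified split-band de-esser (all defaults are assumptions).

    The signal is split into a low band and a high band. When the high
    band's envelope exceeds the threshold, only the high band is turned
    down, and the two bands are then summed again.
    """
    # One-pole low-pass coefficient for the crossover frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / sample_rate)
    # Per-sample decay of the envelope follower during release.
    release = math.exp(-1.0 / (release_ms * 0.001 * sample_rate))
    duck = 10 ** (reduction_db / 20)

    low = 0.0
    envelope = 0.0
    out = []
    for x in samples:
        low += alpha * (x - low)   # low band
        high = x - low             # high band is the remainder
        # Peak follower: instant attack, exponential release.
        envelope = max(abs(high), release * envelope)
        band_gain = duck if envelope > threshold else 1.0
        out.append(low + band_gain * high)
    return out
```

With these settings, a low-pitched tone passes through essentially untouched because its high-band content stays under the threshold, while a strong tone near 8 kHz is attenuated.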

SlickEQ

For compression, Native Instruments’ Solid Bus Comp ($99) and IK Multimedia’s T-RackS Bus Compressor ($125) are popular plugins with narrators. Also, IK Multimedia’s White 2A Leveling Amplifier is a tube opto compressor/limiter that emulates the legendary vintage all-tube unit. It brings gentle, warm and fat compression to voice tracks where smooth and consistent compression is needed. It is part of IK Multimedia’s T-RackS Custom Shop package, which is priced at $170.

While individual plugins remain popular for specific applications, many casual users prefer packages like Izotope’s Nectar series, in which all the tools are integrated into one application. Nectar 2 standard and production suites are the full-featured high-end apps, priced at $299 and $229, respectively.

Izotope's Nectar Elements

Nectar Elements, a slimmed-down version that has preset features for voice over and dialogue recording, is priced at $129. It has 100 styles in 12 genres, giving non-pro users access to a range of sounds. Ten DSP processors are included: equalizer, compressor, de-esser, gate, limiter, saturation, pitch correction, reverb, delay and doubler. There are also seven equalizer filter shapes for sculpting vocal tracks.

The single-control de-esser allows for quick removal of sibilance, while the gate can be used for removing noise or room tone. Customized sliders allow for simple control of all the application’s DSP settings.

Shure MVi

Finally, some voice processing software is finding its way into basic audio devices. A good example is Shure’s new MVi ($129), a tiny portable digital recording adapter for computers and smartphones. Like most other computer audio interfaces, the user plugs a microphone into the XLR or ¼-inch line input to record vocals or instruments on the computing device.

Where the MVi differs, however, is in its built-in DSP modes. With a single push of a button, the MVi can provide compression and equalization for a range of applications including speech, singing, acoustic music and loud bands, or it can run flat with no processing.

Shure MVi diagram

Using Shure’s free Motiv app, the MVi’s DSP modes are extended with an additional limiter and five-band EQ mode. There are also 48 volts of phantom power and a 20 dB boost for extra mic output from dynamic and ribbon models. It powers itself off USB and fits in a coat pocket.
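
To put the 20 dB boost in perspective, decibels convert to a linear amplitude factor with the standard relation gain = 10^(dB/20). The short sketch below uses hypothetical helper names and says nothing about how the MVi applies its gain internally.

```python
import math


def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def gain_to_db(factor):
    """Convert a linear amplitude factor back to decibels."""
    return 20 * math.log10(factor)


boost = db_to_gain(20)  # 10.0: a 20 dB boost is ten times the amplitude
```

This is why low-output dynamic and ribbon microphones benefit from such a boost: their signal is multiplied tenfold before any further processing.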

Already, radio announcers and professional voice over artists are using the MVi on the road as a portable interface. The ease of using automatic one-button compression and EQ is a major selling point.

Since recording moved away from studios, the technology has become not only easier to use but much less expensive. Already, many companies are giving away excellent plugins with the purchase of their gear and some are offering them as free downloads. It can’t get much cheaper.

Now there is no excuse for bad sound if the user understands how and why to use the technology. But that’s another issue.
