Seizing the Opportunities of Immersive Audio in Storytelling

The stars are aligning for a new era of immersive audio in storytelling. Audiobook sales are steadily growing, the popularity of non-musical audio in personal podcasts is exploding and immersive audio technology is making compelling audio cheaper and easier to produce.

Audiobooks are the very definition of personal aural storytelling. At the Tehran International Book Fair in early May 2017, audiobooks found new popularity. Navaar, an online platform that produces and sells audiobooks, held competitions to select voice actors. Those who qualified received 10 advanced voice training sessions to prepare them to record audiobooks. Most of Navaar’s customers are 25- to 35-year-old males, and they download at least one new book each month.

Simon and Schuster reports that its audiobook division saw a 35 percent increase in sales in the first quarter of 2017, while HarperCollins reported a seven percent increase in audiobook sales this year. These numbers are not one-time aberrations, but the continuation of a gradual trend over the past few years as sales of audio stories climb.

Anthony Goff, Hachette Book Group

Anthony Goff, senior vice president of content development and audio publisher at Hachette Book Group, told Publishers Weekly that his team of 12 people will produce more than 700 titles in 2017. “When I started here in 2003, we were doing somewhere in the range of 50 to 75 titles a year,” he said.

Hachette has two recording studios and is currently adding a smaller studio in-house. “We’ve picked up a new engineer and a producer,” Goff said. “So we’re growing and our staff is growing, but not nearly as much as the title count and the workload.”

Podcasts are also taking off. Ad Age reports that, after little more than a decade of experimentation and maturation, podcasting has hit its stride. With more than 57 million Americans listening monthly (up 23 percent year on year), the potential is strong for podcasting's star to continue rising through content, discovery, monetization and technology.

Podcast listeners get hooked quickly on niche content. Listeners in the UK consume over six hours each week and those in Australia listen to 5.5 hours per week. With only 21 percent of Americans listening monthly, there’s a huge opportunity for growth in podcasting.

Interestingly, television networks are playing a vital role in helping their viewers get wise to podcasts. In part, this is because networks are creating more companion podcasts for their shows. From exclusive interviews to insider commentary, podcasts ensure fans get their fix across reality, drama and current affairs programming.

With the audience growing for personal audio storytelling, the climate is right for an expansion of immersive audio into these stories. Headphone sales continue to climb, which creates the perfect medium for innovative sound techniques. Coupled with new technologies such as Sennheiser’s AMBEO and Dolby’s Atmos, this means the opportunity has never been greater.

Though it’s clear that virtual and augmented reality are the drivers of the latest immersive audio technologies, many in the industry remain skeptical that VR and AR are where the magic is destined to happen. Others think innovative storytelling for audiobooks and podcasts could just as easily take off to become the “killer app.”

Sreejesh Nair, sound mixer.

Sreejesh Nair, a veteran sound mixer in India, has mixed audio on more than 200 feature films. He was part of the first Dolby Atmos mix theater installation in India and the first Dolby Atmos Premiere mix room in the world. He is currently working as a solutions specialist at Avid.

Nair is also an expert at creating compelling sound for storytelling. He has explored emotional responses that people have to audio and found that much of it has to do with the position and proximity of the sound. Nair’s work is with film sound, but it can also extend to audio-only storytelling.

He addressed the subject of reverse engineering emotions in immersive audio last year in a paper given at IBC. Nair’s observations are astute and open the door to a new level of investigation into immersive audio.

“Which is more intimidating: a sound of a twig break or a tiger growl?” Nair asks.

“Usually the response is twig break. This is because of the amount of imagination we put into the context of the twig break and the reason we create for it based on the environment we are in at that moment. We can identify a tiger growl. But since the twig break can have multiple causes for it, we recognize fear more.”

Nair poses a second question: “Where would be scarier: in front or in the surrounds? The majority of the response was inside the surround. The reason for this is based on episodic memory because a sound without a known source would be more intimidating than one we know.”

Dynamics, he said, are vitally important in audio storytelling. “The loudness and the attack of the sound signifies urgency or evokes a response. A variation can break what is expected, thereby creating the change needed to remove the listener from monotony and break the emotional barrier. There are three ways to create the dynamics: volume, frequency and position.”

The story of the film Bombay Velvet spans a period of 20 years. Nair had the sound evolve from mono to left-center-right, then to 5.1 and 9.1, and finally to Atmos.

“What this allowed us to get was the change in dynamics by varying the audience from becoming first person to second to third. And once establishing a pattern, we broke it by creating the emotional response needed without much need of volume or amplitude based changes,” he said. “The change in perspective was created with positional changes, thereby engaging the audience much deeper into the story emotionally.”

Nair noted there are multiple ways of using immersive sound mixes to tell compelling stories. They can include the senses of position, proximity, reality, dislocation, as well as music and placement.

He said advances in reactive sound experiences, such as those delivered by virtual reality headsets, can only help achieve the emotional response wanted from the audience.

Sennheiser AMBEO Mic System.

The tools for experimenting with immersive sound are becoming affordable and readily available. One example is Sennheiser’s AMBEO microphone system ($1,649.00), which delivers a raw, four-channel output called A-format.

Before A-format audio can be used, it must be converted to another four-channel format, Ambisonics B-format. Sennheiser includes an AMBEO A-B format converter plug-in with the microphone. The plug-in, available in VST, AU and AAX formats, works with any DAW on Apple’s Macintosh or on PCs.

Diving deeper into the technicalities, Ambisonics B-format is a W, X, Y, Z image of the sound field around the mic. W is the sum of all four capsules, an omnidirectional component. X, Y and Z are virtual bidirectional (figure-eight) mic patterns that represent front/back, left/right and up/down. By combining these four signals in different proportions, you can audition any direction from the microphone during Ambisonics B-format playback.
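For readers who want to see the arithmetic, the idea can be sketched in a few lines of Python. This is a simplified illustration only, not Sennheiser’s converter: it assumes a commonly described tetrahedral capsule layout (front-left-up, front-right-down, back-left-down, back-right-up), uses plain sums and differences with no capsule filtering or level normalization, and the function names a_to_b and virtual_mic are hypothetical.

```python
import math


def a_to_b(flu, frd, bld, bru):
    """Turn one sample from four capsules into W, X, Y, Z.

    Assumed capsule order (an illustration, not a vendor spec):
    front-left-up, front-right-down, back-left-down, back-right-up.
    """
    w = flu + frd + bld + bru   # sum of all four capsules (omni)
    x = flu + frd - bld - bru   # front minus back
    y = flu - frd + bld - bru   # left minus right
    z = flu - frd - bld + bru   # up minus down
    return w, x, y, z


def virtual_mic(w, x, y, z, azimuth_deg, elevation_deg, pattern=0.5):
    """Audition one direction by mixing W with the figure-eight signals.

    pattern=1.0 gives an omni pickup, 0.0 a figure-eight, and 0.5 a
    cardioid-like blend. Assumes W and X/Y/Z share the same scaling.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Unit vector pointing where the virtual microphone "looks".
    dx = math.cos(az) * math.cos(el)
    dy = math.sin(az) * math.cos(el)
    dz = math.sin(el)
    return pattern * w + (1.0 - pattern) * (dx * x + dy * y + dz * z)


# A sound reaching only the two front capsules.
w, x, y, z = a_to_b(1.0, 1.0, 0.0, 0.0)
front = virtual_mic(w, x, y, z, azimuth_deg=0, elevation_deg=0)
back = virtual_mic(w, x, y, z, azimuth_deg=180, elevation_deg=0)
print(front, back)
```

In this toy case the virtual microphone aimed at the front picks the sound up at full level, while the one aimed at the back cancels it almost entirely. That is the sense in which a B-format recording can be “re-aimed” after the session.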

Neumann KU 100 Dummy Head System.

Of course, super-realistic audio has been around a long time, and many have used it creatively. Neumann’s KU 100 Dummy Head system ($7,999.95) is a stereo microphone that resembles the human head. The design places two condenser elements inside the ears of the head in order to replicate human hearing and achieve authentic, true-to-life stereo imagery and field perception.

The KU 100 runs on +48V phantom power or on battery power from six 1.5V AA batteries. For headphone-only binaural listening and for speakers, it provides incredibly vivid imagery with simple post-production.

Many options, at varying prices, are available for compelling audio in storytelling. The user’s creativity is the only limiting factor. Seize the moment. The timing has never been better to stretch the boundaries of audio storytelling.
