The Sponsors Perspective: An Ambisonics Primer

Sennheiser examines the theory, implementation, and uses of the Ambisonic soundfield, and its important role in the immersive audio world.

This article was first published as part of Essential Guide: Immersive Audio Pt 2 - Immersive Audio Compatibility

Ambisonics is probably the original speaker-agnostic immersive format, and it’s been waiting a while for everyone to catch up. If you’re familiar with the Mid-Side microphone technique, that gives you an idea of how this format works - in those terms, ‘first-order’ Ambisonics is essentially a central omni-directional ‘mid’ or pressure component (W), plus three ‘side’ figure-of-eights orientated front-back (X), left-right (Y), and up-down (Z). These four signals make up the so-called ‘B-Format’ first-order Ambisonic format.
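As a rough illustration of how those four components relate to a source direction, here is a minimal Python sketch that pans one mono sample into first-order B-Format. It assumes the traditional convention of azimuth measured anticlockwise from the front and a W component weighted by 1/√2; the function name `encode_first_order` is purely illustrative and is not taken from any particular tool.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Return (W, X, Y, Z) for one mono sample placed at the given direction.

    Assumed convention: azimuth 0 is straight ahead, positive azimuth is
    to the left, positive elevation is upwards.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omni 'mid' component (assumed -3 dB weighting)
    x = sample * math.cos(az) * math.cos(el)  # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)  # left-right figure-of-eight
    z = sample * math.sin(el)                 # up-down figure-of-eight
    return w, x, y, z
```

A source straight ahead ends up only in W and X, while a source hard left ends up only in W and Y, which is the Mid-Side idea extended to three axes.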

This is not an object-based format like Atmos. In fact, if you tried to split Ambisonics up into individual objects with positions you would defeat one of its most useful features. All components, together, form the entire soundfield and are, as such, inseparable. However, it is a speaker-agnostic immersive format, as it describes a full 360-degree soundfield without referencing speaker positions.

Because of the way this format stores the soundfield, it can easily be ‘decoded’ to any type of speaker set-up with any number of speakers, and panning and effects can be implemented directly in B-format. That maintains its speaker-agnostic status and explains its starring role in the upcoming 360-degree video boom - especially with live head tracking, which enables audio sources to effectively remain static in the space while the video reflects the viewing angle. It can also be relatively easily processed with environment and/or HRTF information at the replay end if required to enhance the soundfield for headphones (see Essential Guide “Immersive Audio – Part 1” on binaural audio and the personalised HRTF).
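To give a feel for why head tracking is so cheap in B-Format, the following Python sketch counter-rotates a first-order frame about the vertical axis. It assumes azimuth and head yaw are both measured anticlockwise (positive to the left), and `compensate_head_yaw` is a hypothetical helper name; a real renderer would also handle pitch and roll.

```python
import math

def compensate_head_yaw(w, x, y, z, head_yaw_deg):
    """Rotate a first-order frame so sources stay fixed in the room.

    When the listener turns by head_yaw_deg, the soundfield is rotated
    by the opposite angle. W (omni) and Z (vertical) are unaffected by
    a rotation about the vertical axis.
    """
    phi = math.radians(head_yaw_deg)
    c, s = math.cos(phi), math.sin(phi)
    x_rot = x * c + y * s
    y_rot = y * c - x * s
    return w, x_rot, y_rot, z
```

With a source straight ahead, turning the head 90 degrees to the left moves the source fully into the negative Y (right-hand) direction, as you would expect. Only two of the four channels are touched, and no knowledge of the replay speakers is needed.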

The Sennheiser AMBEO A-B converter for transforming A-Format channels into B-Format Ambisonics components.

Getting High

Higher Order Ambisonics (HOA) - anything above 1st order - effectively increases the number of ‘sides’ in our virtual Ambisonic microphone (except they’re no longer figure-of-eights) - a mathematical idea termed spherical harmonics. As you work your way up the ‘orders’ of Ambisonics, the effective resolution of the soundfield increases, the sweet spot gets bigger, and the number of channels required goes up too: for second-order Ambisonics you need nine channels, for third-order you need 16.
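Those channel counts follow a simple pattern: a full-sphere Ambisonic signal of order N needs (N + 1)² channels. A short Python sketch makes the growth easy to check (the function name is illustrative only):

```python
def ambisonic_channel_count(order):
    """Number of channels for full-sphere Ambisonics of the given order."""
    if order < 0:
        raise ValueError("Ambisonic order must be zero or greater")
    return (order + 1) ** 2

for order in (1, 2, 3):
    print(order, ambisonic_channel_count(order))
# prints:
# 1 4
# 2 9
# 3 16
```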

You don’t need a microphone to create and work with an Ambisonic sound field. There are plenty of Ambisonic panning and processing tools available for different platforms, DAWs, phones, and so on, including headphone encoders for working on Ambisonics when you don’t have the luxury of lots of speakers, along with head tracking options so you can effectively monitor your head-tracking-enabled VR 360 mixes.

Ambisonic audio is specified as an option for both MPEG-H Audio and for DTS-UHD, and therefore can be part of DVB-MPEG/UHD or ATSC 3.0 broadcasts.

Slightly confusingly, standard formats and files for higher order Ambisonics are rather fraught with variations, mainly because there are different options for the derivation and ordering of the spherical harmonic components. The main sequences are ACN and Furse-Malham (FuMa). ACN starts with WYZX for 1st order while FuMa starts with WXYZ. It’s important to be aware of the potential for mixing up the order, which will definitely lead to a disappointing, or disorientating, Ambisonic experience. There are also different options for the normalisation of those components, such as maxN (for FuMa ordering), SN3D, N3D, and more. Of the proposed file formats, AmbiX seems to be the most popular option and is scalable to any order. It uses ACN ordering, SN3D normalisation, and the Core Audio Format (.caf) container.
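For first order, moving between the two conventions is a small job. The Python sketch below converts one FuMa frame (W, X, Y, Z) to AmbiX layout (ACN order W, Y, Z, X with SN3D normalisation). It assumes the usual FuMa W weighting of 1/√2, so W is scaled up by √2 while X, Y and Z keep their gains; higher orders need additional per-channel factors that are not covered here. The function name is illustrative.

```python
import math

def fuma_to_ambix_first_order(frame):
    """Convert one first-order frame from FuMa (W, X, Y, Z) to AmbiX (W, Y, Z, X).

    Assumes FuMa W carries the conventional 1/sqrt(2) weighting, which
    SN3D does not apply, so W is multiplied by sqrt(2).
    """
    w, x, y, z = frame
    return (w * math.sqrt(2.0), y, z, x)
```

Skipping this step and feeding FuMa channels straight into an AmbiX decoder swaps the axes around, which is exactly the disorientating result to avoid.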

YouTube and Facebook now support 360 video and Ambisonic audio. In fact, there is a free software suite called Facebook 360 Spatial Workstation for designing spatial audio for Facebook, which is also compatible with YouTube 360 spatial audio metadata. YouTube’s encoding process specifies the Spatial Media Metadata Injector.

Ambisonic Microphones

The standard way of recording Ambisonics has always been a tetrahedral array of cardioid capsules. This was first seen in the Soundfield Microphone, brought to market in the 1970s by Calrec. More recently, a good number of tetrahedral array mics have come to market, made economically viable by the upsurge of interest in immersive audio and probably, in particular, the 360 video trend.

The Sennheiser AMBEO VR microphone for Ambisonic recording.

The raw audio from a tetrahedral array of cardioid microphones is normally termed ‘A-format’. This can then be transformed into the B-Format 1st-order Ambisonic components of W, X, Y, and Z.
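In its simplest textbook form, that transformation is just a set of sums and differences of the four capsule signals. The Python sketch below assumes the capsules point front-left-up (FLU), front-right-down (FRD), back-left-down (BLD) and back-right-up (BRU), and uses an arbitrary 0.5 scaling; both are assumptions for illustration. Practical converters also apply frequency-dependent correction filters for capsule spacing, which are omitted here.

```python
def a_to_b(flu, frd, bld, bru):
    """Basic A-Format to first-order B-Format (W, X, Y, Z) matrix for one sample.

    Capsule naming and the 0.5 scale factor are assumptions; no
    correction filtering is applied.
    """
    w = 0.5 * (flu + frd + bld + bru)  # sum of all capsules: omni
    x = 0.5 * (flu + frd - bld - bru)  # front minus back
    y = 0.5 * (flu - frd + bld - bru)  # left minus right
    z = 0.5 * (flu - frd - bld + bru)  # up minus down
    return w, x, y, z
```

Feeding the same signal to all four capsules produces only W, while a signal on the front-left-up capsule alone contributes positively to W, X, Y and Z.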

The Sennheiser AMBEO VR microphone is one such product and fits into the Sennheiser AMBEO immersive technology landscape along with products like the free AMBEO Orbit plug-in for mixing various sources into binaural audio, plug-ins from its partner in VR, Dear Reality, the Neumann KU 100 dummy head microphone, and - for the end-user - the high-end Sennheiser AMBEO Soundbar.

The AMBEO VR microphone uses four matched KE 14 capsules and outputs four corresponding audio channels for the A-Format feed. It also comes with the A-B converter tool for getting the A-Format signal into a DAW in B-Format with various adjustments, such as FuMa or AmbiX ordering / normalisation, microphone position, and filters.

Ambisonic Potential

The rise of Ambisonics has been a long time coming. What is driving this boost is the fact that people are waking up to the advantages of speaker-agnostic immersive audio, and that the consumer now has the technology and every opportunity to experience it in many convenient forms.

It fits very nicely into the grand immersive scheme along with object-based audio, channel-based beds with height, and binaural audio for headphones, which is why it’s included in the MPEG-H Audio and DTS-UHD specs. A-format capture is well-suited to encoding into a channel-based bed as well, so even if you didn’t want to include the raw Ambisonic channels, the techniques and technology can be the basis of a high-quality ambience feed for sports broadcast and so on.

Ambisonics should be a valuable part of your immersive audio toolbox.
