The Sponsors Perspective: An Ambisonics Primer

Sennheiser examines the theory, implementation, and uses of the Ambisonic soundfield, and its important role in the immersive audio world.


This article was first published as part of Essential Guide: Immersive Audio Pt 2 - Immersive Audio Compatibility

Ambisonics is probably the original speaker-agnostic immersive format, and it’s been waiting a while for everyone to catch up. If you’re familiar with the Mid-Side microphone technique, that gives you an idea of how this format works: in those terms, ‘first-order’ Ambisonics is essentially a central omni-directional ‘mid’ or pressure component (W), plus three different ‘side’ figure-of-eights, orientated front-back (X), left-right (Y), and up-down (Z). These four signals make up the so-called ‘B-Format’ first-order Ambisonic format.
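To make the four components concrete, here is a minimal sketch of panning a single mono sample into first-order B-Format. The function name and the angle convention (azimuth anticlockwise from straight ahead, elevation upwards from the horizontal) are assumptions for illustration, as is the traditional 1/√2 gain on W; other normalisations scale the components differently.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Pan one mono sample into first-order B-Format (W, X, Y, Z).

    Assumed convention: azimuth is anticlockwise from straight ahead,
    elevation is upwards from the horizontal, and W carries a 1/sqrt(2) gain.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                   # omni 'mid' component
    x = sample * math.cos(az) * math.cos(el)      # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)      # left-right figure-of-eight
    z = sample * math.sin(el)                     # up-down figure-of-eight
    return w, x, y, z

# A source straight ahead lands only in W and X; one hard left only in W and Y.
front = encode_first_order(1.0, 0.0)
left = encode_first_order(1.0, 90.0)
```

Note that no speaker position appears anywhere in the encoding, which is what keeps the format speaker-agnostic.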

This is not an object-based format like Atmos. In fact, if you tried to split Ambisonics up into individual objects with position you would defeat one of its most useful features. All components, together, form the entire soundfield and are, as such, inseparable. However, it is a speaker-agnostic immersive format as it does describe a full 360-degree sound field without referencing speaker positions.

Because of the way this format stores the soundfield, it can easily be ‘decoded’ into any type of speaker set-up or number of speakers, and panning and effects can be implemented directly in B-format. That maintains its speaker-agnostic status and explains its starring role in the upcoming 360-degree video boom, especially with live head tracking, which enables audio sources to effectively remain static in the space while the video reflects the viewing angle. It can also be relatively easily encoded with environment and/or HRTF information at the replay end if required, to enhance the soundfield for headphones (see Essential Guide “Immersive Audio – Part 1” on binaural audio and the personalised HRTF).
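Both ideas can be sketched in a few lines for the first-order case. The example below rotates a B-Format frame about the vertical axis to compensate for head yaw, then derives speaker feeds by pointing a virtual cardioid at each speaker. All function names are hypothetical, the angle convention (anticlockwise from straight ahead) and the 1/√2 gain on W are assumptions, and the virtual-cardioid decode is only one very basic option, not an optimised decoder.

```python
import math

def rotate_yaw(w, x, y, z, angle_deg):
    """Rotate a first-order B-Format frame anticlockwise about the vertical axis."""
    a = math.radians(angle_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    # W (omni) and Z (up-down) are unaffected by a turn about the vertical axis.
    return w, x_rot, y_rot, z

def compensate_head_yaw(w, x, y, z, head_yaw_deg):
    """Counter-rotate the field so sources stay fixed in the room as the head turns."""
    return rotate_yaw(w, x, y, z, -head_yaw_deg)

def decode_horizontal(w, x, y, speaker_azimuths_deg):
    """Feed each speaker with a virtual cardioid aimed at it.

    Assumes W carries a 1/sqrt(2) gain; Z is ignored for an ear-height ring.
    """
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        feeds.append(0.5 * (math.sqrt(2.0) * w + x * math.cos(az) + y * math.sin(az)))
    return feeds

# A source straight ahead, heard after the listener turns 90 degrees to the left,
# should now appear on the listener's right.
frame = (1 / math.sqrt(2.0), 1.0, 0.0, 0.0)
w, x, y, z = compensate_head_yaw(*frame, 90.0)
square = decode_horizontal(w, x, y, [45.0, 135.0, -135.0, -45.0])
```

The same B-Format frame can be passed to `decode_horizontal` with any list of speaker angles, which is the practical meaning of decoding to an arbitrary layout.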

The Sennheiser AMBEO A-B converter for transforming A-Format channels into B-Format Ambisonics components.


Getting High

Higher Order Ambisonics (HOA), meaning anything above 1st order, effectively increases the number of ‘sides’ in our virtual Ambisonic microphone (except they’re no longer figure-of-eights), using a mathematical idea termed spherical harmonics. As you work your way up the ‘orders’ of Ambisonics, effective resolution of the sound field increases, the sweet-spot gets bigger, and the number of channels required goes up too: for second-order Ambisonics you need nine channels, for third-order you need 16.
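The channel counts quoted above follow a simple rule for full-sphere Ambisonics: each order n adds 2n + 1 spherical harmonic components, so order N needs (N + 1)² channels in total. A small sketch (the function names are illustrative only):

```python
def components_added_by_order(order):
    """Number of new spherical harmonic components introduced at a given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return 2 * order + 1

def ambisonic_channel_count(order):
    """Total channels for full-sphere Ambisonics up to and including 'order'."""
    return sum(components_added_by_order(n) for n in range(order + 1))

for order in (1, 2, 3):
    print(order, ambisonic_channel_count(order))
```

This gives 4, 9, and 16 channels for first, second, and third order respectively.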

You don’t need a microphone to create and work with an Ambisonic sound field. There are plenty of Ambisonic panning and processing tools available for different platforms, DAWs, phones, and so on, including headphone encoders for working on Ambisonics when you don’t have the luxury of lots of speakers, along with head tracking options so you can effectively monitor your head-tracking-enabled VR 360 mixes.

Ambisonic audio is specified as an option for both MPEG-H Audio and for DTS-UHD, and therefore can be part of DVB-MPEG/UHD or ATSC 3.0 broadcasts.

Slightly confusingly, standard formats and files for higher order Ambisonics are rather fraught with variations, mainly because there are different options for the derivation and ordering of the spherical harmonic components. The main sequences are ACN and Furse-Malham (FuMa). ACN starts with WYZX for 1st order while FuMa starts with WXYZ. It’s important to be aware of the potential for mixing up the order, which will definitely lead to a disappointing, or disorientating, Ambisonic experience. There are also different options for the normalisation of those components, such as maxN (for FuMa ordering), SN3D, N3D, and more. Of the proposed file formats, AmbiX seems to be the most popular option and is scalable to any order. It uses ACN ordering, SN3D normalisation, and the Core Audio Format (.caf) container.
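As an illustration of how the two conventions relate, the sketch below converts a single first-order frame from FuMa to AmbiX (ACN ordering, SN3D normalisation). It is limited to first order, where the only gain difference is that FuMa carries W 3 dB down (a factor of 1/√2); higher orders need further per-channel gains that are not shown here. The function names are hypothetical.

```python
import math

def acn_index(order, degree):
    """ACN channel index for a spherical harmonic of order n and degree m."""
    if abs(degree) > order:
        raise ValueError("degree must lie between -order and +order")
    return order * (order + 1) + degree

def fuma_to_ambix_first_order(w, x, y, z):
    """Convert one first-order FuMa frame (W, X, Y, Z) to AmbiX order (W, Y, Z, X).

    First order only: W is scaled up by sqrt(2) to undo the FuMa 1/sqrt(2) gain,
    while the X, Y, and Z gains are left as they are and only re-ordered.
    """
    return w * math.sqrt(2.0), y, z, x

# ACN places Y, Z, X at indices 1, 2, 3 (degrees -1, 0, +1 of order 1).
first_order_acn = [acn_index(1, m) for m in (-1, 0, 1)]
```

Feeding a FuMa stream to an AmbiX decoder without this step would swap the X, Y, and Z axes, which is exactly the disorientating result described above.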

YouTube and Facebook now support 360 video and Ambisonic audio, and there is in fact a free software suite called Facebook 360 Spatial Workstation for designing spatial audio for Facebook, which is also compatible with YouTube 360 spatial audio metadata. YouTube’s encoding process specifies the Spatial Media Metadata Injector.

Ambisonic Microphones

The standard way of recording Ambisonics has always been a tetrahedral array of cardioid capsules. This was first seen in the Soundfield Microphone, brought to market in the 70s by Calrec. More recently, a good number of tetrahedral array mics have come to market, made economically viable by the upsurge of interest in immersive audio and probably, in particular, the 360 video trend.

The Sennheiser AMBEO VR microphone for Ambisonic recording.


The raw audio from a tetrahedral array of cardioid microphones is normally termed ‘A-format’. This can then be transformed into the B-Format 1st-order Ambisonic components of W, X, Y, and Z.
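In its idealised form, that transformation is just a set of sums and differences of the four capsule signals. The sketch below assumes a common capsule arrangement (front-left-up, front-right-down, back-left-down, back-right-up) and a 0.5 scaling factor; the names are hypothetical, and it deliberately omits the frequency-dependent correction that real converters apply to compensate for the capsules not being coincident. It is not a description of any particular manufacturer's converter.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Idealised A-Format to first-order B-Format conversion for one sample.

    Assumed capsule layout: FLU = front-left-up, FRD = front-right-down,
    BLD = back-left-down, BRU = back-right-up. No correction filtering applied.
    """
    w = 0.5 * (flu + frd + bld + bru)   # sum of all capsules: omni pressure
    x = 0.5 * (flu + frd - bld - bru)   # front pair minus back pair
    y = 0.5 * (flu - frd + bld - bru)   # left pair minus right pair
    z = 0.5 * (flu - frd - bld + bru)   # upper pair minus lower pair
    return w, x, y, z

# Signal present equally on the two front capsules only: the result has
# positive X (front) and no left-right or up-down content.
w, x, y, z = a_to_b_format(1.0, 1.0, 0.0, 0.0)
```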

The Sennheiser AMBEO VR microphone is one such product and fits into the Sennheiser AMBEO immersive technology landscape along with products like the free AMBEO Orbit plug-in for mixing various sources into binaural audio, plug-ins from its partner in VR, Dear Reality, the Neumann KU 100 dummy head microphone, and - for the end-user - the high-end Sennheiser AMBEO Soundbar.

The AMBEO VR microphone uses four matched KE 14 capsules and outputs four corresponding audio channels for the A-Format feed. It also comes with the A-B converter tool for getting the A-Format signal into a DAW in B-Format with various adjustments, such as FuMa or AmbiX ordering / normalisation, microphone position, and filters.

Ambisonic Potential

The rise of Ambisonics has been a long time coming. The fact that people are waking up to the advantages of speaker-agnostic immersive audio, and that the consumer now has the technology and every opportunity to experience it in many convenient forms, is driving this boost.

It fits very nicely into the grand immersive scheme along with object-based audio, channel-based beds with height, and binaural audio for headphones, which is why it’s included in the MPEG-H Audio and DTS-UHD specs. A-format capture is well-suited to encoding into a channel-based bed as well, so even if you didn’t want to include the raw Ambisonic channels, the techniques and technology can be the basis of a high-quality ambience feed for sports broadcast and so on.

Ambisonics should be a valuable part of your immersive audio toolbox. 
