Every time a sound wave moves through air, it carries information in every direction simultaneously — up, down, left, right, in front, behind. Conventional microphones throw most of that information away. A cardioid hears forward; a figure-of-eight hears left and right; even a stereo pair only maps a flat horizontal slice of the world.
Ambisonics refuses that bargain. It is a full-sphere capture and reproduction philosophy developed in the 1970s by British mathematician Michael Gerzon and his colleagues at the University of Oxford, built on the insight that any sound field can be decomposed into a set of mathematical functions — spherical harmonics — that together describe sound arriving from every direction at once.
"Instead of recording a pair of perspectives, ambisonic microphones capture the acoustic world as a complete, rotatable object — a sound scene you can walk around after the fact."
This seemingly abstract idea has become the backbone of spatial audio in virtual reality, game audio, documentary filmmaking, scientific acoustics research, and immersive music production. And the hardware that makes it possible is startlingly compact.
The tetrahedral array
At the heart of ambisonic capture is a microphone geometry you will not find anywhere else in recording: four sub-cardioid capsules arranged at the vertices of a regular tetrahedron, packed into a sphere roughly the size of a golf ball. Each capsule looks outward at a different quadrant of the three-dimensional space around it, and their combined signals can be mathematically combined to simulate any directional pattern — a cardioid pointing anywhere, an omnidirectional pickup, a figure-of-eight aimed at the ceiling.
The four raw capsule signals (often called A-format) are converted through a matrixing process into B-format, the four standard channels that define first-order ambisonics: W (omnidirectional), X (front-to-back figure-of-eight), Y (left-to-right), and Z (vertical). These four channels together represent the complete first-order spherical harmonic decomposition of the sound field at the recording point.
Orders of magnitude: from FOA to HOA
First-order ambisonics (FOA) uses four channels and delivers adequate spatial resolution for stereo and 5.1 surround work, though it begins to strain when you push into full three-dimensional reproduction. Higher-order ambisonics (HOA) addresses this by adding more spherical harmonics — and more channels — to capture progressively finer angular detail.
Sufficient for stereo and 5.1. Consumer-friendly; the Zoom H3-VR, Sennheiser AMBEO, and Core Sound TetraMic all operate here.
Noticeably sharper imaging. Suitable for professional VR and immersive theatre.
Research and elite production tier. The mh Acoustics em64 Eigenmike captures at this level, with current encoders reaching 5th-order capture.
Each jump in order adds a new layer of spherical harmonics. Going from first to second adds five channels; from second to third adds seven more, totalling sixteen. The spatial precision grows with every step, expanding what researchers call the "sweet area" — the physical zone around the listener where the reproduction holds together — from a narrow sweet spot into a region large enough to contain a moving head or even multiple listeners.
Why ambisonic matters now
Spatial audio has moved from specialist curiosity to mainstream expectation with remarkable speed. YouTube 360, Meta Quest, Apple Vision Pro, Dolby Atmos, Sony 360 Reality Audio — all of them involve delivering sound that exists in three-dimensional space relative to the listener's head position. Ambisonics has become the quiet plumbing beneath much of this infrastructure.
The format holds several structural advantages over its competitors. Unlike channel-based formats such as Dolby Atmos, which hardwire a specific loudspeaker layout into the mix, ambisonics stores an encoding of the sound field itself. That encoding can then be decoded for headphones (binaural), for a simple stereo pair, for a hexagonal speaker ring, or for a 48-speaker dome — the same file, rendered appropriately for wherever it lands.
Crucially, because the B-format scene is defined on a sphere, it can be rotated. That single property is what makes ambisonics perfect for VR and 360-degree video: when a viewer turns their head, the audio scene turns with them, preserving the correspondence between what they see and what they hear.
The recording workflow in practice
The ambisonic pipeline feels unfamiliar at first but becomes intuitive quickly. You record A-format onto four tracks — either into a dedicated recorder like the Zoom H3-VR or through an interface into your DAW. Before you can mix, those four tracks are converted to B-format using a software encoder specific to your microphone model (different capsule geometries require different conversion matrices). From B-format, you can monitor in binaural through headphones, add additional sound sources spatially, and eventually decode for your target playback system.
Plugins such as the Meta 360 Spatial Workstation (free), IEM Plugin Suite (free and open-source), and Sennheiser's AMBEO Orbit handle different parts of this chain. Several now include head-tracker support so you can audition a 360-degree mix by physically turning your head while wearing supported headphones.
Key field recording considerations
In practice, ambisonic field recording rewards the same habits as any location sound work — but with a few spatial-specific priorities. Because the microphone is capturing all directions equally, wind protection matters enormously: a gust that would merely be an annoyance on a cardioid is a gust from everywhere on a tetrahedral array. Invest in a proper windshield, and be mindful that self-noise in all four capsules combines in the A-to-B conversion.
Height matters in a way it simply does not with conventional stereo. The acoustic ceiling of a forest clearing, rain arriving from directly overhead, the whump of a helicopter from above — all of these live in the Z channel and give ambisonic recordings a sense of three-dimensional physical presence that flat stereo cannot replicate.