Spatial Audio Deep-Dive

Inside the
Sound Sphere

A complete guide to ambisonic recording — how a four-capsule microphone captures an entire acoustic world, and why that changes everything for spatial audio production.

Axis Ambisonic Lab  ·  202512 min readSpatial audio  ·  Field recording
FU (front-up) BL (back-left) BR (back-right) FD (front-down) W (omni)

Tetrahedral capsule arrangement of a first-order ambisonic microphone — four sub-cardioids pointed toward the vertices of an imaginary tetrahedron, encoding the full sound sphere.

Every time a sound wave moves through air, it carries information in every direction simultaneously — up, down, left, right, in front, behind. Conventional microphones throw most of that information away. A cardioid hears forward; a figure-of-eight hears left and right; even a stereo pair only maps a flat horizontal slice of the world.

Ambisonics refuses that bargain. It is a full-sphere capture and reproduction philosophy developed in the 1970s by British mathematician Michael Gerzon and his colleagues at the University of Oxford, built on the insight that any sound field can be decomposed into a set of mathematical functions — spherical harmonics — that together describe sound arriving from every direction at once.

"Instead of recording a pair of perspectives, ambisonic microphones capture the acoustic world as a complete, rotatable object — a sound scene you can walk around after the fact."

This seemingly abstract idea has become the backbone of spatial audio in virtual reality, game audio, documentary filmmaking, scientific acoustics research, and immersive music production. And the hardware that makes it possible is startlingly compact.

The tetrahedral array

At the heart of ambisonic capture is a microphone geometry you will not find anywhere else in recording: four sub-cardioid capsules arranged at the vertices of a regular tetrahedron, packed into a sphere roughly the size of a golf ball. Each capsule looks outward at a different quadrant of the three-dimensional space around it, and their combined signals can be mathematically combined to simulate any directional pattern — a cardioid pointing anywhere, an omnidirectional pickup, a figure-of-eight aimed at the ceiling.

The four raw capsule signals (often called A-format) are converted through a matrixing process into B-format, the four standard channels that define first-order ambisonics: W (omnidirectional), X (front-to-back figure-of-eight), Y (left-to-right), and Z (vertical). These four channels together represent the complete first-order spherical harmonic decomposition of the sound field at the recording point.

Orders of magnitude: from FOA to HOA

First-order ambisonics (FOA) uses four channels and delivers adequate spatial resolution for stereo and 5.1 surround work, though it begins to strain when you push into full three-dimensional reproduction. Higher-order ambisonics (HOA) addresses this by adding more spherical harmonics — and more channels — to capture progressively finer angular detail.

4
First order (FOA)
W, X, Y, Z

Sufficient for stereo and 5.1. Consumer-friendly; the Zoom H3-VR, Sennheiser AMBEO, and Core Sound TetraMic all operate here.

9
Second order
4 + 5 additional

Noticeably sharper imaging. Suitable for professional VR and immersive theatre.

64
Fifth order (HOA)
36 channels

Research and elite production tier. The mh Acoustics em64 Eigenmike captures at this level, with current encoders reaching 5th-order capture.

Each jump in order adds a new layer of spherical harmonics. Going from first to second adds five channels; from second to third adds seven more, totalling sixteen. The spatial precision grows with every step, expanding what researchers call the "sweet area" — the physical zone around the listener where the reproduction holds together — from a narrow sweet spot into a region large enough to contain a moving head or even multiple listeners.

Why ambisonic matters now

Spatial audio has moved from specialist curiosity to mainstream expectation with remarkable speed. YouTube 360, Meta Quest, Apple Vision Pro, Dolby Atmos, Sony 360 Reality Audio — all of them involve delivering sound that exists in three-dimensional space relative to the listener's head position. Ambisonics has become the quiet plumbing beneath much of this infrastructure.

The format holds several structural advantages over its competitors. Unlike channel-based formats such as Dolby Atmos, which hardwire a specific loudspeaker layout into the mix, ambisonics stores an encoding of the sound field itself. That encoding can then be decoded for headphones (binaural), for a simple stereo pair, for a hexagonal speaker ring, or for a 48-speaker dome — the same file, rendered appropriately for wherever it lands.

Crucially, because the B-format scene is defined on a sphere, it can be rotated. That single property is what makes ambisonics perfect for VR and 360-degree video: when a viewer turns their head, the audio scene turns with them, preserving the correspondence between what they see and what they hear.

The recording workflow in practice

The ambisonic pipeline feels unfamiliar at first but becomes intuitive quickly. You record A-format onto four tracks — either into a dedicated recorder like the Zoom H3-VR or through an interface into your DAW. Before you can mix, those four tracks are converted to B-format using a software encoder specific to your microphone model (different capsule geometries require different conversion matrices). From B-format, you can monitor in binaural through headphones, add additional sound sources spatially, and eventually decode for your target playback system.

Plugins such as the Meta 360 Spatial Workstation (free), IEM Plugin Suite (free and open-source), and Sennheiser's AMBEO Orbit handle different parts of this chain. Several now include head-tracker support so you can audition a 360-degree mix by physically turning your head while wearing supported headphones.

Key field recording considerations

In practice, ambisonic field recording rewards the same habits as any location sound work — but with a few spatial-specific priorities. Because the microphone is capturing all directions equally, wind protection matters enormously: a gust that would merely be an annoyance on a cardioid is a gust from everywhere on a tetrahedral array. Invest in a proper windshield, and be mindful that self-noise in all four capsules combines in the A-to-B conversion.

Height matters in a way it simply does not with conventional stereo. The acoustic ceiling of a forest clearing, rain arriving from directly overhead, the whump of a helicopter from above — all of these live in the Z channel and give ambisonic recordings a sense of three-dimensional physical presence that flat stereo cannot replicate.

Applications

Where ambisonics is being used right now

  • 01
    Virtual reality & 360° videoYouTube 360, Meta Quest, and VR cinema all support ambisonic soundtracks. The B-format scene rotates with head-tracking so sound and image stay locked together.
  • 02
    Game audioUnity and Unreal Engine both ship with native ambisonic support. Game sound designers use HOA to create spatially accurate environments where footsteps, reverb, and ambient texture all respond to player perspective.
  • 03
    Documentary & immersive filmLocation recording with a tetrahedral mic alongside a boom preserves the full acoustic character of a place — the hum of a factory, the echo of a cathedral — for decode into any delivery format.
  • 04
    Immersive musicArtists and producers are using ambisonics to place instruments in three-dimensional space — above, behind, sweeping overhead — for Dolby Atmos and 360 Reality Audio releases.
  • 05
    Acoustics research & soundscape scienceResearchers in psychology, environmental acoustics, and urban planning use ambisonic recordings to recreate outdoor soundscapes accurately in laboratory listening experiments.

A working guide to common ambisonic microphones

The market for ambisonic microphones has matured considerably. You can now choose hardware calibrated for field documentary work, studio immersive production, or high-order research capture — often at strikingly different price points.

Microphone Order Best suited for Notes
Zoom H3-VR FOA Run-and-gun field work, VR video Records 360° ambisonic and stereo simultaneously. Compact, affordable, battery-powered — the go-to for solo field recordists.
Sennheiser AMBEO VR Mic FOA Professional location & studio High capsule consistency and low self-noise. Widely used in broadcast and documentary. Requires external recorder or interface.
Core Sound TetraMic FOA Field recording, research Compact tetrahedral design. Beloved by the field recording community. Requires external preamps and A-to-B software.
Zylia ZM-1 HOA 3rd Immersive music, live capture 19-capsule spherical array. Connects via USB. Good entry point into higher-order work without a full research budget.
mh Acoustics em64 Eigenmike HOA 5th Research, elite production 64 capsules on a rigid sphere. The reference standard for HOA research and high-end immersive production. Significant investment.

The open-source advantage

One reason ambisonics has spread so widely outside the professional studio world is that it is, at its mathematical core, an open standard. Unlike Dolby Atmos — a patented format dominant in mainstream film and music — the ambisonic signal chain from recording through decoding can be implemented entirely with free, open-source tools.

The IEM Plugin Suite, developed at the Institute of Electronic Music and Acoustics in Graz, provides a complete HOA production environment at no cost. The Ambisonic Toolkit (ATK) covers composition and research workflows. The B-format format itself is unencumbered by licensing. This openness has made ambisonics the preferred format in the video game industry and in academic research, where the ability to examine, modify, and share the full signal chain matters.

What AI is beginning to change

Recent research has introduced neural networks into the ambisonic pipeline. Neural ambisonic encoding can now handle microphone arrays of arbitrary geometry — not just perfectly symmetric spheres — and convert their signals into spherical harmonic representations. This matters for consumer devices: the microphone arrays built into smartphones, laptops, and AR glasses are almost never arranged in a clean tetrahedron. Learned encoding offers a path to capturing spatial audio from hardware that was never designed for it.

AI is also being applied to ambisonic source separation (isolating individual sound sources within a captured scene) and to personalized binaural rendering, where HRTF models are adapted to individual listeners' ear geometry for more accurate headphone reproduction. These developments suggest that the next decade of spatial audio will be defined less by new microphone hardware than by the intelligence applied to the signals those microphones already capture.

The fundamental insight of Gerzon's work — that a sound field is a mathematical object, not a fixed set of perspectives — has proven more durable than any particular piece of hardware.

Getting started at Axis Ambisonic Lab

Whether you are coming to ambisonics from field recording, from game audio, from documentary filmmaking, or from pure curiosity about the physics of sound, the entry point is more accessible than it has ever been. A Zoom H3-VR and a free copy of the IEM Suite will take you from recording a forest clearing to delivering a fully decoded binaural mix in an afternoon.

Higher-order work demands more — more channels, more processing, more careful thinking about what you want the listener to experience — but the payoff is a representation of acoustic space that approaches the fidelity of being there. That is the project of Axis Ambisonic Lab: helping recordists, researchers, and producers understand and work with sound in its fullest, most spatial form.

The sound sphere is waiting. Step inside it.

Ready to record in three dimensions?

Explore tutorials, listening examples, and gear reviews at Axis Ambisonic Lab — your home for serious spatial audio craft.

Visit the Lab →