WAVE-FORMAT-EXTENSIBLE AND B-FORMAT AUDIO

While it will always be possible to store B-format ambisonic audio in a generic multi-channel file (to be played through any suitable soundcard), the development of WAVE-FORMAT-EXTENSIBLE ('WAVE-EX') has created an opportunity to define a proper, distinct file format for Ambisonically-encoded audio. This will hopefully encourage developers to support B-format audio in Audio Workstations and related applications, which in turn should lead to wider exploitation of Ambisonics by composers and audio developers.
 

The format definition below should be read in conjunction with the Microsoft document detailing WAVE-FORMAT-EXTENSIBLE.

The WAVE-EX format allows new 'Subtype' Globally Unique IDentifiers (GUIDs) to be defined (by anyone) for custom soundfile formats. It is appropriate to use this for B-format, since it is reasonable to send such data directly to a soundcard, e.g. to an external B-Format decoder, or to a software 'plugin' decoder. The B-format signals, while not strictly speaking speaker feeds, are nevertheless normal audio signals, and can reasonably be processed in the real-time streaming environment for which WAVE-EX was created.

There are two B-Format GUIDs, for integer and floating-point word formats:

Integer format:
SUBTYPE_AMBISONIC_B_FORMAT_PCM
{00000001-0721-11d3-8644-C8C1CA000000}

Floating-point format:
SUBTYPE_AMBISONIC_B_FORMAT_IEEE_FLOAT
{00000003-0721-11d3-8644-C8C1CA000000}

The four B-format signals are interleaved for each sample frame in the order W,X,Y,Z.
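
As an illustration of the frame layout just described, the following is a minimal sketch that interleaves four separate (planar) signal buffers into W,X,Y,Z sample frames. The function name bformat_interleave4 and the use of float samples are assumptions made for the example; they are not part of the format definition.

```c
#include <stddef.h>

/* Interleave four planar B-format signals into sample frames in the
   order W,X,Y,Z. 'out' must have room for 4 * nframes samples.
   (bformat_interleave4 is a hypothetical helper name.) */
void bformat_interleave4(const float *w, const float *x,
                         const float *y, const float *z,
                         float *out, size_t nframes)
{
    for (size_t i = 0; i < nframes; i++) {
        out[4 * i + 0] = w[i];   /* W: omnidirectional component */
        out[4 * i + 1] = x[i];   /* X */
        out[4 * i + 2] = y[i];   /* Y */
        out[4 * i + 3] = z[i];   /* Z: height component */
    }
}
```

The same pattern would extend to other channel counts by changing the frame width and the per-frame order.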

If the extended six-channel B-Format is used, the U and V signals will occupy the fifth and sixth slots: W,X,Y,Z,U,V.

If horizontal-only B-format is to be represented, a three- or five-channel file will suffice, with signals interleaved as W,X,Y (first order), or W,X,Y,U,V (second order). However, four- and six-channel files are also acceptable, with the Z channel empty. Higher-order configurations are possible in theory, but are not addressed here. A decoder program should either 'degrade gracefully', or reject formats it cannot handle.
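
One way a reader program might apply these rules is to map the channel count in the header to a signal layout, and reject anything else. The sketch below assumes exactly the four configurations listed above; the enum and function names are hypothetical.

```c
/* Signal layouts described in the text (names are illustrative only). */
typedef enum {
    BFMT_UNSUPPORTED = 0,
    BFMT_WXY,       /* 3 channels: first order, horizontal-only  */
    BFMT_WXYZ,      /* 4 channels: first order, with Z slot      */
    BFMT_WXYUV,     /* 5 channels: second order, horizontal-only */
    BFMT_WXYZUV     /* 6 channels: extended, with Z slot         */
} bformat_layout;

/* Map a channel count to a layout; any other count is rejected. */
bformat_layout bformat_layout_from_channels(unsigned nchannels)
{
    switch (nchannels) {
    case 3:  return BFMT_WXY;
    case 4:  return BFMT_WXYZ;
    case 5:  return BFMT_WXYUV;
    case 6:  return BFMT_WXYZUV;
    default: return BFMT_UNSUPPORTED;
    }
}

/* Returns 1 if the layout has a Z slot (which may still be empty). */
int bformat_layout_has_z(bformat_layout layout)
{
    return layout == BFMT_WXYZ || layout == BFMT_WXYZUV;
}
```

A decoder that only handles first order could still accept BFMT_WXYUV or BFMT_WXYZUV and ignore U and V, which is one possible reading of 'degrade gracefully'.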

For all B-format configurations, the dwChannelMask field should be set to zero.
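
The sketch below shows how the header fields might be filled in for a B-format file, with dwChannelMask set to zero. The field names follow Microsoft's WAVEFORMATEXTENSIBLE, but the struct here is a flattened, portable stand-in written for this example, and bformat_fill_fields is a hypothetical helper. It assumes the sample container size is a whole number of bytes and that all bits are valid.

```c
#include <stdint.h>
#include <string.h>

/* Flattened mirror of the WAVEFORMATEXTENSIBLE fields (illustrative).
   Do not write this struct to disk directly: compiler padding and
   byte order must be handled field by field. */
typedef struct {
    uint16_t wFormatTag;          /* 0xFFFE = WAVE_FORMAT_EXTENSIBLE  */
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;      /* container size in bits           */
    uint16_t cbSize;              /* 22 bytes of extension follow     */
    uint16_t wValidBitsPerSample;
    uint32_t dwChannelMask;       /* zero for all B-format files      */
    uint8_t  SubFormat[16];       /* Subtype GUID as written to disk  */
} wavex_fields;

/* Fill the fields for a B-format file (hypothetical helper). */
void bformat_fill_fields(wavex_fields *f, uint16_t nchannels,
                         uint32_t srate, uint16_t bits,
                         const uint8_t subformat[16])
{
    f->wFormatTag          = 0xFFFE;
    f->nChannels           = nchannels;
    f->nSamplesPerSec      = srate;
    f->nBlockAlign         = (uint16_t)(nchannels * (bits / 8));
    f->nAvgBytesPerSec     = srate * f->nBlockAlign;
    f->wBitsPerSample      = bits;
    f->cbSize              = 22;
    f->wValidBitsPerSample = bits;   /* assumes all bits are valid */
    f->dwChannelMask       = 0;      /* no speaker positions apply */
    memcpy(f->SubFormat, subformat, 16);
}
```

Leaving dwChannelMask at zero signals that the channels are not assigned to speaker positions, which matches the nature of B-format signals.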

Although the PEAK chunk is, strictly speaking, optional, it is recommended for all B-Format files. Apart from its general utility, it has a special virtue for B-format: applications can determine from the peak value for the Z channel whether the file is indeed full periphonic B-format (with height information), or 'Horizontal-only' (Z channel present but empty).
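
The Z-channel test described above can be sketched as follows. The example assumes the per-channel peak values have already been read from the PEAK chunk into an array, one value per channel in file order; parsing the chunk itself is not shown, and bformat_has_height is a hypothetical name.

```c
/* Decide whether a B-format file carries height information.
   'peaks' holds one peak value per channel, in channel order
   (assumed to have been read from the PEAK chunk already).
   Returns 1 for full periphonic, 0 for horizontal-only. */
int bformat_has_height(const float *peaks, unsigned nchannels)
{
    /* Only the four- and six-channel layouts have a Z slot,
       and it is always the fourth channel (index 3). */
    if (nchannels != 4 && nchannels != 6)
        return 0;
    return peaks[3] != 0.0f;
}
```

A practical implementation might prefer a small threshold to an exact comparison with zero, for example to tolerate dither in the Z channel; that is a design choice outside this definition.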

Reading and writing a GUID.

The GUID is written to a WAVE-EX header as a structure:

typedef struct _GUID
{
    unsigned long        Data1;
    unsigned short       Data2;
    unsigned short       Data3;
    unsigned char        Data4[8];
} GUID;

Thus, the SUBTYPE_AMBISONIC_B_FORMAT_PCM GUID will be written as:
{0x00000001, 0x0721, 0x11d3, {0x86, 0x44, 0xc8, 0xc1, 0xca, 0x00, 0x00, 0x00}}
 

Since the Subtype GUID in a WAVE-EX file follows the usual requirements of a WAVE file, the three numeric elements of this structure are written to disk in little-endian format (least significant bytes at the lower addresses). The remaining eight bytes are written in sequence as for any string.
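
The byte layout described above can be sketched in portable C as follows. The example uses fixed-width types rather than unsigned long and unsigned short, because unsigned long is 64 bits wide on some platforms; the type guid16 and the two helper functions are names invented for this illustration. The field values are those corresponding to the GUID string given earlier for SUBTYPE_AMBISONIC_B_FORMAT_PCM.

```c
#include <stdint.h>
#include <string.h>

/* Fixed-width equivalent of the GUID structure (illustrative name). */
typedef struct {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} guid16;

/* {00000001-0721-11d3-8644-C8C1CA000000} */
static const guid16 AMBISONIC_B_FORMAT_PCM = {
    0x00000001u, 0x0721u, 0x11d3u,
    {0x86, 0x44, 0xc8, 0xc1, 0xca, 0x00, 0x00, 0x00}
};

/* Serialise a GUID into the 16 bytes stored in the file: the three
   numeric fields little-endian, the last eight bytes in sequence.
   Works regardless of the byte order of the host machine. */
void guid_to_bytes(const guid16 *g, uint8_t out[16])
{
    out[0] = (uint8_t)(g->data1 & 0xff);
    out[1] = (uint8_t)((g->data1 >> 8) & 0xff);
    out[2] = (uint8_t)((g->data1 >> 16) & 0xff);
    out[3] = (uint8_t)((g->data1 >> 24) & 0xff);
    out[4] = (uint8_t)(g->data2 & 0xff);
    out[5] = (uint8_t)((g->data2 >> 8) & 0xff);
    out[6] = (uint8_t)(g->data3 & 0xff);
    out[7] = (uint8_t)((g->data3 >> 8) & 0xff);
    memcpy(out + 8, g->data4, 8);
}

/* Reverse of guid_to_bytes: rebuild the structure from file bytes. */
void guid_from_bytes(const uint8_t in[16], guid16 *g)
{
    g->data1 = (uint32_t)in[0]
             | ((uint32_t)in[1] << 8)
             | ((uint32_t)in[2] << 16)
             | ((uint32_t)in[3] << 24);
    g->data2 = (uint16_t)(in[4] | (in[5] << 8));
    g->data3 = (uint16_t)(in[6] | (in[7] << 8));
    memcpy(g->data4, in + 8, 8);
}
```

With these helpers, the PCM GUID is stored as the byte sequence 01 00 00 00 21 07 d3 11 86 44 c8 c1 ca 00 00 00.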

last updated: 23rd May 1999
