While it will always be possible to store B-format ambisonic audio in
a generic multi-channel file (to be played through any suitable soundcard),
the development of WAVE-FORMAT-EXTENSIBLE ('WAVE-EX') has created
an opportunity to define a proper distinct file format for Ambisonically-encoded
audio. The hope is that this will encourage developers to support B-format audio
in Audio Workstations and related applications, which in turn should
lead to a wider exploitation of Ambisonics by composers and audio developers.
The format definition below should be read in conjunction with the Microsoft document detailing WAVE-FORMAT-EXTENSIBLE.
The WAVE-EX format allows new 'Subtype' Globally Unique IDentifiers (GUIDs) to be defined (by anyone) for custom soundfile formats. It is appropriate to use this for B-format since it is reasonable to send such data directly to a soundcard, e.g. to an external B-Format decoder, or to a software 'plugin' decoder. The B-format signals, though not strictly speaking speaker feeds, are nevertheless normal audio signals, and can reasonably be processed in the real-time streaming environment for which WAVE-EX was created.
There are two B-Format GUIDs, for integer and floating-point word formats:
Integer format:
SUBTYPE_AMBISONIC_B_FORMAT_PCM
{00000001-0721-11d3-8644-C8C1CA000000}
Floating-point format:
SUBTYPE_AMBISONIC_B_FORMAT_IEEE_FLOAT
{00000003-0721-11d3-8644-C8C1CA000000}
The four B-format signals are interleaved for each sample frame in the order W,X,Y,Z.
If the extended six-channel B-Format is used, the U and V signals will occupy the fifth and sixth slots: W,X,Y,Z,U,V.
If horizontal-only B-format is to be represented, a three- or five-channel file will suffice, with signals interleaved as W,X,Y (first-order), or W,X,Y,U,V (second-order). However, four- and six-channel files are also acceptable, with the Z channel empty. Higher-order configurations are possible in theory, but are not addressed here. A decoder program should either 'degrade gracefully', or reject formats it cannot handle.
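The channel layouts described above can be summarised as a lookup from channel count to signal order. The following is only a minimal sketch: the function name bformat_channel_order is hypothetical, and it assumes that three- and five-channel files are always horizontal-only, as described above. Any other channel count is rejected by returning NULL, which is one way to 'reject formats it cannot handle'.

```c
#include <stddef.h>

/* Return the interleave order for a B-format file with the given
 * channel count, or NULL for a configuration not described here.
 * Hypothetical helper, not part of any existing library. */
const char *bformat_channel_order(unsigned channels)
{
    switch (channels) {
    case 3:  return "WXY";     /* horizontal-only, first order        */
    case 4:  return "WXYZ";    /* first order (Z may be empty)        */
    case 5:  return "WXYUV";   /* horizontal-only, second order       */
    case 6:  return "WXYZUV";  /* extended six-channel (Z may be empty) */
    default: return NULL;      /* reject what we cannot handle        */
    }
}
```

A reader could call this once after parsing the channel count and refuse to open the file if the result is NULL.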
For all B-format configurations, the dwChannelMask field should be set to zero.
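As an illustration of how the format fields fit together, here is a hedged sketch that fills in a header for integer B-format. The type and function names (SubtypeGuid, BFormatHeader, bformat_fill_pcm_header) are hypothetical; the fields mirror those of the Microsoft WAVEFORMATEXTENSIBLE structure, named in the comments, but use fixed-width types so that the sketch does not depend on the Windows headers. It assumes the sample width is a whole number of bytes and that all bits are valid.

```c
#include <stdint.h>

/* Fixed-width stand-in for a GUID (illustrative name). */
typedef struct {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} SubtypeGuid;

/* Illustrative mirror of the WAVEFORMATEXTENSIBLE fields. */
typedef struct {
    uint16_t format_tag;         /* wFormatTag          */
    uint16_t channels;           /* nChannels           */
    uint32_t sample_rate;        /* nSamplesPerSec      */
    uint32_t avg_bytes_per_sec;  /* nAvgBytesPerSec     */
    uint16_t block_align;        /* nBlockAlign         */
    uint16_t bits_per_sample;    /* wBitsPerSample      */
    uint16_t cb_size;            /* cbSize              */
    uint16_t valid_bits;         /* wValidBitsPerSample */
    uint32_t channel_mask;       /* dwChannelMask       */
    SubtypeGuid sub_format;      /* SubFormat           */
} BFormatHeader;

/* Fill a header for integer PCM B-format.
 * Assumption: bits is a multiple of 8. */
void bformat_fill_pcm_header(BFormatHeader *h, uint16_t channels,
                             uint32_t sample_rate, uint16_t bits)
{
    /* SUBTYPE_AMBISONIC_B_FORMAT_PCM */
    static const SubtypeGuid pcm = {
        0x00000001u, 0x0721, 0x11d3,
        {0x86, 0x44, 0xc8, 0xc1, 0xca, 0x00, 0x00, 0x00}
    };

    h->format_tag        = 0xFFFE;  /* WAVE_FORMAT_EXTENSIBLE */
    h->channels          = channels;
    h->sample_rate       = sample_rate;
    h->block_align       = (uint16_t)(channels * (bits / 8));
    h->avg_bytes_per_sec = sample_rate * h->block_align;
    h->bits_per_sample   = bits;
    h->cb_size           = 22;      /* bytes following the base format */
    h->valid_bits        = bits;
    h->channel_mask      = 0;       /* no speaker positions for B-format */
    h->sub_format        = pcm;
}
```

The in-memory structure is not written to disk directly here; each field would still need to be serialised in little-endian order.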
Although the PEAK chunk is strictly speaking optional, it is recommended for all B-Format files. Apart from its general utility, it has a special virtue for B-format: applications can determine from the peak value for the Z channel whether the file is indeed full periphonic B-format (with height information), or 'Horizontal-only' (Z channel present but empty).
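The Z-channel test described above might look like the following sketch. It assumes the per-channel peak values have already been read from the PEAK chunk into an array of absolute values in file channel order; parsing the chunk itself is not shown, and the function name bformat_has_height is hypothetical.

```c
#include <stddef.h>

/* Report whether a B-format file carries height information,
 * given per-channel peak values (absolute, in file channel order).
 * Returns 1 for full periphonic, 0 for horizontal-only. */
int bformat_has_height(const float *peaks, unsigned channels)
{
    if (peaks == NULL)
        return 0;
    if (channels != 4 && channels != 6)
        return 0;               /* 3- and 5-channel files have no Z */
    return peaks[3] > 0.0f;     /* Z is the fourth interleaved signal */
}
```

A strict comparison against zero treats any non-silent Z channel as height information; an application might prefer a small threshold instead.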
The GUID is written to a WAVE-EX header as a structure:
typedef struct _GUID
{
    unsigned long  Data1;
    unsigned short Data2;
    unsigned short Data3;
    unsigned char  Data4[8];
} GUID;
Thus, the SUBTYPE_AMBISONIC_B_FORMAT_PCM GUID will be written as:
{0x00000001, 0x0721, 0x11d3, {0x86, 0x44, 0xc8, 0xc1, 0xca, 0x00, 0x00, 0x00}}
Since the Subtype GUID in a WAVE-EX file follows the usual requirements of a WAVE file, the three numeric elements of this structure are written to disk in little-endian format (least significant bytes at the lower addresses). The remaining eight bytes are written in sequence as for any string.
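To make the byte order concrete, here is a minimal sketch of serialising a GUID into the 16 bytes stored in the file. The names PortableGuid and guid_to_bytes are hypothetical. Fixed-width types are used because unsigned long is not 32 bits wide on every platform, and the bytes are assembled by shifting so that the result does not depend on the byte order of the host machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-width stand-in for the GUID structure (illustrative name). */
typedef struct {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} PortableGuid;

/* Serialise a GUID into the 16 bytes written to a WAVE-EX header:
 * the three numeric fields little-endian, then data4 in sequence. */
void guid_to_bytes(const PortableGuid *g, uint8_t out[16])
{
    size_t i;

    out[0] = (uint8_t)(g->data1 & 0xff);          /* least significant first */
    out[1] = (uint8_t)((g->data1 >> 8) & 0xff);
    out[2] = (uint8_t)((g->data1 >> 16) & 0xff);
    out[3] = (uint8_t)((g->data1 >> 24) & 0xff);

    out[4] = (uint8_t)(g->data2 & 0xff);
    out[5] = (uint8_t)((g->data2 >> 8) & 0xff);

    out[6] = (uint8_t)(g->data3 & 0xff);
    out[7] = (uint8_t)((g->data3 >> 8) & 0xff);

    for (i = 0; i < 8; i++)                       /* copied unchanged */
        out[8 + i] = g->data4[i];
}
```

For the integer-format GUID {00000001-0721-11d3-8644-C8C1CA000000}, this yields the byte sequence 01 00 00 00 21 07 D3 11 86 44 C8 C1 CA 00 00 00.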
last updated: 23rd May 1999