PVAN

Analyse a sound using the SNDAN Phase Vocoder method.

Output is a file in the .an format.

Associated programs:

monan
addsyn, addsyn1, addsyn2, addsyn3

The file created by pvan, with the extension .an, contains a series of analysis 'frames', each defining the amplitude and frequency of every harmonic, relative to a fundamental frequency (the 'analysis frequency') you give the program. Each frame is a snapshot of the state of the sound at a very small instant in time. Thus, the whole file describes the changing character of the sound over time. In this format, the time, amplitude and frequency information can be easily modified independently for each harmonic.

The assumption made by pvan is that the frequencies of the harmonics do not change very much over time, i.e. that the pitch and timbre of the sound are fairly steady (as is the case with most string, wind and brass tones, for example). Thus the frequency information for each harmonic is expressed not as an absolute frequency in Hertz, but in terms of the detected deviations from the theoretical 'centre frequency' of that harmonic. These deviations can naturally be significant if the performer has used vibrato, or if the playing has been a little unsteady.
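
The relationship between the stored deviations and absolute frequencies can be sketched in a few lines of Python. This is only an illustration, assuming that the centre frequency of harmonic k is k times the analysis frequency and that deviations are held in Hz; the function name and the sample values are hypothetical and are not part of SNDAN.

```python
def absolute_frequencies(fa, deviations):
    """Convert per-harmonic deviations (Hz) to absolute frequencies (Hz).

    fa         -- analysis frequency in Hz
    deviations -- deviations[k - 1] is the deviation of harmonic k
    Assumption: the centre frequency of harmonic k is k * fa.
    """
    return [k * fa + dev for k, dev in enumerate(deviations, start=1)]


# Hypothetical frame: a tone analysed at 220 Hz that is playing slightly sharp.
frame_deviations = [1.5, 3.0, 4.5]
print(absolute_frequencies(220.0, frame_deviations))
```

A negative deviation simply places the harmonic below its centre frequency, which is why a steady tone yields values close to zero.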

Of course, it is sometimes not easy to find out exactly what the fundamental pitch of a sound is. It does help, for example, to have at least a tuning fork handy (if you have good 'relative pitch' and can work out intervals), or an electronic chromatic tuner, or a music keyboard. If you know the musical pitch of a note, you can look up its frequency in the Pitch Table. Fortunately, however, it is not essential that you know the pitch of the tone exactly. After all, this may be one of the things you need SNDAN to find out for you!

When it analyses a tone, pvan calculates the overall or average base frequency in each frame. Other programs, such as monan, can use this information to display the measured true fundamental pitch of the tone. If this is not reasonably close to the pitch you gave when running pvan, you can re-analyse the sound with the revised pitch, to get an even more accurate analysis. In practice, since pitch and frequency deviations can go down (become negative) as well as up, you need to use an analysis frequency that is low enough to include these deviations.

All processing is performed in memory. While WIN32 systems all support virtual memory (where hard disk storage is used to mimic memory), so that large files that would not fit in physical memory can be processed, this will inevitably slow down processing, so the processing of long sounds should generally be avoided.

Running pvan.

pvan requires several parameters:

pvan [-x] [-aN] frequency headfile infile outfile [nhars]

where:
-x = exchange byte-order for .fp and .sh files.

You may need to use this if importing raw soundfiles created on a different computer architecture, such as the Apple Macintosh. See File Formats for the soundfile formats supported by SNDAN.
-aN = set output file format
N = 0: output is in 'full' format. Should not normally be used.
This format is not supported directly in SNDAN; it is converted internally by monan and other programs to 'simple' format. It has been activated in this version to support projects to enable conversion with other phase-vocoder formats, such as those used in Csound and the CDP system. 'Full' format retains all the data generated by FFT analysis, and is thus very close to other 'raw' phase vocoder formats. See Technical Notes.
N = 1: output is in 'simple' format (default)
This is the main format used by SNDAN programs. Each analysis frame contains amplitude and frequency information for nhars harmonics, or the maximum available, according to the analysis frequency and the sample rate of the infile. All values are stored as floating-point.
N = 2: output is in 'compact' format
This can be used where disk space is at a premium. Values are stored as 16-bit integers, thus taking up half the space of the equivalent 'simple' format file. It is a reasonable format to use where there is no danger of over-range amplitude values. Frequency values are scaled to make full use of the numeric range available, so that precision is preserved. It is less suitable for use with floating-point source sounds, where presumably the highest audio quality is required.

frequency

The estimated fundamental pitch (Hz) of the source sound. This will become the 'analysis frequency' fa contained in the analysis file header, and used by monan as the basis for all frequency calculations.

headfile

The .head file created for the sound using mkheader.

infile

The source soundfile. Must be mono, in one of the file formats supported by SNDAN.

outfile

The output analysis file. The name must include the .an extension. If the file already exists, you will be asked if you want to over-write it.

nhars

Optional. The number of harmonics to retain. This can be especially useful if frequency is low, and you only want to work with the lower harmonics. This becomes in effect a powerful low-pass filter, as higher harmonics are eliminated during analysis. Note that a similar facility is available with the synthesis programs (see, for example, addsyn).
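
The parameter list above can be illustrated with a small Python helper that assembles a pvan command line and estimates how many harmonics are available. It is a sketch only: the file names are hypothetical, both function names are invented for this example, and the harmonic limit assumes that the 'maximum available' is bounded by half the sample rate, which may not be the exact rule pvan applies.

```python
def max_harmonics(fa, sample_rate):
    """Largest harmonic number k whose centre frequency k * fa does not
    exceed half the sample rate.

    Assumption: pvan's own limit may be computed differently.
    """
    return int((sample_rate / 2.0) // fa)


def pvan_command(frequency, headfile, infile, outfile,
                 nhars=None, swap_bytes=False, out_format=None):
    """Build an argument list following the documented synopsis:
    pvan [-x] [-aN] frequency headfile infile outfile [nhars]
    """
    if not outfile.endswith(".an"):
        raise ValueError("outfile must include the .an extension")
    args = ["pvan"]
    if swap_bytes:
        args.append("-x")                 # exchange byte order
    if out_format is not None:
        args.append("-a%d" % out_format)  # output file format
    args += [str(frequency), headfile, infile, outfile]
    if nhars is not None:
        args.append(str(nhars))           # keep only the lowest nhars harmonics
    return args


# Hypothetical file names; keep the lowest 20 harmonics of a 220 Hz tone.
print(max_harmonics(220.0, 44100))
print(" ".join(pvan_command(220.0, "flute.head", "flute.sh", "flute.an", nhars=20)))
```

The helper only builds the command; it does not run pvan, which may prompt before over-writing an existing outfile.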


Technical Notes
pvan, in common with mqan and all phase vocoder algorithms, uses the Fast Fourier Transform (FFT) to convert each block of audio samples to an amplitude/frequency representation. For efficiency, the size of the analysis block (the 'analysis window') must be a 'power of two', such as 128, 256 or 512 samples. Since physical signals are unlikely to fit these sizes exactly, pvan resamples the source, based on the value of frequency, so that hopefully one or two complete periods of the waveform will fit exactly. This ensures the greatest possible accuracy in the analysis. The full FFT analysis frame is retained when the data is written in 'full' format. However, the data cannot be directly exported to a different phase vocoder format. Though the numeric format may be virtually identical, a 'raw' phase vocoder based synthesizer, without knowledge of the sample rate conversion, would create a sound at the wrong pitch.
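
The effect of the resampling step can be shown with some simple arithmetic in Python. This is a sketch under stated assumptions, not pvan's actual rule: it assumes two periods per window and picks the smallest power of two that holds them at the original sample rate. Both function names are hypothetical.

```python
def analysis_window(fa, sample_rate, periods=2):
    """Return (window, new_rate): a power-of-two window size and the
    resampled rate at which `periods` periods of fa fill it exactly.

    Assumption: the window is the smallest power of two holding
    `periods` periods at the original rate.
    """
    samples_needed = periods * sample_rate / fa
    window = 1
    while window < samples_needed:
        window *= 2
    new_rate = window * fa / periods
    return window, new_rate


def naive_playback_pitch(fa, sample_rate, new_rate):
    """Pitch produced if the frames are resynthesised as though they had
    been analysed at the original sample rate."""
    return fa * sample_rate / new_rate


window, new_rate = analysis_window(440.0, 44100)
print(window, new_rate)
print(naive_playback_pitch(440.0, 44100, new_rate))
```

With these assumptions a 440 Hz tone at 44100 Hz needs a 256-sample window and a resampled rate of 56320 Hz, and a synthesizer unaware of the conversion would play the data at roughly 344.5 Hz instead of 440 Hz.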
Before resampling, both pvan and mqan apply a slight high-frequency emphasis to the sound, using a simple high-pass filter. This helps to discriminate higher-frequency partials. The analysis programs all compensate for this emphasis, so you do not normally need to worry about it, except when attempting to convert the analysis files into other formats.
Note that an 'inverse FFT' is not used in SNDAN for resynthesis. All resynthesis in SNDAN uses the oscillator-bank method.
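
The general idea of oscillator-bank resynthesis can be sketched in Python as follows. This is not the SNDAN implementation: the frame layout (a list of amplitude and deviation pairs per frame), the use of deviations in Hz, and the fixed number of output samples per frame (hop) are all assumptions made for illustration, and values are held constant within a frame rather than interpolated.

```python
import math


def oscillator_bank(frames, fa, sample_rate, hop):
    """Additive resynthesis from analysis frames.

    frames[i][k - 1] is (amplitude, deviation_hz) for harmonic k in frame i.
    Each frame is held for `hop` output samples. One phase accumulator is
    kept per harmonic, so a change of frequency between frames does not
    introduce a phase jump.
    """
    phases = [0.0] * len(frames[0])
    out = []
    for frame in frames:
        for _ in range(hop):
            sample = 0.0
            for k, (amp, dev) in enumerate(frame, start=1):
                freq = k * fa + dev  # assumed centre frequency plus deviation
                sample += amp * math.sin(phases[k - 1])
                phases[k - 1] += 2.0 * math.pi * freq / sample_rate
            out.append(sample)
    return out


# Hypothetical data: two harmonics, steady for 10 frames of 64 samples each.
frames = [[(0.5, 0.0), (0.25, 1.0)]] * 10
signal = oscillator_bank(frames, 220.0, 44100, 64)
print(len(signal))
```

Because every harmonic has its own oscillator, amplitude and frequency can be altered per harmonic before synthesis, which is the kind of independent modification the .an format is designed to allow.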