MONAN

Display and modify SNDAN analysis files (.an format)


launching   exiting  using    common parameters   notation conventions
display/analysis    transformation   file handling
commands in detail
 

monan is a comprehensive program for display, modification, and synthesis of spectrum analysis data stored in harmonic analysis files (.an files). Over 40 commands are available for displaying analysis data, applying modification and transformations, and substituting new data, together with commands for generating and playing soundfiles. Thus, in addition to its primary purpose as  an analytical tool, monan offers a number of  facilities for composers and sound designers. Transformations include pitch shifting and time-stretching, with a special command devoted to time compression preserving the attack and decay. Because the source code is available, C programmers can without much difficulty devise their own transformations which can either form a separate program, or be incorporated into monan. Many of the current commands have entered the program in just this way.

Input files must be in the .an format, either generated directly by pvan, or converted from .mq files (usually by means of mqtoan). For graphic output, monan writes .eps files, and automatically launches the installed associated program to display them. This will usually be GSVIEW - see Installing SNDAN for more information.
 

Launching monan


monan can be started simply by typing its name in at the Windows MSDOS console prompt. It will then ask the user for an analysis file to load. If the file is in the current directory (the directory you are in when you run monan) just type the filename. Otherwise, you will need to type the full Windows path to it (see Using the Windows Console). Alternatively, the filename can be added directly as a command line parameter after the program name. At this stage, if the file cannot be opened, for any reason, monan exits.

If the file is successfully opened, monan displays information about it (based on the header previously used to create the file), and asks how many harmonics to load. Just press <Enter> to load all harmonics. monan then enters the interactive command cycle. Some commands have their own sub-menus (e.g. the pp command), while others may ask directly for further parameters. On completion of each command monan returns to its command prompt. Note that some of the information displayed when monan loads a file is useful when entering parameters for certain commands, so ensure that you can scroll back to it, or otherwise make a note of it.

Invalid commands, or just pressing <Enter> will produce the friendly message:

 huh? Type ‘lc’ for menu

and return you to the main monan prompt:

 next?
 
 

Exiting MONAN

From the main command level, the command  q or ex quits the program. At any time (and especially if a display command hangs), CTRL-C can also be entered to quit.  This should be a last resort, as your current analysis data will be lost.
 

Using monan


All transformation operations overwrite all or some of the analysis data held in memory. For this Windows version a single-level backup facility ("undo") has been added. This is applied for each command that  modifies current analysis data. The un command restores the previous data. Beyond that, the only way to retain the current data is to write it to a  file, using the wf command.  Note that the rf command (read new analysis file) overwrites the current data without warning. View options of course have no effect on the data.  The sy command can also be used to synthesize a soundfile from current analysis data (optionally applying  a number of transformations), and sp used to play it. Decisions can then be made whether or not to save the analysis data itself. Also, remember that under Windows, multiple DOS consoles can be used (and thus multiple instances of monan, or any other console program).
 

The Windows version of pvan enables analysis files to be created in ‘full’ format, which retains all the frequency components created in the analysis. monan currently does not fully support this format; instead it is converted internally into 'simple' format when loaded.
 

Common parameters


Most monan commands require extra parameters. Some parameters are common to almost all of these:

Often the raw amplitudes marked on the vertical axis can be confusing, as one large number can seem very much like another. The dB plot can be more easily related to the maximum possible amplitude, which for the 16bit samples assumed by monan will be around 96dB.

Note that the dB scale used in monan is positive from zero, thus opposite to common practice in audio equipment, where maximum level is taken as 0dB, and all sounds are measured in terms of their strength relative to that. Thus a signal measured at 60dB in monan is some 36dB below peak. Halving the amplitude of a sound corresponds to a drop of approximately 6dB, so a sound 36dB below peak is reduced in amplitude (treating the maximum as 1.0) by a factor of about 0.016.
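The arithmetic above can be checked with a short sketch. It assumes the usual amplitude convention dB = 20 * log10(amplitude), which this page does not state explicitly; the function names are illustrative only and are not part of monan.

```python
import math

def db_drop_to_factor(drop_db):
    """Amplitude factor for a level drop of drop_db decibels (assumes 20*log10)."""
    return 10.0 ** (-drop_db / 20.0)

def factor_to_db_drop(factor):
    """Level drop in dB for a given amplitude factor (assumes 20*log10)."""
    return -20.0 * math.log10(factor)

peak_db = 96.0      # approximate peak level quoted above for 16bit data
signal_db = 60.0    # example level as read from a monan dB plot

drop = peak_db - signal_db                     # distance below peak in dB
print(drop, round(db_drop_to_factor(drop), 3)) # amplitude relative to a peak of 1.0
print(round(factor_to_db_drop(0.5), 2))        # drop caused by halving the amplitude
```

The same two functions convert in either direction, which is convenient when a command asks for a threshold as a raw amplitude but the level has been read from a dB plot.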

Notation conventions


 A few symbols are used regularly:

fa : denotes the analysis frequency, equal to the centre frequency of the first harmonic (f1). This is different from the wt.ave (‘weighted average’), which takes into account frequency deviations, and which, at least for predominantly harmonic tones, is likely to indicate the true fundamental pitch.
fk :  = fa * k;  denotes the centre frequency of harmonic k ( 1 <= k <= number of harmonics)
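As a small worked example of this notation, the sketch below lists the centre frequencies fk for a given analysis frequency. The function name and the example value of 220 Hz are illustrative assumptions.

```python
def harmonic_centres(fa, n_harmonics):
    """Centre frequencies fk = fa * k for k = 1 .. n_harmonics."""
    return [fa * k for k in range(1, n_harmonics + 1)]

# Example: an analysis frequency of 220 Hz and five harmonics.
print(harmonic_centres(220.0, 5))
```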

List of commands, as displayed by the ‘lc’ command

 Display/analysis commands

ak      :    list or plot harmonic amp vs. harmonic (snapshot or average)
at       :    plot harmonic (or rms) amplitude vs. time
aa      :    plot harmonic amplitude(s) vs. another harmonic amplitude
af       :    plot all harmonic amplitudes vs. frequency on a single graph
pp      :    plot harmonic amplitude vs. harmonic no. and time (3 D plot)
br      :    plot normalized spectral centroid (BR) vs. RMS amplitude
bt      :    plot normalized spectral centroid (BR) vs. time
cr       :   plot spectral centroid (BR-1)*fa vs. RMS amplitude
ct       :   plot spectral centroid (BR-1)*fa vs. time
si       :   plot spectral irregularity vs. time
ss      :   plot amplitude-sorted spectrum at given snapshot time
fk      :   list average frequency deviations vs. harmonic k
ft       :   plot harmonic (or wt.ave fundamental) frequency vs. time
ftc     :   plot all harmonic frequencies vs. time on single graph
pt      :   plot musical pitch vs. time
lv      :    list parameter value (amp, freq) for a given time
fi      :    plot time-variant inharmonicity of a harmonic vs. time
 

Transformation commands

na    :    normalize harmonic amplitudes to achieve constant RMS amplitude
rm   :    replace each harmonic amplitude by the rms envelope scaled
              by the average amplitude of the harmonic
sa    :   scale harmonic amplitudes to achieve a given spectrum at a certain time
fa    :   replace frequencies with average values (fixed inharmonics)
fh    :   replace frequencies with fixed harmonic values
fc    :   replace frequencies with harmonics of the wt.ave fund. frequency
ff     :   replace frequencies with harmonics based on a particular harmonic
fv    :   replace frequencies with harmonics of a sinusoidal vibrato
rd    :   reduce duration of data by interpolating the ‘steady-state’
ri     :   reduce spectral irregularity by averaging three adjacent harmonics
rip   :   as ri, but preserving peaks
sm   :  smooth (apply low pass filter to) harmonic amplitudes or frequencies
 

 File commands

as   :    read n-segment harmonic amplitude or frequency file
ih    :    read n-segment interharmonic-amplitude-relationships file
sy   :    synthesize sound file
sp   :    play sound file
rf    :    read new analysis file
wf   :   write analysis file
lh    :   list current file name header data
mh  :   modify header data

Control commands
un   :   undo  (restore previous data)
lc    :   list command menu
re    :  toggle research mode

q or ex   :  exit the program
 

Commands in detail


ak :  lists or plots the  instantaneous or average spectrum, in various formats
 

The displayed spectrum is derived directly from the relative level of each harmonic - the levels which would need to be used by a bank of sine oscillators (see addsyn) to recreate the sound. The number of harmonics depends on the base frequency specified when the sound was analysed.
The command first asks for a time. This can be a single time in the file ("snapshot"), or, if a negative time is entered, the average spectrum for the whole file is calculated.
The command then asks whether you want  to ‘print’ (write numeric data to the screen) or ‘plot’ (create and display a graphic eps file).
For the plot option:
The Linear modes draw the horizontal axis with equal frequency steps, giving either the harmonic number (h) or the frequency (l).  In the former case, discrete vertical lines are drawn, in the manner of a bar graph, indicating the level of each harmonic. In the latter, the level values are connected horizontally, in the manner of a spectrum plot.  Where not many harmonics are involved, this will inevitably appear rather coarse, but formant regions can be seen clearly.
The log frequency mode (g) scales the horizontal axis to display pitch, i.e. equal distance per octave; this can facilitate the identification of octave-related harmonics in the sound (as they will appear equally spaced), and also has the effect of biassing the frequency resolution towards the low frequencies.
In the latter cases, start by setting a high upper limit (e.g. the Nyquist rate, half the sample rate of the source sound); if there is little or no signal in the higher frequencies, you can rerun the command selecting a lower value.
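The effect of the log frequency mode can be illustrated with a minimal sketch of an axis that gives equal distance per octave. The function and its 0..1 position convention are assumptions made for illustration; monan's own plotting code may differ in detail.

```python
import math

def log_axis_position(freq, f_low, f_high):
    """Horizontal position in 0..1 on an axis with equal distance per octave."""
    return math.log2(freq / f_low) / math.log2(f_high / f_low)

fa = 100.0          # hypothetical analysis frequency
nyquist = 22050.0   # upper limit: half of a 44100 Hz sample rate

# Harmonics 1, 2, 4 and 8 are an octave apart, so they land at equal intervals.
for k in (1, 2, 4, 8):
    print(k, round(log_axis_position(fa * k, fa, nyquist), 3))
```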


        For the print option:

 Note that if a large number of harmonics is requested,  the resulting text will scroll up the screen, so that the early values in which you may be most interested may be lost (see Using the Windows Console).
For this option, monan also prints two global values before the list of harmonics - the rms amplitude, which will typically be higher than that of the strongest harmonic, and the Brightness measure. Depending on whether linear or Decibel modes are selected, the rms value will be written either as raw amplitude, or as a dB value.
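The relationship between the rms value and the individual harmonics can be sketched as follows. The definition used here (square root of the sum of the squared harmonic amplitudes) is an assumption, chosen because it agrees with the remark that the rms amplitude is typically higher than that of the strongest harmonic; the dB conversion likewise assumes 20 * log10(amplitude). The amplitudes are invented.

```python
import math

def rms_amplitude(amps):
    """Assumed definition: square root of the sum of squared harmonic amplitudes."""
    return math.sqrt(sum(a * a for a in amps))

def to_db(amplitude):
    """Level in dB on a positive scale, assuming 20 * log10(amplitude)."""
    return 20.0 * math.log10(amplitude)

snapshot = [3000.0, 4000.0, 1200.0]   # hypothetical harmonic amplitudes
rms = rms_amplitude(snapshot)

print(rms, max(snapshot))             # rms compared with the strongest harmonic
print(round(to_db(rms), 1))           # the same rms value expressed in dB
```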
at :  plot harmonic (or rms) amplitude vs. time
This is one of the primary commands for viewing the shape of individual harmonics in a sound. You also have the option of viewing the ‘rms total’ envelope, which will display the overall envelope of the sound.
Parameters:


aa :  plot harmonic amplitude(s) vs. another harmonic amplitude
 

One of the more unorthodox commands in monan, the idea here is to compare the relative level of two or more harmonics by  plotting one against the other on an X/Y graph. The  trick however is to attempt also to represent time, so that the line plots the change in relationship through the sound. If the amplitude difference between the two is zero throughout (as in a pulse wave), a single dot will appear on the screen (as if pointing towards the viewer). The more usual situation is that there may be a constant non-zero difference (as in a pure wave such as a triangle wave), in which case, a straight line (at some angle) will be drawn. Deviations from the constant difference in level appear as more or less extreme digressions from this line; because of the time element the lines can also double back on themselves, or enter what appear to be chaotic regions. The more erratic the plot, the less ‘correlated’ the two harmonics are. Highly dynamic sounds will generate fairly complex, and possibly confusing, pictures. Clearly, interpreting such a graph can be as much an art as a science!
The command has two modes of operation. In the first, one harmonic is compared against another. In the second, several harmonics can be compared against one harmonic. In both cases, the comparison can alternatively be with the overall rms level. The graph can be drawn in colour, so that individual harmonics can be distinguished.

This command is also useful in conjunction with ih, when modifying individual harmonics relative to another.


af : plot all harmonic amplitudes vs. frequency on a single graph
 

This command superimposes all the amplitudes for a harmonic over time, on a single graph, unlike a conventional spectrum display, which shows the spectrum for a single moment ("snapshot"). Like the command above, 3D data is being displayed in two dimensions. The line plotted for each harmonic can thus contain many irregularities, reflecting the change in amplitude and frequency over time.
For the steady state part of a nominally pure tone, the lines may be reasonably upright (indeed, for harmonics that are truly constant, i.e. with no attack or decay stages, and with no frequency deviations, the lines may reduce to an invisible point!), but since amplitudes (and frequencies) typically change considerably in the attack and decay sections of a sound, these will appear as sometimes significant excursions from the plain vertical line.
The graphic is thus a picture of the total variation in each harmonic over the period being viewed. This is useful for identifying formant regions in a sound - resonant regions which are present throughout the sound, relatively independently of the fundamental pitch. Thus, whenever a harmonic enters that region, it will tend to be stronger. The singing voice has several distinct formants, which will show clearly (as a densely drawn region) in this display.  It is also useful for determining a threshold amplitude value for other commands (see also the ss command).
If the frequency deviations seem unreasonably large, for many harmonics (while the source is a strongly pitched sound), it may be worthwhile reanalysing the sound with a different fundamental frequency.
For this command the vertical range is always in dB.
pp : plot harmonic amplitude vs. harmonic no. and time (3 D plot)
 
For most users, this will probably be the single most used command in monan. It creates a classic 3D display of  the spectrum over time (sometimes called a ‘waterfall plot’), and can display individual partials in colour or grey-scale. In the latter case hidden line removal can be toggled on or off.
On launching pp, the user is presented with a menu of display options. The menu includes its own quit command, for returning to the main monan command prompt. Note that each option causes a new eps file to be created; it is easy to collect a large number of these during exploration of this command! Some commands display default values, but not all - users should note that there is no single ‘reset’ command.
pp command menu:
 1.0    Plot the graph.
 pp always starts with sensible defaults, so you can always use this first.
A display bug occasionally fires, causing GSVIEW to report an  ‘unrecoverable error - no points in view’. This is usually caused by a slight error relating to the duration of the sound. Entering an end time very slightly less than that reported (using option 4 below) is usually sufficient to avoid this bug.
 2      Toggle Harmonic vs. Frequency scale.
   Changes the legend and tick style of the frequency axis
 3     Choose Harmonic Range.
pp starts by displaying the first 20 harmonics, if available. Use this command to set the first and last harmonics to display.
 4    Choose Time Range.
 Sets the start and end times for the display.
 5    Toggle Linear/Decibel amplitude scale
Changes the vertical scale. Linear (raw amplitude values) emphasises peaks; Decibel displays the relative loudness of each harmonic (scaled in dB), and reveals low-level detail.
 6    Toggle amplitude normalization by rms

          Switch between a raw amplitude scale and a scale normalized to 1.0.

 7    Choose skip factor

 Smooth the display to remove detail, by skipping analysis frames (does not change the data). With long analysis files, this can save useful processing time.
 8    Toggle perspective on/off
The perspective can appear a little fierce - sometimes the image looks better with perspective turned off!
 9    Toggle line hiding on/off
With line hiding turned off, the full profile of each harmonic is visible.
This can make the image very busy, however. With line hiding on, each harmonic profile appears as a solid block of colour or monochrome shade.
10   Choose rotation angles (rotation, pitch)
Enables the 3D plot to be viewed from all angles.
Try Rotation = 240 and Pitch = 30 to see the plot from behind.
NB Pitch can be set to 90 - but this makes the harmonics edge on, so they can’t be seen!
11   Toggle colour vs. greyscale
It is well worth trying various combinations of colour, grey-scale, and line hiding. The monochrome option is naturally best for b/w printing. Note that with Line hiding turned off, and colour selected, the background is black.
12 -- Special graphing options
This option uses its own command menu, for changing the position of the picture on the screen, and changing the position of axis labels. Many of these commands (especially the axis rescaling commands)  can cause bad eps files to be written (sometimes of huge size - lots of disk activity!), and are best avoided in this Windows version.  If you find this is happening, and the disk activity does not stop, use CTRL-C to exit the program, and delete the useless eps file.
1. Manual Size Factor for Graph
The image can safely be reduced (factors below 1.0), but factors above 1.0 are best avoided.
 2. Reposition Graph On Screen
 Small moves are safe (e.g. to co-ordinates 10 10).
 3. Move Numerical Frequency Labels
 4. Move HARMONIC or FREQUENCY Axis Label
 5. Move Numerical Time Labels
 6. Move TIME Label
Small moves are safe, but are best avoided; the default settings used by monan work very well in this Windows version.
 7. Change Amplitude Scaling
The Amplitude scaling is automatic by default, and is best left that way!
 8. Amplitude Axis Scaling Options
These options have not been fully tested under Windows, and are best avoided.
 9. Print Special Features Menu
 The menu displays the current settings.
10. Exit Special Features
 Returns to the pp main menu.
13 -- List settings
Lists all graphics settings as controlled by the main pp menu and the Special graphing options menu.
14 -- Display this menu
 What it says!
q -- Exit PP command
Return to monan main menu. It is easy to forget to do this, and enter a monan command instead!
br :  plot normalized spectral centroid (BR) vs. RMS amplitude
bt :  plot normalized spectral centroid (BR) vs. time
cr :  plot spectral centroid (BR-1.0)*fa vs. RMS amplitude
ct :  plot spectral centroid (BR-1.0)*fa vs. time
 
These four commands, which form a clear group, are concerned with the display of the timbral centre of gravity, so to speak, of the sound, which in turn gives an indication of the perceived brightness of the sound. In some technical literature this frequency point, referred to as the ‘spectral centroid’, is indicated graphically by a pivot-like marker.
While mostly of interest to researchers studying timbre and sound perception, the spectral centroid value is also useful to composers, as it identifies the dominant frequency region in the sound. If you do want to reduce the brightness of the sound with a filter such as a parametric equaliser, the spectral centroid gives a good guide to the centre frequency to set for the filter.
For many sounds, there is not a one-to-one relationship between intensity and brightness. For example, a piano tone may be bright at the start, but also relatively bright during the decay. The RMS plot commands (br and cr) can often reveal this strikingly, as the plotted line can fold over upon itself - as with many such displays in monan, three dimensions of information are being represented on a 2D graph.

One especially common plot shows a dense mass of lines in one area, with two lines branching off. These correspond to the attack and decay segments (i.e. the start and end of the sound), while the concentrated block shows the steady-state segment.

 Conversely, the two time-related commands (bt and ct) present a more immediately understandable plot of the change in brightness over time - these are the best choices for non-specialists. The difference between the c~ and b~ versions is mostly in the change of vertical axis (raw frequency, or the ‘brightness scale’). The c~ versions plot BR-1.0 (see technical notes).
It is important to bear in mind that the brightness level is measured relative to the fundamental frequency (fa), rather than to the human hearing range. Sounds with a high fa may have a low brightness factor, but still sound quite bright!
It can be useful to compare these plots with those produced by the command ak.
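A minimal sketch of the two measures follows. It assumes that BR is the amplitude-weighted mean harmonic number, which this page does not define explicitly (see the technical notes for the authoritative definition); the (BR - 1.0) * fa form is taken from the command descriptions above. Function names and data are illustrative.

```python
def normalized_centroid(amps):
    """BR, assumed to be the amplitude-weighted mean harmonic number.
    amps[0] is harmonic 1, amps[1] is harmonic 2, and so on."""
    total = sum(amps)
    if total == 0.0:
        return 0.0
    return sum(k * a for k, a in enumerate(amps, start=1)) / total

def centroid_frequency(amps, fa):
    """Quantity plotted by cr and ct as described above: (BR - 1.0) * fa."""
    return (normalized_centroid(amps) - 1.0) * fa

dull = [1.0, 0.2, 0.05]      # energy concentrated in the fundamental
bright = [0.3, 0.6, 1.0]     # energy concentrated in the upper harmonics

print(normalized_centroid(dull), normalized_centroid(bright))
# The same spectral shape gives the same BR whatever the value of fa,
# but the frequency form scales with fa.
print(centroid_frequency(bright, 110.0), centroid_frequency(bright, 880.0))
```

This illustrates the point made above: BR is relative to fa, so a high-pitched sound can have a low BR and still sound bright.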
Parameters :


si :  plot spectral irregularity vs. time
 

‘Spectral irregularity’ is a measure of the smoothness of the spectrum. It takes three adjacent harmonics, stepping along the harmonics, and compares their average level with that of the middle one. The flatter and smoother the line, the smoother the spectrum at that point. Irregularity is often at a peak during the attack and decay stages of a sound.

It is important to observe the vertical scale, which is calculated automatically according to the limits of the data. A plot may appear irregular, but the real difference may be quite small!
To reduce spectral irregularity,  use the transformation command ri.
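The idea can be sketched for a single analysis frame as below. This is one possible reading of the description above, not monan's actual formula: the normalization by the summed amplitude is an assumption, and the smoothing function only illustrates the three-harmonic averaging mentioned for ri.

```python
def three_point_average(amps, k):
    """Average of the harmonic at index k and its two neighbours."""
    return (amps[k - 1] + amps[k] + amps[k + 1]) / 3.0

def spectral_irregularity(amps):
    """Summed deviation of each interior harmonic from its three-point
    average, divided by the summed amplitude (assumed normalization)."""
    interior = range(1, len(amps) - 1)
    deviation = sum(abs(amps[k] - three_point_average(amps, k)) for k in interior)
    total = sum(amps)
    return deviation / total if total else 0.0

def smooth_spectrum(amps):
    """Replace each interior harmonic by its three-point average."""
    out = list(amps)
    for k in range(1, len(amps) - 1):
        out[k] = three_point_average(amps, k)
    return out

jagged = [10.0, 2.0, 9.0, 1.0, 8.0]   # hypothetical, strongly alternating spectrum
print(spectral_irregularity(jagged))
print(spectral_irregularity(smooth_spectrum(jagged)))
```

A spectrum whose harmonics fall on a straight line gives a value of zero with this measure, since every harmonic then equals the average of itself and its neighbours.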

There are no parameters.


ss :  plot amplitude-sorted spectrum at given snapshot time
 

This command sorts the harmonics into ascending order with respect to loudness, at a nominated time (‘snapshot’). This makes it easy to identify any major change in level (e.g. between the signal components and analysis artefacts or noise), which can then be used as a threshold parameter with other commands. For a view of  all amplitudes over time, use the command af.
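The sorting step, and one way of reading a threshold from the result, can be sketched as follows. monan itself only plots the sorted spectrum; choosing the threshold at the largest step is an illustrative assumption, as are the function names, the dB convention of 20 * log10(amplitude), and the data.

```python
import math

def sorted_spectrum_db(amps):
    """Return (harmonic number, level in dB) pairs in ascending order of level."""
    levels = [(k, 20.0 * math.log10(a))
              for k, a in enumerate(amps, start=1) if a > 0.0]
    return sorted(levels, key=lambda pair: pair[1])

def largest_step_threshold(sorted_levels):
    """Midpoint (in dB) of the largest jump between neighbouring sorted levels."""
    best_gap, threshold = 0.0, None
    for (_, lower), (_, upper) in zip(sorted_levels, sorted_levels[1:]):
        if upper - lower > best_gap:
            best_gap, threshold = upper - lower, (lower + upper) / 2.0
    return threshold

# Hypothetical snapshot: three strong harmonics and three near the noise floor.
snapshot = [1000.0, 800.0, 3.0, 500.0, 2.0, 4.0]
ordered = sorted_spectrum_db(snapshot)
print([k for k, _ in ordered])                    # harmonic numbers, quietest first
print(round(largest_step_threshold(ordered), 1))  # candidate threshold in dB
```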
Parameters:


fk :  list average frequency deviations vs. harmonic k

This command does not plot a graph, but writes information to the console, or optionally (in full detail) to a text file.

For each harmonic, monan measures the frequency deviation from its expected frequency. This gives a simple statistical measure of the overall harmonicity of the sound. Where the source is a cleanly pitched sound, it also indicates the accuracy of the analysis.

The data is presented in several formats:
AFD        : Average frequency deviation (Hz)
NAFD     : Normalized Average Frequency deviation (percentage)
RMSD    : RMS frequency deviation (Hz)
NRMSD : Normalized RMS Frequency deviation (percentage)
PEAK     : Peak frequency deviation (Hz)
NPEAK  : Normalized peak frequency deviation (percentage)
These various forms follow similar measurements made by audio engineers (though the plain average is not much used).  As a general rule of thumb, the RMS deviations (denoting the strength or persistence of the deviation) will lie roughly in the middle of the range between average and peak deviations.
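The six measures can be sketched for one harmonic as follows. This is a plain, unweighted version under stated assumptions: the deviation is taken from the centre frequency fk, and the normalized forms are percentages of fk. monan's own calculation may weight or normalize differently. The frequency values are invented.

```python
import math

def deviation_stats(freqs, fk):
    """Average, RMS and peak deviation of one harmonic's frequency track
    from its centre frequency fk, in Hz and as a percentage of fk."""
    devs = [f - fk for f in freqs]
    n = len(devs)
    afd = sum(devs) / n
    rmsd = math.sqrt(sum(d * d for d in devs) / n)
    peak = max(abs(d) for d in devs)
    to_percent = 100.0 / fk
    return {
        "AFD": afd, "NAFD": afd * to_percent,
        "RMSD": rmsd, "NRMSD": rmsd * to_percent,
        "PEAK": peak, "NPEAK": peak * to_percent,
    }

# Hypothetical frequency track (Hz) for harmonic 2 of a sound with fa = 220 Hz.
track = [441.0, 439.0, 442.0, 440.0]
for name, value in deviation_stats(track, 440.0).items():
    print(name, round(value, 4))
```

With these definitions the magnitude of the average can never exceed the RMS value, which in turn can never exceed the peak, in line with the rule of thumb above.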
In the original form of this program, all this information is written to the console. The amount of text (especially for data with a large number of harmonics) can be too great for a standard Windows console (though both Windows 95 and NT consoles can be reconfigured to contain more text - see Installing SNDAN), so in this version the full data is instead written to a text file, and only the AFD data is written to the screen.
The command also identifies the analysed fundamental pitch of the sound (corresponding to the first value of the AFD data) and writes it to the console. If it differs significantly from the analysis frequency fa, it may be worthwhile re-analysing the sound with the new fundamental frequency.
ft :  plot harmonic (or wt.ave fundamental) frequency vs. time
 
One of the MONAN commands likely to be used frequently. This plots the frequency of  a nominated harmonic, or the averaged (‘weighted’) fundamental frequency, over time. The latter therefore offers an estimate of the pitch (possibly changing) of the sound. Vibrato, for example, can be clearly seen. The average is taken because many sounds, even those usually thought to be ‘harmonic’, contain inharmonic components (such as the piano, where many harmonics are sharp with respect to the fundamental); by taking the average of these, often a more accurate calculation of the true fundamental frequency can be made, than by inspecting the first harmonic alone. Vibrato tends to be synchronous across most, if not all, harmonics, so is not lost through this calculation. More random variations on the other hand will tend to get ironed out, so that their presence does not unduly affect the calculation.
In harmonic mode, ft is reasonably successful at plotting large pitch sweeps in a sound (portamento, glissando, etc).
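The weighted average can be sketched for a single frame as below. The exact weighting used by monan is not given on this page; this sketch assumes that each harmonic's frequency is divided by its harmonic number to give an estimate of the fundamental, and that the estimates are weighted by harmonic amplitude. Names and data are illustrative.

```python
def weighted_fundamental(freqs, amps):
    """Amplitude-weighted average of freqs[k-1] / k over all harmonics k
    (assumed weighting; monan's own calculation may differ)."""
    total = sum(amps)
    if total == 0.0:
        return 0.0
    estimates = (a * f / k
                 for k, (f, a) in enumerate(zip(freqs, amps), start=1))
    return sum(estimates) / total

# Hypothetical piano-like frame: upper harmonics slightly sharp.
freqs = [100.0, 200.4, 301.2]
amps = [1.0, 1.0, 1.0]
print(weighted_fundamental(freqs, amps))   # a little above the first harmonic
```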
ftc :  plot all harmonic frequencies vs. time on single graph (sonogram)
 
The sonogram plots the frequencies of all harmonics against time, representing loudness through a colour or grey-scale gradient.  Though this representation of loudness shows  relatively little detail visually (by comparison with the 3D ‘waterfall’ plot, for example), it is nevertheless vivid, and gives a satisfyingly complete view of a sound.  Together with that representation, it has understandably become one of the most popular ways of displaying the time-varying timbral characteristics of a sound.
A sonogram is computationally demanding, so unless your computer is fast (Pentium II standard or better), there can be an appreciable delay while the image is calculated.
Note that though in a sense the vertical axis represents the spectrum of the sound, the frequency profiles of the analysed harmonics are plotted, rather than the raw fft-based data more commonly used for sonogram displays. Harmonic frequency deviations are clipped to +- 0.5 * fa.
For the colour plot, white represents maximum, and purple the minimum, while black is used for data below the set threshold.
For the grey-scale plot, the opposite applies - black represents maximum.
Like ft, ftc is reasonably successful at plotting large pitch sweeps in a sound.
pt :  plot musical pitch vs. time
 
A plot of fundamental frequency against time, similar to many others in monan, but with the vertical scale marked in terms of musical pitch (based of course on A=440Hz - see Pitch Table). The vertical scale is therefore logarithmic, and is marked in semitone note names and octaves, so that pitch deviations, in terms of semitones, are easily seen.
pt tracks the weighted fundamental frequency fa calculated for each frame. Thus it is not suitable for tracking wide variations of pitch, such as portamento, that exceed the range of fa.  The presence of an abrupt glitch or change of direction in the plot is an indicator that a different command (such as ft or ftc) may more accurately represent the characteristics of the file. However, normal vibrato, discrete pitch changes, and the small pitch fluctuations characteristic of singers and instrumentalists, will be plotted very precisely. In these cases, glitches can arise naturally at the onset and termination of a note.
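The conversion from frequency to a position on a semitone scale can be sketched as follows, assuming equal temperament with A=440Hz. The octave numbering (A4 = 440 Hz) and the sharp-only note spelling are assumptions for illustration; the Pitch Table gives the convention actually used by monan.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_name(freq, a4=440.0):
    """Nearest equal-tempered note name and the deviation from it in cents."""
    semitones = 12.0 * math.log2(freq / a4)   # distance from A4 in semitones
    nearest = round(semitones)
    cents = 100.0 * (semitones - nearest)
    number = 69 + nearest                      # MIDI-style numbering, A4 = 69
    name = NOTE_NAMES[number % 12] + str(number // 12 - 1)
    return name, cents

for f in (440.0, 261.63, 452.0):
    name, cents = pitch_name(f)
    print(f, name, round(cents, 1))
```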


lv : list parameter value (amp, freq) for a given time

A basic utility command to report the exact amplitude and frequency of a given harmonic at a given time. Frequency can be expressed as absolute (Hz), normalized deviation (percentage) or in Cents. The command repeats, so that any number of reports can be requested. Enter a negative value for time to quit the command and return to the main prompt. Amplitude is always expressed in terms of 16bit samples: peak = 32767.
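The three ways of expressing frequency can be related with a short sketch. It assumes that the deviation is measured from the centre frequency fk = fa * k of the harmonic in question; the function name and example values are illustrative.

```python
import math

def frequency_report(freq, fa, k):
    """Express the frequency of harmonic k in Hz, as a percentage deviation
    from its centre frequency fk = fa * k, and as a deviation in cents."""
    fk = fa * k
    return {
        "hz": freq,
        "percent": 100.0 * (freq - fk) / fk,
        "cents": 1200.0 * math.log2(freq / fk),
    }

# Hypothetical reading: harmonic 2 measured at 442 Hz with fa = 220 Hz.
report = frequency_report(442.0, 220.0, 2)
print({name: round(value, 3) for name, value in report.items()})
```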
  frequency:


fi :  plot time-varying inharmonicity of a harmonic vs. time

Inharmonicity (the degree to which harmonics deviate from their centre frequency)  is measured relative to the weighted average frequency (presumed to be the true fundamental pitch), so that variations such as vibrato do not affect the calculation.  Any partials which do not track the fundamental will show up clearly. Very small differences may be due more to analysis artefacts than to the nature of the source. Inharmonicity can occur anywhere in a sound, but especially during rapid attacks.
This information can be useful not only in generally characterizing a sound, but in synthesis, where, for example, it can guide the control of the  modulation index for FM synthesis.


na : normalize harmonic amplitudes to achieve constant RMS amplitude.

The effect of this command depends largely on the dynamic character of the analysis data. It first finds the average rms amplitude of the sound, and then scales the amplitude envelope of each harmonic, frame by frame, to create a sound of virtually constant amplitude (corresponding to the overall average level), while mostly preserving the relative balance between harmonics. The effect can range from subtle to extreme. It can be considered an exaggerated form of compression, with quiet sections being boosted sometimes by large factors. One consequence of this is that the quantization noise (e.g. at the very end of a decay tail) that is normally almost inaudible will be raised to the level of the rest of the sound. Conversely, previously well-defined peaks will be reduced to the average level.
By following this command with rm, the data is transformed to a fixed timbre (constant amplitudes for each harmonic, as the average rms level is now constant), with the harmonic amplitudes reflecting those of the original sound. Any amplitude-based vibrato present in the original is thus eliminated. One interpretation of this process might therefore be as deconstruction, or ‘backwards synthesis’ - the new data represents the generalized timbral structure of the sound (as if created by a weighted bank of oscillators) before amplitude enveloping or filtering is applied.
Note that frequency deviations are unaffected by this command.
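The normalization can be sketched as below, with the analysis data held as a list of frames, each a list of harmonic amplitudes. This is a simplified illustration, not monan's code: the rms of a frame is assumed to be the square root of the sum of the squared harmonic amplitudes, and silent frames are simply left silent.

```python
import math

def frame_rms(amps):
    """Assumed rms of one frame: root of the summed squared amplitudes."""
    return math.sqrt(sum(a * a for a in amps))

def normalize_to_constant_rms(frames):
    """Scale every frame so that its rms equals the average rms of the sound."""
    rms_values = [frame_rms(amps) for amps in frames]
    target = sum(rms_values) / len(rms_values)
    out = []
    for amps, rms in zip(frames, rms_values):
        gain = target / rms if rms > 0.0 else 0.0   # silent frames stay silent
        out.append([a * gain for a in amps])
    return out

# Hypothetical data: a loud frame, a louder frame, and a quiet decay tail.
frames = [[3.0, 4.0], [6.0, 8.0], [0.3, 0.4]]
flat = normalize_to_constant_rms(frames)
print([round(frame_rms(amps), 3) for amps in flat])   # constant level
print(round(flat[2][0] / frames[2][0], 2))            # boost applied to the tail
```

The balance between harmonics within each frame is unchanged, since one gain is applied to the whole frame; the large boost on the quiet frame shows why low-level noise can become audible.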

There are no parameters.

rm :  replace each harmonic amplitude by the rms envelope scaled by the average amplitude of the harmonic
The effect of this command is to remove time-varying timbral changes. These time-varying timbral features are often central to the identity of a sound (one thinks especially of instruments such as the sitar). The amplitude envelope of each harmonic is replaced by one, suitably scaled, that tracks the overall rms envelope. Thus, although the dynamically varying aspects of the sound have been removed, the general fixed spectrum remains, as do all individual frequency deviations. The distinctive contribution these make to the sound can thus be distinguished. See also the notes for na, above.
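A sketch of this replacement follows. The scaling chosen here (average harmonic amplitude multiplied by the rms envelope, divided by the mean rms) is an assumption about what "suitably scaled" means, and the rms of a frame is assumed to be the root of the summed squared amplitudes. It expects at least one non-silent frame.

```python
import math

def frame_rms(amps):
    """Assumed rms of one frame: root of the summed squared amplitudes."""
    return math.sqrt(sum(a * a for a in amps))

def replace_with_rms_envelope(frames):
    """Give every harmonic the shape of the overall rms envelope, scaled by
    that harmonic's average amplitude over the whole sound."""
    n_frames = len(frames)
    rms_values = [frame_rms(amps) for amps in frames]
    mean_rms = sum(rms_values) / n_frames
    averages = [sum(amps[k] for amps in frames) / n_frames
                for k in range(len(frames[0]))]
    return [[avg * rms / mean_rms for avg in averages] for rms in rms_values]

# Hypothetical data: the balance between the two harmonics changes over time.
frames = [[3.0, 4.0], [8.0, 6.0]]
fixed = replace_with_rms_envelope(frames)
for amps in fixed:
    print([round(a, 3) for a in amps], round(amps[0] / amps[1], 3))
```

After the replacement the ratio between the harmonics is the same in every frame, which is the fixed spectrum described above.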
There are no parameters.


sa : scale harmonic amplitudes to achieve a given spectrum at a certain time

This command scales all harmonic amplitudes according to a defined spectrum ‘template’. The template could be thought of as controlling a set of fixed volume controls, one for each harmonic. The scale factors are calculated so that the amplitudes specified in the template are matched exactly at the specified time. The result is a ‘hybrid’ sound, which retains the time-varying harmonic evolution of the original, but with harmonic amplitudes scaled according to the template. You might determine this spectrum from examination of some other sound, e.g. using the command lv, or arbitrarily, to define some abstract spectrum to which both sounds will be scaled.  It would also be possible, based on the information obtained using lv (applied to the current sound), to apply changes to only a few harmonics, to impart formant or other colouration.
The command scales the harmonic amplitudes so that the rms level of the original sound at the nominated time is preserved. The maximum amplitude for the whole sound may thus be significantly different. If it has become louder, it may exceed the range of a 16bit sample. While this is not especially important while working within SNDAN, as all data is handled in floating-point, it clearly matters when the sound is synthesized. The sy command allows a scaling factor to be used, to bring the sound within limits. The at command can be used to find the rms level of the sound (best to use the dB option) - if the sound does not exceed 90dB no scaling factor is needed. Note that the sy option to write floating-point samples enables samples to be written to file without clipping (applies particularly to WAV format).
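The scaling can be sketched as follows, with the data held as a list of frames of harmonic amplitudes and the nominated time given as a frame index. This is an illustration under stated assumptions, not monan's code: the rms of a frame is taken as the root of the summed squared amplitudes, and a harmonic that is silent at the chosen frame is left unscaled.

```python
import math

def frame_rms(amps):
    """Assumed rms of one frame: root of the summed squared amplitudes."""
    return math.sqrt(sum(a * a for a in amps))

def scale_to_template(frames, frame_index, template):
    """Apply one fixed gain per harmonic so that the frame at frame_index
    takes the shape of template while keeping its original rms level."""
    reference = frames[frame_index]
    # Level correction so that the rms at the chosen frame is unchanged.
    level = frame_rms(reference) / frame_rms(template)
    # A harmonic that is silent at the chosen frame cannot be matched,
    # so its gain is left at 1.0 in this sketch.
    gains = [level * t / a if a > 0.0 else 1.0
             for t, a in zip(template, reference)]
    return [[a * g for a, g in zip(amps, gains)] for amps in frames]

frames = [[3.0, 4.0], [6.0, 2.0]]                 # hypothetical data
scaled = scale_to_template(frames, 0, [1.0, 1.0]) # flat template at frame 0
print(scaled[0], frame_rms(scaled[0]))            # template shape, original rms
print(max(max(amps) for amps in scaled))          # compare with the 16bit peak of 32767
```

Because the gains are fixed over time, frames other than the nominated one can end up louder than before, which is why the maximum amplitude should be checked before synthesis.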
Added for the Windows version:

A range can be set for the first and last harmonic to modify. This enables adjustments to be made to a few harmonics (e.g. to create or remove a formant region). The prompt for amplitude values reports the current value. Note that other harmonics will still be rescaled, to reach the new target rms value.
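
As a rough illustration of the scaling described above (a sketch under stated assumptions, not monan's implementation), the Python below computes one fixed scale factor per harmonic so that the spectrum at the chosen frame takes the shape of the template, with a common gain applied so that the rms level at that frame is unchanged. The rms level is assumed to be the square root of the sum of squared harmonic amplitudes; the function name and the use of frame indices rather than times are hypothetical.

```python
import math

def scale_to_template(amps, frame, template):
    """amps[k][t]: amplitude of harmonic k+1 at frame t; template: one value per harmonic."""
    current = [h[frame] for h in amps]
    rms_now = math.sqrt(sum(c * c for c in current))
    rms_tpl = math.sqrt(sum(v * v for v in template))
    gain = rms_now / rms_tpl if rms_tpl > 0.0 else 0.0
    # One fixed 'volume control' per harmonic; a harmonic that is silent
    # at the chosen frame cannot be scaled to a target, so it is zeroed.
    factors = [gain * v / c if c > 0.0 else 0.0
               for v, c in zip(template, current)]
    return [[f * a for a in h] for f, h in zip(factors, amps)]
```

Because each factor is constant over time, the evolution of every harmonic is kept; only the relative levels change.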


fa : replace frequencies with average values (fixed inharmonics)

This command eliminates time-varying frequency deviations in the harmonics. It will, for example, remove pitch-based vibrato, leaving only the amplitude components, if any. It is important to note that because each frequency envelope is replaced with its constant average deviation, the new sound may well still contain significant, if fixed, inharmonicity. Although the harmonic amplitudes are unmodified, new amplitude artefacts may arise through changed or new interference patterns (beats).

There are no parameters.

fh : replace frequencies with fixed harmonic values
This command removes all harmonic deviations, and sets each harmonic to its true centre frequency (based on fa). This includes the fundamental, so that the perceived pitch may change significantly. Amplitudes are unaffected, so that much of the general timbral character will remain.
There are no parameters.
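
The difference between the two commands above can be sketched as follows. This is an illustration only: it assumes frequencies are held as absolute values in Hz per frame, that the average is a simple unweighted mean, and that the value fa passed to the second function is the analysis base frequency (so that harmonic k has centre frequency fa * k). The function names are hypothetical.

```python
def average_frequencies(freqs):
    """fa-like: each harmonic keeps its own (possibly inharmonic) mean frequency."""
    return [[sum(h) / len(h)] * len(h) for h in freqs]

def fixed_harmonics(fa, nharm, nframes):
    """fh-like: harmonic k is placed exactly on its centre frequency fa * k."""
    return [[fa * k] * nframes for k in range(1, nharm + 1)]
```

In the first case a harmonic averaging 883 Hz stays at 883 Hz; in the second, with fa = 440, it would be moved to 880 Hz.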
fc : replace frequencies with harmonics of the wt.ave fund. frequency
This creates a fully harmonic tone, all harmonics following the (time-varying) fundamental frequency. Amplitude data is unchanged.
 There are no parameters.


ff : replace frequencies with harmonics based on a particular harmonic.
 

This replaces each harmonic frequency envelope with envelopes that track that of the nominated harmonic. Clearly the result will be dependent on the characteristics of the source; selecting a harmonic with many frequency deviations can impart a granular or warbling quality to the sound. With sounds where some harmonics exhibit independent behaviour (e.g. for some bell sounds), the new sound will be markedly different, while retaining much of the original, as amplitude envelopes are unaffected.
fv : replace frequencies with harmonics of a sinusoidal vibrato
The harmonic frequencies (deviations) are replaced with exact harmonics based on fa (rather than on the wt.ave pitch). Harmonic amplitudes are unaffected. Note that there is no mechanism for controlling the vibrato (random features, speed changes, etc), so this command is more suited to test and experimentation tasks than for full synthesis applications.


rd  : reduce duration of data by interpolating the ‘steady-state’

This command implements time-contraction (i.e. without changing pitch) while leaving the attack and decay sections of the sound unaffected. Although it is intended primarily for use with single pitched sounds, in practice it can be effective with a wide variety of input sounds. Two time points are required, marking the end of the attack segment and the start of the decay segment. These then define the steady-state portion of the sound, to which time-contraction is applied. To determine these times precisely, use the at command. To increase duration (time-stretching), the programs addsyn1 or addsyn2 can be used.
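
A minimal sketch of the principle, for a single envelope, is given below. It is not monan's code: it works in frame indices rather than seconds, uses plain linear interpolation to resample the steady-state section, and the function and parameter names are hypothetical.

```python
def reduce_duration(env, attack_end, decay_start, new_steady_len):
    """Shorten only the frames between attack_end and decay_start."""
    attack = list(env[:attack_end])
    steady = list(env[attack_end:decay_start])
    decay = list(env[decay_start:])
    if new_steady_len >= len(steady):
        return list(env)                 # contraction only, never stretching
    resampled = []
    for i in range(new_steady_len):
        # Position of output frame i within the original steady state.
        if new_steady_len > 1:
            pos = i * (len(steady) - 1) / (new_steady_len - 1)
        else:
            pos = 0.0
        j = int(pos)
        frac = pos - j
        nxt = steady[min(j + 1, len(steady) - 1)]
        resampled.append(steady[j] + frac * (nxt - steady[j]))
    return attack + resampled + decay
```

The attack and decay frames pass through untouched, so the onset and release of the sound keep their original timing.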


ri  : reduce spectral irregularity by averaging three adjacent harmonics

See the documentation for si for a description of spectral irregularity.

The process is a special application of simple filtering, replacing the amplitude of each harmonic, frame by frame, by the average of itself and its two neighbours. As such, it can be applied multiple times for progressive smoothing of the spectrum.

Depending on the character of the source, this command may have the effect of smoothing the amplitude envelopes of harmonics, similar to the effect of sm. This is seen best using the 3D plot command pp; the ‘surface’ is smoother overall, with reduced peaks and shallower troughs. See the variant command rip, below, for a method of avoiding this effect.
It will be natural to make continued ‘before and after’ comparisons by use of the si command (use the un command to return to the original data each time). It is important to take note of the vertical scale, as the irregularity of the plot may not seem to be much diminished, until the difference in vertical scale is taken into account!
There are no parameters.
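
The averaging step, applied to the spectrum of one frame, can be sketched as below. The treatment of the first and last harmonics (averaging over whichever neighbours exist) is an assumption, and the function name is hypothetical.

```python
def smooth_spectrum(frame):
    """frame[k] is the amplitude of harmonic k+1 in a single analysis frame."""
    n = len(frame)
    out = []
    for k in range(n):
        lo, hi = max(0, k - 1), min(n, k + 2)   # harmonic k and its neighbours
        out.append(sum(frame[lo:hi]) / (hi - lo))
    return out
```

Applying the function to its own output gives the progressive smoothing mentioned above.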


rip : reduce spectral irregularity by averaging three adjacent harmonics - preserves peaks

Otherwise identical to ri, above, this variant attempts to preserve local peaks in the spectrum. There is no amplitude interpolation applied when a peak is detected; consequently amplitude changes at such points can be abrupt. You can check this by using pp in Decibel mode. It will be necessary in some cases to follow this command with sm (amplitude mode; use a cut-off between 10 and 20) to remove the glitches.

There are no parameters.
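
A peak-preserving variant of three-point averaging might look like the sketch below. How monan detects a peak is not stated here; this example assumes a peak is a harmonic strictly greater than both of its neighbours, and that the first and last harmonics are never treated as peaks. The function name is hypothetical.

```python
def smooth_spectrum_keep_peaks(frame):
    """Three-point averaging of one frame, leaving local peaks untouched."""
    n = len(frame)
    out = []
    for k in range(n):
        is_peak = 0 < k < n - 1 and frame[k] > frame[k - 1] and frame[k] > frame[k + 1]
        if is_peak:
            out.append(frame[k])                 # no averaging at a peak
        else:
            lo, hi = max(0, k - 1), min(n, k + 2)
            out.append(sum(frame[lo:hi]) / (hi - lo))
    return out
```

Since the peak test is made afresh in every frame, a harmonic can switch between averaged and unaveraged values from one frame to the next, which is one way the abrupt changes mentioned above can arise.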


sm : smooth (apply low pass filter to) harmonic amplitudes or frequencies
 

The primary application of this command is in removing small-scale perturbations in amplitude or frequency profiles, while preserving the general shape. It is therefore reasonable to refer to the process as low-pass filtering, though it is applied in a possibly unfamiliar way; there is no progressive reduction of level applied to high-frequency components. Indeed, this command offers an excellent way of achieving broad-band noise reduction (which cannot be achieved with a time-domain filter); this is especially useful for cleaning up poor-quality samples, removing breath and bow noise, and so on.

With the exception of noise components, neither amplitude nor frequency varies at a high rate in most strongly pitched sounds. Rates for vibrato and tremolo, for example, are typically between 3 and 8 Hertz. Audible beats between slightly dissonant harmonics (as in the case of some struck and plucked string tones) can exist at even lower frequencies. Thus, while experimentation is clearly necessary, depending on the nature of the sound, the cut-off frequency parameter will typically lie between 5 and 20 Hz. Higher values might be useful for removing high-frequency flutter and noise components without reducing the dynamic character too much.

Low values applied to amplitudes will smooth out attack transients, and reduce tremolo; setting a non-zero offset time will preserve the attack while processing the rest of the sound. Similarly, low values applied to frequencies will reduce, and maybe even eliminate, vibrato. However, the immediate use of extremely low settings can produce unexpected behaviour, due to the relatively simple nature of the filter. It is better in this case to apply the filtering over two (or more) iterations, selecting a slightly lower frequency each time. For complete removal of amplitude and frequency deviations, other commands, such as rm and fa, should be used.

In addition to the pp command, the af command is useful for observing the global effect of sm on frequency deviations.
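
To show what low-pass filtering an envelope (rather than the audio signal) means, here is a minimal sketch using a one-pole smoother. The filter monan actually uses is not described here, so this is a stand-in; the frame rate (analysis frames per second), the function name, and the expression of the offset as a frame index are all assumptions.

```python
import math

def smooth_envelope(env, cutoff_hz, frame_rate, start_frame=0):
    """Low-pass one amplitude or frequency envelope; frames before start_frame are kept."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / frame_rate)
    out = list(env[:start_frame])        # e.g. the attack, left untouched
    if start_frame >= len(env):
        return out
    state = env[start_frame]
    for x in env[start_frame:]:
        state += alpha * (x - state)     # move part of the way towards each new value
        out.append(state)
    return out
```

A lower cut-off gives a smaller alpha, so rapid frame-to-frame fluctuations are averaged out while the slow overall shape of the envelope is followed.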
as : read n-segment harmonic amplitude or frequency file
This command reads a text data file of n-segment breakpoints, representing either amplitude envelopes or frequency deviations (from the centre frequency of the harmonic), for one or more harmonics. The breakpoint data for each harmonic must occupy a single line of text. Not all harmonics need to be defined, but they must be listed in ascending order. Existing data not overwritten by the new data is preserved. Thus, to completely replace the existing data, e.g. for additive synthesis, the duration of each breakpoint must correspond to the duration of the current data. Conversely, inserting new data into the middle of a single harmonic is also possible. The command rd can be used to compress the duration of existing data to suit an input data file.
The format for each harmonic is:
<harmonic no.> <number of points> <time1 val1> <time2 val2> … <timeN valN>
Times must be given in seconds, and must be increasing. Number of points, and end times for different harmonics, do not need to be the same. The number of points entered must match the count given in the second field. Values are raw amplitudes (0 to 32767) for amplitude files, or frequency deviations (positive or negative) relative to the centre frequency of the harmonic (= fa * harmonic no.).
Comment lines can be used, starting with the # character.
monan requires that the maximum duration defined by the breakpoints is not greater than that of the current analysis data.
Existing analysis data is transferred to backup memory before the file is read; hence, any previous backup data will be lost. The command will abort if any error is found in the text file, automatically restoring the original data.
Examples.
The following simple example defines three harmonics, for input as amplitude data, for harmonics 1,3 and 5, over 1.4 seconds:

#hno  npts  t1  v1  t2   v2     t3   v3     t4    v4  t5   v5
1     4     0   0   0.5  20000  1.25 20000  1.4    0
3     5     0   0   0.25 10000  1.05  3000  1.2 3000 1.4    0
5     3     0   0   0.75  2500  1.4      0

Matching data for frequency, applying a mild stretching of the harmonic frequencies, could be as follows (the 5th harmonic lies exactly on its centre frequency):
#hno npts  t1  v1   t2   v2   t3   v3   t4    v4     t5   v5
1     5    0  0.05 0.5  0.1  1.2   0    1.3  -0.02  1.4  -0.06
3     5    0 -0.01 0.2 -0.14 0.9 -0.14  1.1  0.05   1.4   0.15
5     2    0  0    1.4  0
monan does not reject extreme deviation values (greater than 1.0) unless the calculated frequency would exceed the Nyquist limit; such values should nevertheless generally be avoided.
Breakpoint formats of this kind are not very readable. The presumption underlying this command (and also ih) is that for additive synthesis, data will be generated programmatically, or will be based on modifications to existing breakpoint files. monan itself does not currently support the creation of data in breakpoint format.
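
Since such files are expected to be generated or checked programmatically, the following Python sketch reads the format described above and evaluates a breakpoint envelope at a given time. It is an illustration, not part of SNDAN: the function names are hypothetical, linear interpolation between breakpoints is assumed, and 'increasing' times are taken to mean strictly increasing.

```python
def parse_breakpoints(text):
    """Return {harmonic number: [(time, value), ...]} from breakpoint text."""
    harmonics = {}
    last_hno = 0
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                                   # blank or comment line
        fields = line.split()
        hno, npts = int(fields[0]), int(fields[1])
        values = [float(v) for v in fields[2:]]
        if len(values) != 2 * npts:
            raise ValueError(f"harmonic {hno}: expected {npts} time/value pairs")
        if hno <= last_hno:
            raise ValueError("harmonics must be listed in ascending order")
        points = list(zip(values[0::2], values[1::2]))
        if any(t2 <= t1 for (t1, _), (t2, _) in zip(points, points[1:])):
            raise ValueError(f"harmonic {hno}: times must be increasing")
        harmonics[hno] = points
        last_hno = hno
    return harmonics

def value_at(points, t):
    """Linear interpolation; None outside the span (existing data would be kept)."""
    if t < points[0][0] or t > points[-1][0]:
        return None
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return points[-1][1]
```

Run on the first (amplitude) example above, harmonic 1 evaluates to 10000 at 0.25 seconds, halfway up its first segment.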
ih : read n-segment interharmonic-amplitude-relationships file
This command reads a text data file of n-segment breakpoint data defining how the amplitude profile of a source or ‘driver’ harmonic can be transferred to a target harmonic. Unlike the format of as, in which the breakpoint pairs associate a value with a time, here two amplitude values are associated, the first relating to the driver harmonic, the second to the target harmonic. As no time component is involved, data files can, in principle, freely be applied to any analysis data, with the one proviso that lines with harmonic numbers beyond the maximum for the current data are ignored.
Each harmonic definition line creates a mapping (or ‘transfer’) function, mapping amplitudes of the driver harmonic in arbitrary ways to the new one. For example, a harmonic might duplicate low level amplitudes in the driver exactly, but reproduce higher amplitudes with an attenuation factor. This is the action of a compressor, with the significant difference that the transfer function (compression function) can be different for each harmonic, or apply only to selected harmonics.
Comment lines can be used, starting with the # character.
The first line must contain one number, giving the driver harmonic. If this is zero, the composite rms amplitude is used. Where a harmonic is given, the original data for that will be preserved (it must not be used in a breakpoint line), and the other specified harmonics will be created from it according to the defined transfer function.
The format of each breakpoint line is as follows:
<harmonic no> <npoints> <dr_amp1 t_amp1> <dr_amp2 t_amp2> …
The data pairs are best understood as co-ordinates of a single line in a simple x/y plot (as implemented by the aa command). The horizontal (x) axis represents the amplitude of the driver harmonic, and the vertical (y) axis that of the target harmonic. To create a useful function, the line should not be horizontal - this would mean that all driver values in that range map to one target value (which is equivalent to a constant amplitude). A vertical line (where two consecutive driver amplitudes are equal) however is an error, and is rejected. Another way of expressing this is to say that each driver amplitude must map to exactly one target amplitude.
Amplitudes assume the usual 16-bit maximum of 32767. Values above this, or below zero, are rejected.
Note that breakpoint lines are applied to the analysis data as they are read. They do not need to be in ascending order of harmonic number (though this is recommended); if a harmonic number is duplicated, the data in that line will replace any previous data.
Existing analysis data is transferred to backup memory before the file is read; hence, any previous backup data will be lost. The command will abort if any error is found in the text file, automatically restoring the original data.
Examples.
The data file below simply copies the driver harmonic (given here as 3) to the nominated one:

#driver
3
#hno npts src1 dest1 src2 dest2
1     2    0     0  32767 32767

This is the equivalent x/y graph:
    |              o
 t  |            *
 a  |          *
 r  |        *
 g  |      *
 e  |    *
 t  |  *
    o_______________
         driver
         (fig.1)
 
The following non-linear example copies and amplifies low-level driver amplitudes, but flattens high amplitudes. Inputs between 0 and 5000 are mapped to a larger range 0 to 15000, while input levels between 5000 and maximum are mapped to the restricted range 15000 to 20000:
#transfer amplified and compressed levels from harmonic 3 to harmonic 1
3
#hno npts src1 dest1 src2 dest2 src3  dest3
1     3    0     0   5000 15000 32767 20000
         |
         |             o
      t  |          *
      a  |     o
      r  |    *
      g  |   *
      e  |  *
      t  | *
         o_________________
             driver
(fig.2)
If dest2 were made equal to dest3 (both 15000, say), all source values above src2 would be mapped to the same output level, creating quasi-clipped regions in the new harmonic.
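
The transfer function defined by a breakpoint line can be sketched as a piecewise-linear lookup, as below. This is an illustration rather than monan's code: the function name is hypothetical, the breakpoints are assumed to be given in order of increasing driver amplitude, and driver values outside the defined range are assumed to take the nearest end value.

```python
def transfer(points, driver_amp):
    """Map one driver amplitude to a target amplitude.

    points is a list of (driver, target) pairs in increasing driver order.
    """
    for (x0, _), (x1, _) in zip(points, points[1:]):
        if x1 <= x0:
            raise ValueError("vertical segment: driver amplitudes must increase")
    if driver_amp <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if driver_amp <= x1:
            return y0 + (y1 - y0) * (driver_amp - x0) / (x1 - x0)
    return points[-1][1]

def apply_transfer(points, driver_env):
    """Build the target harmonic's amplitude envelope from the driver's."""
    return [transfer(points, a) for a in driver_env]
```

With the points of the second example, (0, 0), (5000, 15000), (32767, 20000), a driver amplitude of 2500 gives a target amplitude of 7500, three times the input, while the whole upper range of the driver is squeezed into the band 15000 to 20000.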
sy : synthesize sound file
Synthesize the current analysis data to the named file. There are several options for format and transformation; all have defaults derived from the header.


sp : play sound file

Play a soundfile using the Windows default player. See Installing SNDAN for more details. While external players and editors can be used at any time, this command enables sounds to be auditioned without having to quit monan.


rf : read new analysis file
 

Load a new .an format file, replacing the current data, which is transferred to the undo buffer. As the undo buffer is only one level deep, you will probably find you want to reload the same file repeatedly, as you explore transformations.


wf : write analysis file

Write current analysis data to a file in the ‘compact’ .an format.
This is the only place within the SNDAN suite of programs where an analysis file can be created, other than during the analysis process using pvan or mqan. It is especially valuable as a means of saving intermediate work, given that the undo facility (see un) is only one level deep. There is the opportunity to limit the number of harmonics written to the file.


un : single-level undo: restore previous analysis data

Prior to any operation which alters the analysis data (all transformation commands and file input commands), the data and header information is saved to an undo buffer. This command restores that data to the working buffer, effectively undoing the last command. It facilitates rapid trial and error for a given command, and reduces, while not eliminating, the need to reread the analysis file. There is no ‘redo’ facility.
There are no parameters.


lh : list current file name header data

Prints selected header information to the screen.
There are no parameters.


mh : modify header data
 

Lists all header data derived from the original .head file created with mkheader, together with the sample rate. As each element is displayed, you have the opportunity to change it, e.g. to reflect new transformations to the analysis data. The ‘comments’ field is especially relevant here. Note that much of the text information in the header is used to annotate the graphics display; this command enables you to adjust (or even eliminate) these annotations. To bypass any element without changing it, type <Enter>.


lc  : list command menu
 

Lists all commands supported by this version of monan, as shown here.


re : toggle research mode

This activates extra parameters for customizing certain aspects of the graphics display - mainly the horizontal and vertical axis limits, tick marks, and so on. This command is chiefly used by researchers, who need to customize the graphics for inclusion in papers and documentation. This is best used with care; the defaults employed by monan work very well in most cases. Note that it is not supported by all monan commands.
 
 

Technical Notes, Spectral Centroid.

There are a number of ways the spectral centroid can be calculated.

The most basic is the ‘normalized’ form:

BR  =  SUM k*A[k]/SUM A[k]

where k = harmonic number (frequency = fundamental * k)
A[k] = amplitude of harmonic k.
The minimum value of BR here will be 1.

A common alternative form is:
fc  =  SUM f[k]*A[k]/SUM A[k]  =  SUM k*fa*A[k]/SUM A[k] = fa*BR
which says that the centroid frequency is BR * the fundamental frequency. Thus, the value for BR indicates approximately the location of the brightest harmonic (which is not necessarily the loudest - brightest is relative to frequency).


The (BR-1.0) variant was added by James Beauchamp for synthesis work, where it was more convenient to have BR with a minimum value of zero.
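
The three forms can be compared with a short Python sketch of the formulas above, for the spectrum of a single frame. The function names are hypothetical, and returning zero for an all-zero spectrum is an assumption made to avoid division by zero.

```python
def normalized_centroid(amps):
    """BR = SUM k*A[k] / SUM A[k], where amps[0] is harmonic 1."""
    total = sum(amps)
    if total == 0:
        return 0.0
    return sum(k * a for k, a in enumerate(amps, start=1)) / total

def centroid_frequency(amps, fa):
    """fc = fa * BR, for a fundamental frequency fa in Hz."""
    return fa * normalized_centroid(amps)

def centroid_from_zero(amps):
    """The (BR - 1.0) variant, which is zero for a pure fundamental."""
    return normalized_centroid(amps) - 1.0
```

For three harmonics of equal amplitude, BR is (1 + 2 + 3) / 3 = 2, so with a fundamental of 440 Hz the centroid frequency is 880 Hz.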
