monan is a comprehensive program for display, modification, and synthesis of spectrum analysis data stored in harmonic analysis files (.an files). Over 40 commands are available for displaying analysis data, applying modifications and transformations, and substituting new data, together with commands for generating and playing soundfiles. Thus, in addition to its primary purpose as an analytical tool, monan offers a number of facilities for composers and sound designers. Transformations include pitch shifting and time-stretching, with a special command devoted to time compression preserving the attack and decay. Because the source code is available, C programmers can without much difficulty devise their own transformations, which can either form a separate program or be incorporated into monan. Many of the current commands have entered the program in just this way.
Input files must be in the .an format, either generated directly by pvan, or converted from .mq files (usually by means of mqtoan). For graphic output, monan writes .eps files, and automatically launches the installed associated program to display them. This will usually be GSVIEW - see Installing SNDAN for more information.
monan can be started simply by typing its name in at the
Windows MSDOS console prompt. It will then ask the user for an analysis
file to load. If the file is in the current directory (the directory you
are in when you run monan) just type the filename. Otherwise, you
will need to type the full Windows path to it (see Using
the Windows Console). Alternatively, the filename can be added
directly as a command line parameter after the program name. At this stage,
if the file cannot be opened, for any reason, monan exits.
If the file is successfully opened, monan displays information about it (based on the header previously used to create the file), and asks how many harmonics to load. Just press <Enter> to load all harmonics. monan then enters the interactive command cycle. Some commands have their own sub-menus (e.g. the pp command), while others may ask directly for further parameters. On completion of each command monan returns to its command prompt. Note that some of the information displayed when monan loads a file is useful when entering parameters for certain commands, so make sure you can scroll back to it, or make a note of it.
Invalid commands, or just pressing <Enter> will produce the friendly message:
huh? Type ‘lc’ for menu
and return you to the main monan prompt:
next?
All transformation operations overwrite all or some of the analysis
data held in memory. For this Windows version a single-level backup facility
("undo") has been added. This is applied for each command that modifies
current analysis data. The un command restores
the previous data. Beyond that, the only way to retain the current data
is to write it to a file, using the wf
command. Note that the rf command
(read new analysis file) overwrites the current data without warning. View
options of course have no effect on the data. The sy
command can also be used to synthesize a soundfile from current analysis
data (optionally applying a number of transformations), and sp
used to play it. Decisions can then be made whether or not to save the
analysis data itself. Also, remember that under Windows, multiple DOS consoles
can be used (and thus multiple instances of monan, or any other
console program).
The Windows version of pvan enables analysis
files to be created in ‘full’ format, which retains all the frequency components
created in the analysis. monan currently does not fully support
this format; instead it is converted internally into 'simple' format when
loaded.
Most monan commands require extra parameters. Some parameters
are common to almost all of these:
For a 16bit soundfile, peak level will be around 96dB. Often the raw amplitudes marked on the vertical axis can be confusing, as one large number can seem very much like another. The dB plot can be more easily related to the maximum possible amplitude, which for the 16bit samples assumed by monan will be around 96dB.
Note that the dB scale used in monan is positive from zero, thus opposite to common practice in audio equipment, where maximum level is taken as 0dB, and all sounds are measured in terms of their strength relative to that. Thus a signal measured at 60dB in monan is some 36dB below peak. Halving the amplitude of a sound corresponds to a drop of approximately 6dB, so this signal is reduced in amplitude (treating the maximum as 1.0) by a factor of about 0.016.
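The arithmetic in this note can be checked with a short Python sketch. The 96dB peak figure is the one quoted in the text; the function names are illustrative only and are not part of monan.

```python
import math

PEAK_DB = 96.0  # approximate 16bit peak level quoted in the text


def db_below_peak(monan_db, peak_db=PEAK_DB):
    """Convert a positive monan dB reading to a level relative to peak (<= 0)."""
    return monan_db - peak_db


def amplitude_ratio(db_change):
    """Linear amplitude factor corresponding to a change of db_change dB."""
    return 10.0 ** (db_change / 20.0)


relative = db_below_peak(60.0)                         # 36dB below peak
print(relative, round(amplitude_ratio(relative), 3))   # -36.0 0.016
print(round(20.0 * math.log10(0.5), 2))                # halving: -6.02
```

The same two functions can be used in reverse to judge how far a reading on a monan dB plot lies below the maximum level.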
A few symbols are used regularly:
fa : denotes the analysis frequency, equal to the centre frequency of the first harmonic (f1). This is different from the wt.ave (‘weighted average’), which takes into account frequency deviations, and which, at least for predominantly harmonic tones, is likely to indicate the true fundamental pitch.
fk : = fa * k; denotes the centre frequency of harmonic k ( 1 <= k <= number of harmonics)
Control commands
un : undo (restore previous data)
lc : list command menu
re : toggle research mode
q or ex : exit the program
ak : lists or plots the instantaneous or average spectrum, in various formats
The displayed spectrum is derived directly from the relative level of each harmonic - the levels which would need to be used by a bank of sine oscillators (see addsyn) to recreate the sound. The number of harmonics depends on the base frequency specified when the sound was analysed.
The command first asks for a time. This can be a single time in the file ("snapshot"), or, if a negative time is entered, the average spectrum for the whole file is calculated.
The command then asks whether you want to ‘print’ (write numeric data to the screen) or ‘plot’ (create and display a graphic eps file).
For the plot option:
The Linear modes draw the horizontal axis with equal frequency steps, giving either the harmonic number (h) or the frequency (l). In the former case, discrete vertical lines are drawn, in the manner of a bar graph, indicating the level of each harmonic. In the latter, the level values are connected horizontally, in the manner of a spectrum plot. Where not many harmonics are involved, this will inevitably appear rather coarse, but formant regions can be seen clearly.
The log frequency mode (g) scales the horizontal axis to display pitch, i.e. equal distance per octave; this can facilitate the identification of octave-related harmonics in the sound (as they will appear equally spaced), and also has the effect of biassing the frequency resolution towards the low frequencies.
In the latter cases, start by setting a high upper limit (e.g. the Nyquist rate, half the sample rate of the source sound); if there is little or no signal in the higher frequencies, you can rerun the command selecting a lower value.
For the print option:
monan also prints two global values before the list of harmonics - the rms amplitude, which will typically be higher than that of the strongest harmonic, and the Brightness measure. Depending on whether linear or Decibel modes are selected, the rms value will be written either as raw amplitude, or as a dB value.
Note that if a large number of harmonics is requested, the resulting text will scroll up the screen, so that the early values in which you may be most interested may be lost (see Using the Windows Console).
at : plot harmonic (or rms) amplitude vs. time
This is one of the primary commands for viewing the shape of individual harmonics in a sound. You also have the option of viewing the ‘rms total’ envelope, which will display the overall envelope of the sound.
Parameters:
aa : plot harmonic amplitude(s) vs. another harmonic amplitude
One of the more unorthodox commands in monan, the idea here is to compare the relative level of two or more harmonics by plotting one against the other on an X/Y graph. The trick however is to attempt also to represent time, so that the line plots the change in relationship through the sound. If the amplitude difference between the two is zero throughout (as in a pulse wave), a single dot will appear on the screen (as if pointing towards the viewer). The more usual situation is that there may be a constant non-zero difference (as in a pure wave such as a triangle wave), in which case, a straight line (at some angle) will be drawn. Deviations from the constant difference in level appear as more or less extreme digressions from this line; because of the time element the lines can also double back on themselves, or enter what appear to be chaotic regions. The more erratic the plot, the less ‘correlated’ the two harmonics are. Highly dynamic sounds will generate fairly complex, and possibly confusing, pictures. Clearly, interpreting such a graph can be as much an art as a science!
The command has two modes of operation. In the first, one harmonic is compared against another. In the second, several harmonics can be compared against one harmonic. In both cases, the comparison can alternatively be with the overall rms level. The graph can be drawn in colour, so that individual harmonics can be distinguished.
This command is also useful in conjunction with ih, when modifying individual harmonics relative to another.
af : plot all harmonic amplitudes vs. frequency on a single graph
This command superimposes all the amplitudes for a harmonic over time, on a single graph, unlike a conventional spectrum display, which shows the spectrum for a single moment ("snapshot"). Like the command above, 3D data is being displayed in two dimensions. The line plotted for each harmonic can thus contain many irregularities, reflecting the change in amplitude and frequency over time.
For the steady state part of a nominally pure tone, the lines may be reasonably upright (indeed, for harmonics that are truly constant, i.e. with no attack or decay stages, and with no frequency deviations, the lines may reduce to an invisible point!), but since amplitudes (and frequencies) typically change considerably in the attack and decay sections of a sound, these will appear as sometimes significant excursions from the plain vertical line.
The graphic is thus a picture of the total variation in each harmonic over the period being viewed. This is useful for identifying formant regions in a sound - resonant regions which are present throughout the sound, relatively independently of the fundamental pitch. Thus, whenever a harmonic enters that region, it will tend to be stronger. The singing voice has several distinct formants, which will show clearly (as a densely drawn region) in this display. It is also useful for determining a threshold amplitude value for other commands (see also the ss command).
If the frequency deviations seem unreasonably large, for many harmonics (while the source is a strongly pitched sound), it may be worthwhile reanalysing the sound with a different fundamental frequency.
For this command the vertical range is always in dB.
For most users, this will probably be the single most used command in monan. It creates a classic 3D display of the spectrum over time (sometimes called a ‘waterfall plot’), and can display individual partials in colour or grey-scale. In the latter case hidden line removal can be toggled on or off.
On launching pp, the user is presented with a menu of display options. The menu includes its own quit command, for returning to the main monan command prompt. Note that each option causes a new eps file to be created; it is easy to collect a large number of these during exploration of this command! Some commands display default values, but not all - users should note that there is no single ‘reset’ command.
pp command menu:
1 Plot the graph.
pp always starts with sensible defaults, so you can always use this first.
A display bug occasionally fires, causing GSVIEW to report an ‘unrecoverable error - no points in view’. This is usually caused by a slight error relating to the duration of the sound. Entering an end time very slightly less than that reported (using option 4 below) is usually sufficient to avoid this bug.
2 Toggle Harmonic vs. Frequency scale.
Changes the legend and tick style of the frequency axis.
3 Choose Harmonic Range.
pp starts by displaying the first 20 harmonics, if available. Use this command to set the first and last harmonics to display.
4 Choose Time Range.
Sets the start and end times for the display.
5 Toggle Linear/Decibel amplitude scale
Changes the vertical scale. Linear (raw amplitude values) emphasises peaks, Decibel displays the relative loudness of each harmonic (scaled in dB), and reveals low-level detail.
6 Toggle amplitude normalization by rms
Switch between a raw amplitude scale and a scale normalized to 1.0.
7 Choose skip factor
Smooth the display to remove detail, by skipping analysis frames (does not change the data). With long analysis files, this can save useful processing time.
8 Toggle perspective on/off
The perspective can appear a little fierce - sometimes the image looks better with perspective turned off!
9 Toggle line hiding on/off
With line hiding turned off, the full profile of each harmonic is visible. This can make the image very busy, however. With line hiding on, each harmonic profile appears as a solid block of colour or monochrome shade.
10 Choose rotation angles (rotation, pitch)
Enables the 3D plot to be viewed from all angles.
Try Rotation = 240 and Pitch = 30 to see the plot from behind.
NB Pitch can be set to 90 - but this makes the harmonics edge on, so they can’t be seen!
11 Toggle colour vs. greyscale
It is well worth trying various combinations of colour, grey-scale, and line hiding. The monochrome option is naturally best for b/w printing. Note that with Line hiding turned off, and colour selected, the background is black.
12 -- Special graphing options
This option uses its own command menu, for changing the position of the picture on the screen, and changing the position of axis labels. Many of these commands (especially the axis rescaling commands) can cause bad eps files to be written (sometimes of huge size - lots of disk activity!), and are best avoided in this Windows version. If you find this is happening, and the disk activity does not stop, use CTRL-C to exit the program, and delete the useless eps file.
1. Manual Size Factor for Graph
The image can safely be reduced (factors below 1.0), but factors above 1.0 are best avoided.
2. Reposition Graph On Screen
Small moves are safe (e.g. to co-ordinates 10 10).
3. Move Numerical Frequency Labels
4. Move HARMONIC or FREQUENCY Axis Label
5. Move Numerical Time Labels
6. Move TIME Label
Small moves are safe, but are best avoided; the default settings used by monan work very well in this Windows version.
7. Change Amplitude Scaling
The Amplitude scaling is automatic by default, and is best left that way!
8. Amplitude Axis Scaling Options
These options have not been fully tested under Windows, and are best avoided.
9. Print Special Features Menu
The menu displays the current settings.
10. Exit Special Features
Returns to the pp main menu.
13 -- List settings
Lists all graphics settings as controlled by the main pp menu and the Special graphing options menu.
14 -- Display this menu
What it says!
q -- Exit PP command
Return to monan main menu. It is easy to forget to do this, and enter a monan command instead!
br : plot normalized spectral centroid (BR) vs. RMS amplitude
These four commands, which form a clear group, are concerned with the display of the timbral centre of gravity, so to speak, of the sound, which in turn gives an indication of the perceived brightness of the sound. In some technical literature this frequency point, referred to as the ‘spectral centroid’, is indicated graphically by a pivot-like marker.
While mostly of interest to researchers studying timbre and sound perception, the spectral centroid value is also useful to composers, as it identifies the dominant frequency region in the sound. If you do want to reduce the brightness of the sound with a filter such as a parametric equaliser, the spectral centroid gives a good guide to the centre frequency to set for the filter.
For many sounds, there is not a one-to-one relationship between intensity and brightness. For example, a piano tone may be bright at the start, but also relatively bright during the decay. The RMS plot commands (br and cr) can often reveal this strikingly, as the plotted line can fold over upon itself - as with many such displays in monan, three dimensions of information are being represented on a 2D graph.
One especially common plot shows a dense mass of lines in one area, with two lines branching off. These correspond to the attack and decay segments (i.e. the start and end of the sound), while the concentrated block shows the steady-state segment.
Conversely, the two time-related commands (bt and ct) present a more immediately understandable plot of the change in brightness over time - these are the best choices for non-specialists. The difference between the c~ and b~ versions is mostly in the change of vertical axis (raw frequency, or the ‘brightness scale’). The c~ versions plot BR-1.0 (see technical notes).
It is important to bear in mind that the brightness level is measured relative to the fundamental frequency (fa), rather than to the human hearing range. Sounds with a high fa may have a low brightness factor, but still sound quite bright!
It can be useful to compare these plots with those produced by the command ak.
Parameters:
- time range
- (br,ct only) number of harmonics
This is a useful feature of these commands, as for many sounds most of the significant energy is in the lower harmonics, while the upper ones may mostly add noise to the display.
- (ct only) Amplitude threshold (default = 100)
Eliminate low level harmonics from the calculation. Note that 16bit samples are assumed internally, so that the default of 100 represents a normalized amplitude of 100/32767 ~= .003, or -50dB.
- skip factor.
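As an illustration of how the harmonic count and amplitude threshold affect the result, the following Python sketch computes a normalized centroid for a single analysis frame. It assumes the common definition BR = sum(k * Ak) / sum(Ak), which is not stated in this section (see the technical notes for the form monan actually uses); the function names and the sample frame are hypothetical.

```python
import math


def normalized_centroid(amps, n_harmonics=None, threshold=0.0):
    """Assumed definition: BR = sum(k * A_k) / sum(A_k), harmonics numbered from 1.

    n_harmonics limits the calculation to the lowest harmonics;
    harmonics with amplitude below threshold are left out.
    """
    if n_harmonics is not None:
        amps = amps[:n_harmonics]
    num = den = 0.0
    for k, a in enumerate(amps, start=1):
        if a < threshold:
            continue
        num += k * a
        den += a
    return num / den if den > 0.0 else 0.0


def threshold_db(threshold, peak=32767.0):
    """Express a raw 16bit amplitude threshold in dB relative to peak."""
    return 20.0 * math.log10(threshold / peak)


frame = [1000.0, 500.0, 250.0, 50.0]      # made-up amplitudes for harmonics 1..4
fa = 220.0                                # made-up analysis frequency
br = normalized_centroid(frame, threshold=100.0)
print(br, br * fa)                        # normalized centroid, and the same in Hz
print(round(threshold_db(100.0), 1))      # -50.3
```

Multiplying BR by fa gives the centroid as a frequency, which is the value of interest when choosing a filter centre frequency.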
si : plot spectral irregularity vs. Time
‘Spectral irregularity’ is a measure of the smoothness of the spectrum. It takes three adjacent harmonics, stepping along the harmonics, and compares their average level with that of the middle one. The flatter and smoother the line, the smoother the spectrum at that point. Irregularity is often at a peak during the attack and decay stages of a sound.
It is important to observe the vertical scale, which is calculated automatically according to the limits of the data. A plot may appear irregular, but the real difference may be quite small!
To reduce spectral irregularity, use the transformation command ri.
There are no parameters.
ss : plot amplitude-sorted spectrum at given snapshot time
This command sorts the harmonics into ascending order with respect to loudness, at a nominated time (‘snapshot’). This makes it easy to identify any major change in level (e.g. between the signal components and analysis artefacts or noise), which can then be used as a threshold parameter with other commands. For a view of all amplitudes over time, use the command af.
Parameters:
- time range
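The idea of reading a threshold from the sorted spectrum can be sketched in Python as follows. This is not monan's code: the rule used here (place the threshold at the geometric midpoint of the largest jump between neighbouring sorted amplitudes) is an assumption made for illustration, and the amplitudes are made up.

```python
import math


def sorted_spectrum(amps):
    """Return (harmonic number, amplitude) pairs in ascending amplitude order."""
    return sorted(enumerate(amps, start=1), key=lambda pair: pair[1])


def largest_step_threshold(amps):
    """Suggest a threshold inside the largest ratio jump between sorted neighbours.

    Returns None if fewer than two positive amplitudes are present
    or if all positive amplitudes are equal.
    """
    ordered = [a for _, a in sorted_spectrum(amps) if a > 0.0]
    best_ratio, threshold = 1.0, None
    for low, high in zip(ordered, ordered[1:]):
        if high / low > best_ratio:
            best_ratio = high / low
            threshold = math.sqrt(low * high)   # geometric midpoint of the jump
    return threshold


snapshot = [1200.0, 3.0, 800.0, 5.0, 400.0, 2.0]   # harmonics 1..6 at one time
print(sorted_spectrum(snapshot))
print(largest_step_threshold(snapshot))
```

Here the even harmonics sit far below the odd ones, so the suggested threshold falls between the two groups.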
fk : list average frequency deviations vs. harmonic k
This command does not plot a graph, but writes information to the console, or optionally (in full detail) to a text file.
For each harmonic, monan measures the frequency deviation from its expected frequency. This gives a simple statistical measure of the overall harmonicity of the sound. Where the source is a cleanly pitched sound, it also indicates the accuracy of the analysis.
The data is presented in several formats:
AFD : Average frequency deviation (Hz)
NAFD : Normalized Average Frequency deviation (percentage)
RMSD : RMS frequency deviation (Hz)
NRMSD : Normalized RMS Frequency deviation (percentage)
PEAK : Peak frequency deviation (Hz)
NPEAK : Normalized peak frequency deviation (percentage)
These various forms follow similar measurements made by audio engineers (though the plain average is not much used). As a general rule of thumb, the RMS deviations (denoting the strength or persistence of the deviation) will lie roughly in the middle of the range between average and peak deviations.
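The six measures can be illustrated with a small Python sketch for one harmonic. The exact formulas monan uses are not given here, so the following are assumptions: deviations are taken from the expected frequency k * fa, the average is a signed mean, the peak is the largest absolute deviation, and the normalized forms are percentages of k * fa. The function name and data are hypothetical.

```python
import math


def deviation_stats(freqs, k, fa):
    """Deviation statistics for harmonic k, given its frequency (Hz) in each frame."""
    fk = k * fa                                   # expected centre frequency
    devs = [f - fk for f in freqs]
    afd = sum(devs) / len(devs)                   # signed average deviation
    rmsd = math.sqrt(sum(d * d for d in devs) / len(devs))
    peak = max(abs(d) for d in devs)
    pct = 100.0 / fk                              # converts Hz to percent of fk
    return {
        "AFD": afd, "NAFD": afd * pct,
        "RMSD": rmsd, "NRMSD": rmsd * pct,
        "PEAK": peak, "NPEAK": peak * pct,
    }


# Second harmonic of a 220 Hz analysis, four frames of made-up data
stats = deviation_stats([441.0, 439.0, 442.0, 440.0], k=2, fa=220.0)
for name, value in stats.items():
    print(name, round(value, 4))
```

In this example the RMS deviation lies between the average and peak values, in line with the rule of thumb above.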
In the original form of this program, all this information is written to the console. The amount of text (especially for data with a large number of harmonics) can be too great for a standard Windows console (though both Windows 95 and NT consoles can be reconfigured to contain more text - see Installing SNDAN); in this version it is instead written to a text file, and only the AFD data is written to the screen.
The command identifies and writes the analysed fundamental pitch of the sound (corresponding to the first value of the AFD data) to the console. If the difference is significant, it may be worthwhile re-analysing the sound with the new fundamental frequency.
ft : plot harmonic (or wt.ave fundamental) frequency vs. Time
One of the MONAN commands likely to be used frequently. This plots the frequency of a nominated harmonic, or the averaged (‘weighted’) fundamental frequency, over time. The latter therefore offers an estimate of the pitch (possibly changing) of the sound. Vibrato, for example, can be clearly seen. The average is taken because many sounds, even those usually thought to be ‘harmonic’, contain inharmonic components (such as the piano, where many harmonics are sharp with respect to the fundamental); by taking the average of these, often a more accurate calculation of the true fundamental frequency can be made, than by inspecting the first harmonic alone. Vibrato tends to be synchronous across most, if not all, harmonics, so is not lost through this calculation. More random variations on the other hand will tend to get ironed out, so that their presence does not unduly affect the calculation.
In harmonic mode, ft is reasonably successful at plotting large pitch sweeps in a sound (portamento, glissando, etc).
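The weighted average fundamental can be sketched in Python as below. The weighting is an assumption made for illustration (each harmonic's estimate fk / k weighted by its amplitude); the formula used by monan is not given in this section, and the function name and data are hypothetical.

```python
def weighted_fundamental(amps, freqs):
    """Amplitude-weighted average of the per-harmonic fundamental estimates f_k / k."""
    num = den = 0.0
    for k, (a, f) in enumerate(zip(amps, freqs), start=1):
        num += a * (f / k)      # harmonic k implies a fundamental of f_k / k
        den += a
    return num / den if den > 0.0 else 0.0


# One frame: the second harmonic is slightly sharp (884 Hz instead of 880 Hz)
print(weighted_fundamental([1.0, 0.5], [440.0, 884.0]))
```

Because strong harmonics dominate the average, a weak but badly mistuned partial has little influence on the estimate.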
‘f’ : set a range comfortably below and above the expected frequency, fk.
‘n’ : use 1.0 unless you know the deviation is likely to be small.
‘c’ : enter the maximum deviation expected. 1200 Cents represents one octave; most deviations will be comfortably less than this. Most vibrato is less than one semitone up or down, so a range of 100 may be sufficient.
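For reference, the relationship between Cents and frequency can be sketched in Python. This uses the standard definition (1200 Cents per octave); the function names are illustrative and not part of monan.

```python
import math


def cents(f, f_ref):
    """Deviation of frequency f from f_ref in Cents (1200 Cents = one octave)."""
    return 1200.0 * math.log2(f / f_ref)


def from_cents(f_ref, c):
    """Frequency lying c Cents above (or, if negative, below) f_ref."""
    return f_ref * 2.0 ** (c / 1200.0)


print(cents(880.0, 440.0))                   # 1200.0, one octave
print(round(from_cents(440.0, 100.0), 2))    # 466.16, one semitone above 440 Hz
print(round(from_cents(440.0, -100.0), 2))   # 415.3, one semitone below 440 Hz
```

A range of 100 Cents around a 440 Hz harmonic therefore covers roughly 415 Hz to 466 Hz.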
The sonogram plots the frequencies of all harmonics against time, representing loudness through a colour or grey-scale gradient. Though this representation of loudness shows relatively little detail visually (by comparison with the 3D ‘waterfall’ plot, for example), it is nevertheless vivid, and gives a satisfyingly complete view of a sound. Together with that representation, it has understandably become one of the most popular ways of displaying the time-varying timbral characteristics of a sound.
A sonogram is computationally demanding, so unless your computer is fast (Pentium II standard or better), there can be an appreciable delay while the image is calculated.
Note that though in a sense the vertical axis represents the spectrum of the sound, the frequency profiles of the analysed harmonics are plotted, rather than the raw FFT-based data more commonly used for sonogram displays. Harmonic frequency deviations are clipped to +/- 0.5 * fa.
For the colour plot, white represents maximum, and purple the minimum, while black is used for data below the set threshold.
For the grey-scale plot, the opposite applies - black represents maximum.
Like ft, ftc is reasonably successful at plotting large pitch sweeps in a sound.
A plot of fundamental frequency against time, similar to many others in monan, but with the vertical scale marked in terms of musical pitch (based of course on A=440Hz - see Pitch Table). The vertical scale is logarithmic, and is marked in semitone note names and octaves, so that pitch deviations, in terms of semitones, are easily seen.
pt tracks the weighted fundamental frequency fa calculated for each frame. Thus it is not suitable for tracking wide variations of pitch, such as portamento, that exceed the range of fa. The presence of an abrupt glitch or change of direction in the plot is an indicator that a different command (such as ft or ftc) may more accurately represent the characteristics of the file. However, normal vibrato, discrete pitch changes, and the small pitch fluctuations characteristic of singers and instrumentalists, will be plotted very precisely. In these cases, glitches can arise naturally at the onset and termination of a note.
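The mapping from frequency to a semitone scale based on A=440Hz can be sketched in Python. The octave numbering used here (A4 = 440 Hz, C4 = middle C) is an assumption and may differ from the Pitch Table; the function name is hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq, a4=440.0):
    """Return (note name with octave, deviation in Cents) for a frequency in Hz."""
    semitones = 12.0 * math.log2(freq / a4)   # distance from A4 in semitones
    nearest = round(semitones)
    midi = 69 + nearest                       # MIDI-style numbering, A4 = 69
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, 100.0 * (semitones - nearest)


for f in (440.0, 261.63, 452.0):
    name, deviation = nearest_note(f)
    print(f, name, round(deviation, 1))
```

The second value returned shows how far the frequency lies from the nearest note, which is what appears as a small excursion from a grid line on a semitone-scaled plot.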
lv : list parameter value (amp, freq) for a given time
A basic utility command to report the exact amplitude and frequency of a given harmonic at a given time. Frequency can be expressed as absolute (Hz), normalized deviation (percentage) or in Cents. The command repeats, so that any number of reports can be requested. Enter a negative value for time to quit the command and return to the main prompt. Amplitude is always expressed in terms of 16bit samples: peak = 32767.
frequency:
- mode (Hz , normalized deviation, Cents deviation)
Interpretation of the deviation modes requires that you know what the centre frequency of the requested partial is. This will be the harmonic number * fa.
fi : plot time-varying inharmonicity of a harmonic vs. Time
Inharmonicity (the degree to which harmonics deviate from their centre frequency) is measured relative to the weighted average frequency (presumed to be the true fundamental pitch), so that variations such as vibrato do not affect the calculation. Any partials which do not track the fundamental will show up clearly. Very small differences may be due more to analysis artefacts than to the nature of the source. Inharmonicity can occur anywhere in a sound, but especially during rapid attacks.
This information can be useful not only in generally characterizing a sound, but in synthesis, where, for example, it can guide the control of the modulation index for FM synthesis.
na : normalize harmonic amplitudes to achieve constant RMS amplitude.
The effect of this command depends largely on the dynamic character of the analysis data. It first finds the average rms amplitude of the sound, and then scales the amplitude envelope of each harmonic, frame by frame, to create a sound of virtually constant amplitude (corresponding to the overall average level), while mostly preserving the relative balance between harmonics. The effect can range from subtle to extreme. It can be considered an exaggerated form of compression, with quiet sections being boosted sometimes by large factors. One consequence of this is that the quantization noise (e.g. at the very end of a decay tail) that is normally almost inaudible will be raised to the level of the rest of the sound. Conversely, previously well-defined peaks will be reduced to the average level.
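The frame-by-frame scaling described above can be sketched in Python. This is an illustration rather than monan's own code: the rms of a frame is assumed to be the square root of the sum of squared harmonic amplitudes, and the analysis data is represented as a plain list of frames, each a list of harmonic amplitudes.

```python
import math


def frame_rms(frame):
    """Assumed rms measure: square root of the sum of squared harmonic amplitudes."""
    return math.sqrt(sum(a * a for a in frame))


def normalize_to_constant_rms(frames):
    """Scale every frame so that its rms equals the average rms of the sound."""
    rms_values = [frame_rms(f) for f in frames]
    average = sum(rms_values) / len(rms_values)
    out = []
    for frame, rms in zip(frames, rms_values):
        gain = average / rms if rms > 0.0 else 0.0   # silent frames stay silent
        out.append([a * gain for a in frame])
    return out


frames = [[3.0, 4.0], [6.0, 8.0], [0.3, 0.4]]    # loud middle, very quiet tail
for frame in normalize_to_constant_rms(frames):
    print([round(a, 3) for a in frame], round(frame_rms(frame), 3))
```

The quiet final frame receives a gain of about ten, which is the mechanism by which low-level noise in a decay tail is raised to the level of the rest of the sound.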
By following this command with rm, the data is transformed to a fixed timbre (constant amplitudes for each harmonic, as the average rms level is now constant), with the harmonic amplitudes reflecting those of the original sound. Any amplitude-based vibrato present in the original is thus eliminated. One interpretation of this process might therefore be as deconstruction, or ‘backwards synthesis’ - the new data represents the generalized timbral structure of the sound (as if created by a weighted bank of oscillators) before amplitude enveloping or filtering is applied.
Note that frequency deviations are unaffected by this command.
There are no parameters.
rm : replace each harmonic amplitude by the rms envelope scaled by the average amplitude of the harmonic
The effect of this command is to remove time-varying timbral changes. These time-varying timbral features are often central to the identity of a sound (one thinks especially of instruments such as the sitar). The amplitude envelope of each harmonic is replaced by one, suitably scaled, that tracks the overall rms envelope. Thus, although the dynamically varying aspects of the sound have been removed, the general fixed spectrum remains, as do all individual frequency deviations. The distinctive contribution these make to the sound can thus be distinguished. See also the notes for na, above.
There are no parameters.
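A minimal Python sketch of the rm idea follows. The scaling is an assumption chosen so that each harmonic keeps its original average amplitude (weight = average harmonic amplitude / average rms); monan's exact scaling is not documented here. The rms measure and the list-of-frames layout are likewise illustrative.

```python
import math


def frame_rms(frame):
    """Assumed rms measure: square root of the sum of squared harmonic amplitudes."""
    return math.sqrt(sum(a * a for a in frame))


def replace_with_rms_envelope(frames):
    """Give every harmonic the shape of the rms envelope, keeping its average level."""
    n_frames = len(frames)
    n_harmonics = len(frames[0])
    rms = [frame_rms(f) for f in frames]
    mean_rms = sum(rms) / n_frames
    if mean_rms == 0.0:
        return [[0.0] * n_harmonics for _ in frames]
    # One fixed weight per harmonic: its average amplitude relative to the average rms
    weights = [sum(f[k] for f in frames) / n_frames / mean_rms
               for k in range(n_harmonics)]
    return [[r * w for w in weights] for r in rms]


frames = [[3.0, 4.0], [8.0, 6.0]]     # the balance between harmonics changes over time
for frame in replace_with_rms_envelope(frames):
    print([round(a, 3) for a in frame])
```

After the replacement the ratio between the two harmonics is the same in every frame, which is what is meant by a fixed spectrum.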
sa : scale harmonic amplitudes to achieve a given spectrum at a certain time
This command scales all harmonic amplitudes according to a defined spectrum ‘template’. The template could be thought of as controlling a set of fixed volume controls, one for each harmonic. The scale factors are calculated so that the amplitudes specified in the template are matched exactly at the specified time. The result is a ‘hybrid’ sound, which retains the time-varying harmonic evolution of the original, but with harmonic amplitudes scaled according to the template. You might determine this spectrum from examination of some other sound, e.g. using the command lv, or arbitrarily, to define some abstract spectrum to which both sounds will be scaled. It would also be possible, based on the information obtained using lv (applied to the current sound), to apply changes to only a few harmonics, to impart formant or other colouration.
The command scales the harmonic amplitudes so that the rms level of the original sound at the nominated time is preserved. The maximum amplitude for the whole sound may thus be significantly different. If it has become louder, it may exceed the range of a 16bit sample. While this is not especially important while working within SNDAN, as all data is handled in floating-point, it clearly matters when the sound is synthesized. The sy command allows a scaling factor to be used, to bring the sound within limits. The at command can be used to find the rms level of the sound (best to use the dB option) - if the sound does not exceed 90dB no scaling factor is needed. Note that the sy option to write floating-point samples enables samples to be written to file without clipping (applies particularly to WAV format).
Added for the Windows version: A range can be set for the first and last harmonic to modify. This enables adjustments to be made to a few harmonics (e.g. to create or remove a formant region). The prompt for amplitude values reports the current value. Note that other harmonics will still be rescaled, to reach the new target rms value.
fa : replace frequencies with average values (fixed inharmonics)
This command eliminates time-varying frequency deviations in the harmonics. It will, for example, remove pitch-based vibrato, leaving only the amplitude components, if any. It is important to note that because each frequency envelope is replaced with its constant average deviation, the new sound may well still contain significant, if fixed, inharmonicity. Although the harmonic amplitudes are unmodified, new amplitude artefacts may arise through changed or new interference patterns (beats).
There are no parameters.
fh : replace frequencies with fixed harmonic values
This command removes all harmonic deviations, and sets each harmonic to its true centre frequency (based on fa). This includes the fundamental, so that the perceived pitch may change significantly. Amplitudes are unaffected, so that much of the general timbral character will remain.
There are no parameters.
fc : replace frequencies with harmonics of the wt.ave fund. frequency
This creates a fully harmonic tone, all harmonics following the (time-varying) fundamental frequency. Amplitude data is unchanged.
There are no parameters.
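The difference between the three frequency replacements above can be sketched in Python. These are illustrations only, not monan's code: frequency data is represented as a list of frames, each a list of harmonic frequencies in Hz, the average in the first function is assumed to be a plain mean over time, and the per-frame fundamental passed to the third function is assumed to have been computed elsewhere. Note that in this sketch analysis_freq stands for the symbol fa, not the command of the same name.

```python
def fix_to_average(freqs):
    """'fa' command idea: each harmonic is held at its own average frequency."""
    n_frames = len(freqs)
    means = [sum(frame[k] for frame in freqs) / n_frames
             for k in range(len(freqs[0]))]
    return [list(means) for _ in freqs]


def fix_to_harmonics(freqs, analysis_freq):
    """'fh' command idea: harmonic k is held at exactly k * fa."""
    return [[(k + 1) * analysis_freq for k in range(len(frame))]
            for frame in freqs]


def follow_fundamental(freqs, fundamental):
    """'fc' command idea: harmonic k tracks k * (time-varying fundamental)."""
    return [[(k + 1) * f0 for k in range(len(frame))]
            for frame, f0 in zip(freqs, fundamental)]


freqs = [[219.0, 442.0], [221.0, 444.0]]          # two frames, two harmonics
print(fix_to_average(freqs))                      # fixed, but still inharmonic
print(fix_to_harmonics(freqs, 220.0))             # fixed and exactly harmonic
print(follow_fundamental(freqs, [219.5, 221.5]))  # harmonic, but still moving
```

The first result keeps the second partial sharp (443 Hz against an average fundamental of 220 Hz), which is the fixed inharmonicity mentioned in the description of the fa command.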
ff : replace frequencies with harmonics
based on a particular harmonic.
This replaces each harmonic frequency envelope with envelopes that track that of the nominated harmonic. Clearly the result will be dependent on the characteristics of the source; selecting a harmonic with many frequency deviations can impart a granular or warbling quality to the sound. With sounds where some harmonics exhibit independent behaviour (e.g for some bell sounds), the new sound will be markedly different, while retaining much of the original, as amplitude envelopes are unaffected.
The harmonic frequencies (deviations) are replaced with exact harmonics based on fa (rather than on the wt.ave pitch). Harmonic amplitudes are unaffected. Note that there is no mechanism for controlling the vibrato (random features, speed changes, etc), so this command is more suited to test and experimentation tasks than for full synthesis applications.
rd : reduce duration of data by interpolating the ‘steady-state’
This command implements time-contraction (i.e. without changing pitch) while leaving the attack and decay sections of the sound unaffected. Although it is intended primarily for use with single pitched sounds, in practice it can be effective with a wide variety of input sounds. Two time points are required, marking the end of the attack segment and the start of the decay segment. These then define the steady-state portion of the sound, to which time-contraction is applied. To determine these times precisely, use the at command. To increase duration (time-stretching), the programs addsyn1 or addsyn2 can be used.
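The idea of contracting only the steady-state portion can be sketched in Python as follows. The sketch works on one envelope (a list of per-frame values) and takes the two boundaries as frame indices rather than times; that conversion, the linear resampling and the name reduce_duration are all assumptions made for illustration, and monan's own interpolation may differ.

```python
def reduce_duration(frames, attack_end, decay_start, new_middle_len):
    """Shorten only frames[attack_end:decay_start] by linear interpolation."""
    middle = frames[attack_end:decay_start]
    if not 2 <= new_middle_len <= len(middle):
        raise ValueError("new length must be between 2 and the steady-state length")
    step = (len(middle) - 1) / (new_middle_len - 1)
    shortened = []
    for i in range(new_middle_len):
        pos = i * step                       # fractional position in the old middle
        lo = int(pos)
        hi = min(lo + 1, len(middle) - 1)
        frac = pos - lo
        shortened.append(middle[lo] + frac * (middle[hi] - middle[lo]))
    # Attack and decay frames are copied through unchanged.
    return frames[:attack_end] + shortened + frames[decay_start:]


envelope = [0.0, 5.0, 10.0, 10.0, 12.0, 14.0, 16.0, 18.0, 9.0, 0.0]
print(reduce_duration(envelope, 2, 8, 3))
```

In this example the first two frames (attack) and the last two (decay) survive untouched, while the six steady-state frames are reduced to three.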
ri : reduce spectral irregularity by averaging three adjacent harmonics
See the documentation for si for a description of spectral irregularity.
The process is a special application of simple filtering, replacing the amplitude of each harmonic, frame by frame, by the average of itself and its two neighbours. As such, it can be applied multiple times for progressive smoothing of the spectrum.
Depending on the character of the source, this command may have the effect of smoothing the amplitude envelopes of harmonics, similar to the effect of sm. This is seen best using the 3D plot command pp; the ‘surface’ is smoother overall, with reduced peaks and shallower troughs. See the variant command rip, below, for a method of avoiding this effect.
It will be natural to make continued ‘before and after’ comparisons by use of the si command (use the un command to return to the original data each time). It is important to take note of the vertical scale, as the irregularity of the plot may not seem to be much diminished, until the difference in vertical scale is taken into account!
There are no parameters.
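The three-point averaging described for ri can be sketched in Python for a single frame as shown below. The treatment of the first and last harmonics (averaging only the values that exist) is an assumption, as is the function name; monan may handle the edges differently.

```python
def reduce_irregularity(amps):
    """Average each harmonic with its neighbours; amps[0] is harmonic 1."""
    n = len(amps)
    out = []
    for k in range(n):
        lo = max(k - 1, 0)
        hi = min(k + 1, n - 1)
        neighbours = amps[lo:hi + 1]   # two values at the edges, three elsewhere
        out.append(sum(neighbours) / len(neighbours))
    return out


frame = [9.0, 3.0, 9.0, 3.0]           # a deliberately jagged spectrum
once = reduce_irregularity(frame)
twice = reduce_irregularity(once)       # repeated application smooths further
print(once)   # [6.0, 7.0, 5.0, 6.0]
print(twice)  # [6.5, 6.0, 6.0, 5.5]
```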
rip : reduce spectral irregularity by averaging three adjacent harmonics - preserves peaks
Otherwise identical to ri, above, this variant attempts to preserve local peaks in the spectrum. There is no amplitude interpolation applied when a peak is detected; consequently amplitude changes at such points can be abrupt. You can check this by using pp in Decibel mode. It will be necessary in some cases to follow this command with sm (amplitude mode; use a cut-off between 10 and 20) to remove the glitches.
There are no parameters.
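A peak-preserving variant of three-point averaging might look like the following Python sketch for one frame. The peak test used here (a harmonic strictly louder than both of its neighbours) and the edge handling are assumptions; the criterion monan actually uses is not documented in this section.

```python
def reduce_irregularity_keep_peaks(amps):
    """Three-point average, except that local spectral peaks are left untouched."""
    n = len(amps)
    out = []
    for k in range(n):
        left = amps[k - 1] if k > 0 else None
        right = amps[k + 1] if k < n - 1 else None
        is_peak = (left is not None and right is not None
                   and amps[k] > left and amps[k] > right)
        if is_peak:
            out.append(amps[k])          # assumed peak rule: keep as is
        else:
            vals = [v for v in (left, amps[k], right) if v is not None]
            out.append(sum(vals) / len(vals))
    return out


frame = [2.0, 4.0, 12.0, 4.0, 2.0]
print(reduce_irregularity_keep_peaks(frame))  # [3.0, 6.0, 12.0, 6.0, 3.0]
```

The peak at harmonic 3 keeps its full level while its neighbours are averaged, which is the kind of abrupt step that a following sm pass is meant to remove.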
sm : smooth (apply low pass filter to) harmonic amplitudes or frequencies
The primary application of this command is in removing small-scale perturbations in amplitude or frequency profiles, while preserving the general shape. It is therefore reasonable to refer to the process as low-pass filtering, though it is applied in a possibly unfamiliar way; there is no progressive reduction of level applied to high-frequency components. Indeed, this command offers an excellent way of achieving broad-band noise reduction (which cannot be achieved with a time-domain filter); this is especially useful for cleaning up poor-quality samples, removing breath and bow noise, and so on.
With the exception of noise components, neither amplitude nor frequency varies at a high rate in most strongly pitched sounds. Rates for vibrato and tremolo, for example, are typically between 3 and 8 Hertz. Audible beats between slightly dissonant harmonics (as in the case of some struck and plucked string tones) can exist at even lower frequencies. Thus, while experimentation is clearly necessary, depending on the nature of the sound, the cut-off frequency parameter will typically lie between 5 and 20 Hz. Higher values might be useful for removing high-frequency flutter and noise components without reducing the dynamic character too much.
Low values applied to amplitudes will smooth out attack transients, and reduce tremolo; setting a non-zero offset time will preserve the attack while processing the rest of the sound. Similarly, low values applied to frequencies will reduce, and maybe even eliminate, vibrato. However, the immediate use of extremely low settings can produce unexpected behaviour, due to the relatively simple nature of the filter. It is better in this case to apply the filtering over two (or more) iterations, selecting a slightly lower frequency each time. For complete removal of amplitude and frequency deviations, other commands, such as rm and fa, should be used.
In addition to the pp command, the af command is useful for observing the global effect of sm on frequency deviations.
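To make the idea of low-pass filtering an envelope (rather than the audio signal) concrete, here is a minimal Python sketch using a one-pole filter. The filter type, the frame_rate parameter (analysis frames per second) and the offset_frames argument standing in for the offset time are all assumptions; the text above says only that monan's filter is relatively simple.

```python
import math


def smooth(values, cutoff_hz, frame_rate, offset_frames=0):
    """One-pole low-pass over one envelope; frames before offset_frames are kept."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / frame_rate)
    out = list(values[:offset_frames])   # e.g. the attack, copied unchanged
    if offset_frames >= len(values):
        return out
    state = values[offset_frames]
    for v in values[offset_frames:]:
        state += alpha * (v - state)     # move part of the way towards the input
        out.append(state)
    return out


# Hypothetical amplitude envelope: a steady level with frame-to-frame flutter.
env = [1000.0 + (50.0 if i % 2 else -50.0) for i in range(50)]
smoothed = smooth(env, cutoff_hz=10.0, frame_rate=100.0)
print(max(env) - min(env), max(smoothed[10:]) - min(smoothed[10:]))
```

Lower cut-off values give a smaller alpha and therefore heavier smoothing, which is why very low settings also flatten attack transients unless an offset is used.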
This command reads a text data file of n-segment breakpoints, representing either amplitude envelopes or frequency deviations (from the centre frequency of the harmonic), for one or more harmonics. The breakpoint data for each harmonic must occupy a single line of text. Not all harmonics need to be defined, but they must be listed in ascending order. Existing data not overwritten by the new data is preserved. Thus, to completely replace the existing data, e.g. for additive synthesis, the duration of each breakpoint line must correspond to the duration of the current data. Conversely, inserting new data into the middle of a single harmonic is also possible. The command rd can be used to compress the duration of existing data to suit an input data file.
The format for each harmonic is:
<harmonic no.> <number of points> <time1 val1> <time 2 val2>….<timeN valN>
Times must be given in seconds, and must be increasing. Number of points, and end times for different harmonics, do not need to be the same. The number of points entered must match the count given in the second field. Values are raw amplitudes (0 to 32767) for amplitude files, or frequency deviations (positive or negative) relative to the centre frequency of the harmonic (= fa * harmonic no.).
Comment lines can be used, starting with the # character.
monan requires that the maximum duration defined by the breakpoints is not greater than that of the current analysis data.
Existing analysis data is transferred to backup memory before the file is read. The command will abort if any error is found in the text file, automatically restoring the original data. Hence, any previous backup data will be lost.
Examples.
The following simple example defines amplitude data for three harmonics (1, 3 and 5) over 1.4 seconds:
#hno npts t1 v1 t2 v2 t3 v3 t4 v4 t5 v5
1 4 0 0 0.5 20000 1.25 20000 1.4 0
3 5 0 0 0.25 10000 1.05 3000 1.2 3000 1.4 0
5 3 0 0 0.75 2500 1.4 0
Matching data for frequency, applying a mild stretching of the harmonic frequencies, could be as follows (the 5th harmonic lies exactly on its centre frequency):
#hno npts t1 v1 t2 v2 t3 v3 t4 v4 t5 v5
1 5 0 0.05 0.5 0.1 1.2 0 1.3 -0.02 1.4 -0.06
3 5 0 -0.01 0.2 -0.14 0.9 -0.14 1.1 0.05 1.4 0.15
5 2 0 0 1.4 0
monan does not reject extreme deviation values (greater than 1.0) unless the calculated frequency would exceed the Nyquist limit; such values should nevertheless generally be avoided.
Breakpoint formats of this kind are not very readable. The presumption underlying this command (and also ih) is that for additive synthesis, data will be generated programmatically, or will be based on modifications to existing breakpoint files. monan itself does not currently support the creation of data in breakpoint format.
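Since breakpoint files are expected to be generated or checked programmatically, a small Python sketch of reading one line of the format described above may be useful. It validates the point count and the increasing times, then evaluates the envelope by linear interpolation between breakpoints. The function names are hypothetical, and linear interpolation is an assumption about how the segments are joined.

```python
def parse_breakpoint_line(line):
    """Return (harmonic, [(time, value), ...]), or None for comment/blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    harmonic, npts = int(fields[0]), int(fields[1])
    nums = [float(x) for x in fields[2:]]
    if len(nums) != 2 * npts:
        raise ValueError("point count does not match the data")
    points = list(zip(nums[0::2], nums[1::2]))
    if any(t2 <= t1 for (t1, _), (t2, _) in zip(points, points[1:])):
        raise ValueError("times must be increasing")
    return harmonic, points


def value_at(points, t):
    """Linearly interpolated value at time t; None outside the defined span."""
    if t < points[0][0] or t > points[-1][0]:
        return None
    for (t1, v1), (t2, v2) in zip(points, points[1:]):
        if t1 <= t <= t2:
            return v1 + (v2 - v1) * (t - t1) / (t2 - t1)
    return points[0][1]  # a single breakpoint


harmonic, points = parse_breakpoint_line("1 4 0 0 0.5 20000 1.25 20000 1.4 0")
print(harmonic, value_at(points, 0.25))  # 1 10000.0
```

Returning None outside the defined span mirrors the rule that existing data not covered by the breakpoints is preserved.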
This command reads a text data file of n-segment breakpoint data defining how the amplitude profile of a source or ‘driver’ harmonic can be transferred to a target harmonic. Unlike the format of as, in which the breakpoint pairs associate a value with a time, here two amplitude values are associated, the first relating to the driver harmonic, the second to the target harmonic. As no time component is involved, data files can, in principle, freely be applied to any analysis data, with the one proviso that lines with harmonic numbers beyond the maximum for the current data are ignored.
Each harmonic definition line creates a mapping (or ‘transfer’) function, mapping amplitudes of the driver harmonic in arbitrary ways to the new one. For example, a harmonic might duplicate low level amplitudes in the driver exactly, but higher amplitudes with an attenuation factor. This is the action of a compressor, with the significant difference that the transfer function (compression function) can be different for each harmonic, or apply only to selected harmonics.
Comment lines can be used, starting with the # character.
The first line must contain one number, giving the driver harmonic. If this is zero, the composite rms amplitude is used. Where a harmonic is given, the original data for that will be preserved (it must not be used in a breakpoint line), and the other specified harmonics will be created from it according to the defined transfer function.
The format of each breakpoint line is as follows:
<harmonic no> <npoints> <dr_amp1 t_amp1> <dr_amp2 t_amp2> …..
The data pairs are best understood as co-ordinates of a single line in a simple x/y plot (as implemented by the aa command). The horizontal (x) axis represents the amplitude of the driver harmonic, and the vertical (y) axis that of the target harmonic. To create a useful function, the line should not be horizontal - this would mean that all driver values in that range map to one target value (which is equivalent to a constant amplitude). A vertical line (where two consecutive driver amplitudes are equal), however, is an error and is rejected. Another way of expressing this is to say that each driver amplitude must map to exactly one target amplitude.
Amplitudes assume the usual 16bit maximum of 32767. Values above this, or below zero, are rejected.
Note that breakpoint lines are applied to the analysis data as they are read. They do not need to be in ascending order of harmonic number (though this is recommended); if a harmonic number is duplicated, the data in that line will replace any previous data.
Existing analysis data is transferred to backup memory before the file is read. The command will abort if any error is found in the text file, automatically restoring the original data. Hence, any previous backup data will be lost.
Examples.
The data file below simply copies the driver harmonic (given here as 3) to the nominated one:
#driver
3
#hno npts src1 dest1 src2 dest2
1 2 0 0 32767 32767
This is the equivalent x/y graph: a straight line rising from (0, 0) to (32767, 32767).
The following non-linear example copies and amplifies low-level driver amplitudes, but flattens high amplitudes. Inputs between 0 and 5000 are mapped to a larger range 0 to 15000, while input levels between 15000 and maximum are mapped to the restricted range 15000 to 20000:
#transfer amplified and compressed levels from harmonic 3 to harmonic 1
3
#hno npts src1 dest1 src2 dest2 src3 dest3
1 3 0 0 5000 15000 32767 20000
  |
  |                 o
t |           *
a |     o
r |    *
g |   *
e |  *
t | *
  o_________________
     driver(fig.2)
If dest2 was made equal to dest3 (both 15000,say), all source values above that would be mapped to the same output level, creating quasi-clipped regions in the new harmonic.
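The transfer function defined by one breakpoint line can be sketched in Python as a piecewise-linear mapping from driver amplitude to target amplitude. The sketch rejects vertical segments and out-of-range amplitudes as described above; the name make_transfer, the requirement that points be given in ascending driver order, and the clamping of inputs outside the defined range are assumptions.

```python
def make_transfer(points):
    """Build a mapping from [(driver_amp, target_amp), ...] breakpoints."""
    for (x1, _), (x2, _) in zip(points, points[1:]):
        if x2 <= x1:
            raise ValueError("driver amplitudes must strictly increase")
    for x, y in points:
        if not (0 <= x <= 32767 and 0 <= y <= 32767):
            raise ValueError("amplitudes must lie between 0 and 32767")

    def transfer(x):
        if x <= points[0][0]:
            return float(points[0][1])       # assumed: clamp below the range
        for (x1, y1), (x2, y2) in zip(points, points[1:]):
            if x <= x2:
                return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
        return float(points[-1][1])          # assumed: clamp above the range

    return transfer


# The non-linear example: amplify low levels, flatten high ones.
shape = make_transfer([(0, 0), (5000, 15000), (32767, 20000)])
driver_envelope = [0, 2500, 5000, 32767]
print([shape(a) for a in driver_envelope])  # [0.0, 7500.0, 15000.0, 20000.0]
```

Applying the returned function to every frame of the driver envelope yields the new envelope for the target harmonic.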
Synthesize the current analysis data to the named file. There are several options for format and transformation; all have defaults derived from the header.
Play a soundfile using the Windows default player. See Installing SNDAN for more details. While external players and editors can be used at any time, this command enables sounds to be auditioned without having to quit monan.
Load a new .an format file, replacing the current data, which is transferred to the undo buffer. As the undo buffer is only one level deep, you will probably find you want to reload the same file repeatedly, as you explore transformations.
Write current analysis data to a file in the ‘compact’ .an format.
This is the only place within the SNDAN suite of programs where an analysis file can be created, other than during the analysis process using pvan or mqan. It is especially valuable as a means of saving intermediate work, given that the undo facility (see un) is only one level deep. There is the opportunity to limit the number of harmonics written to the file.
un : single-level undo: restore previous analysis data
Prior to any operation which alters the analysis data (all transformation commands and file input commands), the data and header information are saved to an undo buffer. This command restores that data to the working buffer, effectively undoing the last command. It facilitates rapid trial and error for a given command, and reduces, while not eliminating, the need to reread the analysis file. There is no ‘redo’ facility.
There are no parameters.
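The consequence of a one-level undo buffer is easy to see in a small Python model. The class and method names below are hypothetical, and the behaviour of a second consecutive undo (here it simply restores the same backup again) is an assumption rather than documented monan behaviour.

```python
import copy


class AnalysisSession:
    """Working data plus a single backup copy, overwritten by each command."""

    def __init__(self, data):
        self.data = data
        self.backup = None

    def apply(self, transform):
        self.backup = copy.deepcopy(self.data)  # previous backup is lost here
        self.data = transform(self.data)

    def undo(self):
        if self.backup is not None:
            self.data = copy.deepcopy(self.backup)


session = AnalysisSession([1, 2, 3])
session.apply(lambda d: [x * 2 for x in d])   # first command
session.apply(lambda d: [x + 1 for x in d])   # second command
session.undo()
print(session.data)  # [2, 4, 6]: only the second command is undone
```

The original data [1, 2, 3] cannot be recovered at this point, which is why saving intermediate work to a file is recommended.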
lh : list current file name header data
Prints selected header information to the screen.
There are no parameters.
Lists all header data derived from the original .head file created with mkheader, together with the sample rate. As each element is displayed, you have the opportunity to change it, e.g. to reflect new transformations to the analysis data. The ‘comments’ field is especially relevant here. Note that much of the text information in the header is used to annotate the graphics display; this command enables you to adjust (or even eliminate) these annotations. To bypass any element without changing it, press <Enter>.
Lists all commands supported by this version of monan, as shown here.
This activates extra parameters for customizing certain aspects of the graphics display - mainly the horizontal and vertical axis limits, tick marks, and so on. This command is chiefly used by researchers, who need to customize the graphics for inclusion in papers and documentation. This is best used with care; the defaults employed by monan work very well in most cases. Note that it is not supported by all monan commands.
Technical Notes, Spectral Centroid.
There are a number of ways the spectral centroid can be calculated.
The most basic is the ‘normalized’ form:
BR = SUM(k*A[k]) / SUM(A[k])
where k = harmonic number (frequency = fundamental * k)
A[k] = amplitude of harmonic k.
The minimum value of BR here will be 1.
A common alternative form is:
fc = SUM(f[k]*A[k]) / SUM(A[k]) = SUM(k*fa*A[k]) / SUM(A[k]) = fa*BR
which says that the centroid frequency is BR * the fundamental frequency. Thus, the value for BR indicates approximately the location of the brightest harmonic (which is not necessarily the loudest - brightest is relative to frequency).
The (BR-1.0) variant was added by James Beauchamp for synthesis
work, where it was more convenient to have BR with a minimum value of zero.
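The two forms of the centroid, and the (BR-1.0) variant, can be checked numerically with a short Python sketch. The function names are hypothetical; the amplitudes are those of a single frame, listed from harmonic 1 upwards.

```python
def brightness(amps):
    """Normalized centroid BR = SUM(k*A[k]) / SUM(A[k]), with k starting at 1."""
    total = sum(amps)
    if total == 0:
        raise ValueError("all amplitudes are zero")
    return sum(k * a for k, a in enumerate(amps, start=1)) / total


def centroid_hz(amps, fa):
    """Centroid frequency fc = fa * BR."""
    return fa * brightness(amps)


print(brightness([4.0]))                     # 1.0: only the fundamental sounds
print(brightness([1.0, 1.0, 1.0]))           # 2.0: three equal harmonics
print(centroid_hz([1.0, 1.0, 1.0], 220.0))   # 440.0
print(brightness([4.0]) - 1.0)               # 0.0: the (BR-1.0) variant
```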