`phasevocoder`

`phasevocoder` is probably the most powerful and canonical example of sound synthesis currently provided by Marsyas. It is based on the phase vocoder implementation described by F. R. Moore in his book “Elements of Computer Music”. It is split into individual MarSystems in a modular way and can be used for pitch-shifting and/or time-scaling of both sound files and real-time input. Several variations of the algorithm proposed in the literature have been implemented and can be selected through command-line options. Familiarity with phase vocoder terminology will help in understanding their effect on the transformed sound. Some representative examples are:

- `phasevocoder foo.wav -f foo_identity.wav`
- `phasevocoder foo.wav -f foo_stretched.wav -n 2048 -w 2048 -d 256 -i 512`
- `phasevocoder foo.wav -ob -cm sorted -s 10 -p 1.5 -f foo_pitch_shifted.wav`
- `phasevocoder foo.wav -f foo_stretched.wav -n 4096 -w 4096 -d 768 -i 1024 -cm full -ucm identity_phaselock`
- `phasevocoder foo.wav -f foo_stretched.wav -n 4096 -w 4096 -d 768 -i 1024 -cm analysis_scaled_phaselock -ucm scaled_phaselock`

In the first example the input file `foo.wav` is passed through the classic phase vocoder (overlap-add, FFT front-end and FFT back-end) without any time or pitch modification. The second example shows how time stretching can be achieved by making the analysis hop size (`-d`) and the synthesis hop size (`-i`) different. The `-n` option specifies the FFT size and the `-w` option specifies the window size. In the third example a bank of sinusoidal oscillators (`-ob`) is used instead of the FFT back-end, the analysis keeps only the 10 strongest sinusoids (`-cm sorted -s 10`), and the input is pitch shifted by a factor of 1.5. The fourth example uses identity phaselocking (`-ucm`) and the fifth example uses scaled phaselocking (`-cm` and `-ucm`) as described by Laroche and Dolson.
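The arithmetic behind the hop sizes and the pitch factor used in these examples can be checked with a few lines of Python. This is a minimal sketch, not part of Marsyas: the function names are hypothetical, and it assumes the usual phase vocoder convention that the time-stretch factor is the ratio of synthesis hop to analysis hop.

```python
import math

def stretch_factor(decimation, interpolation):
    """Assumed convention: output duration / input duration = synthesis hop / analysis hop."""
    return interpolation / decimation

def analysis_overlap(window_size, decimation):
    """Number of analysis windows covering each input sample."""
    return window_size / decimation

def pitch_factor_to_semitones(factor):
    """Convert a frequency ratio (as passed to -p) to equal-tempered semitones."""
    return 12 * math.log2(factor)

# Second example: -w 2048 -d 256 -i 512
print(stretch_factor(256, 512))        # 2.0, output is twice as long
print(analysis_overlap(2048, 256))     # 8.0

# Fourth and fifth examples: -d 768 -i 1024
print(round(stretch_factor(768, 1024), 3))

# Third example: -p 1.5
print(round(pitch_factor_to_semitones(1.5), 2))  # 7.02, roughly a perfect fifth
```

Under this assumption, equal values for `-d` and `-i` give a factor of 1, that is, no time scaling.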

- `-n --fftsize` - size of the FFT
- `-w --winsize` - size of the window
- `-v --voices` - number of voices
- `-g --gain` - linear volume gain
- `-b --bufferSize` - audio buffer size
- `-m --midi` - MIDI input port number
- `-e --epochHeterophonics` - heterophonics epoch
- `-d --decimation` - analysis hop size (decimation)
- `-i --interpolation` - synthesis hop size (interpolation)
- `-p --pitchshift` - pitch shift factor (for example 2.0 is an octave)
- `-ob --oscbank` - use bank of oscillators back-end
- `-s --sinusoids` - number of sinusoids to use if convert mode is sorted
- `-cm --convertmode` - analysis front-end mode:
  - `full`: use all FFT bins
  - `sorted`: sort FFT bins by magnitude and only use `-s` sinusoids
  - `analysis_scaled_phaselock`: compute extra analysis information for scaled phaselocking
- `-ucm --unconvertmode` - synthesis back-end mode:
  - `classic`: propagate phases for all bins
  - `loose_phaselock`: loose phaselocking as described by Puckette
  - `identity_phaselock`: pick peaks, propagate phases for the peaks and lock the regions of influence around them
  - `scaled_phaselock`: refinement that takes into account information from the previous frame
- `-on --onsets filename_with_onsets` - takes as input a simple text file with the locations of onsets. The phases are re-initialized at these onsets, and transient frames that contain an onset are not time stretched.
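The idea behind the `sorted` convert mode (rank the FFT bins by magnitude and keep only the strongest ones, as many as requested with `-s`) can be illustrated with a short Python sketch. This is an illustration under stated assumptions rather than the Marsyas implementation: `dft` and `strongest_bins` are hypothetical helper names, a naive DFT stands in for the FFT, and no analysis window is applied.

```python
import cmath
import math

def dft(frame):
    """Naive O(N^2) DFT, adequate for a short illustrative frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]

def strongest_bins(frame, num_sinusoids):
    """Return (bin, magnitude, phase) for the num_sinusoids largest-magnitude bins."""
    spectrum = dft(frame)
    half = spectrum[: len(frame) // 2 + 1]  # keep non-negative frequencies only
    ranked = sorted(range(len(half)), key=lambda k: abs(half[k]), reverse=True)
    return [(k, abs(half[k]), cmath.phase(half[k])) for k in ranked[:num_sinusoids]]

# Test frame: a sinusoid at bin 3 plus a weaker one at bin 7, 32 samples long.
N = 32
frame = [
    math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 7 * t / N)
    for t in range(N)
]

peaks = strongest_bins(frame, 2)
print([k for k, _, _ in peaks])  # [3, 7]
```

Only the selected bins would then be handed to the oscillator bank, which is why `-s` matters only when the convert mode is `sorted`.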