Pitch Shifting with the VLC and VLR Codecs

Tone generator

Using a tone generator during decompression gives very fine control over the frequencies (double precision), as well as over the magnitudes and the phases.
The tones can be generated in parallel, so the generation is well suited to GPU programming.
This fine control of the frequencies makes it possible to shift frequencies in real time with very few additional operations.
This feature is especially useful in voice communications for listeners with profound hearing loss in some frequency region.
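
As an illustration of the idea, here is a minimal sketch of a tone generator in Python. It is not the codec's implementation: the function name `synthesize`, the tuple layout of a tone and the `ratio` argument are assumptions made for this example. Each tone is computed independently of the others, which is why the generation can run in parallel, and a frequency shift costs only one extra multiplication per tone.

```python
import math

def synthesize(tones, sample_rate, num_samples, ratio=1.0):
    """Sum sinusoids into one buffer (hypothetical helper, not the codec API).

    tones: iterable of (frequency_hz, magnitude, phase_rad) tuples.
    ratio: multiplicative factor applied to every frequency before synthesis.
    """
    out = [0.0] * num_samples
    for freq_hz, magnitude, phase in tones:
        # Angular step per sample; the shift is a single multiplication here.
        w = 2.0 * math.pi * freq_hz * ratio / sample_rate
        for n in range(num_samples):
            out[n] += magnitude * math.cos(w * n + phase)
    return out

# One 1000 Hz tone at 8 kHz, then the same tone shifted down by a ratio of 0.5.
original = synthesize([(1000.0, 1.0, 0.0)], 8000, 8)
shifted = synthesize([(1000.0, 1.0, 0.0)], 8000, 8, ratio=0.5)
```

Because frequencies, magnitudes and phases are plain floating-point values, they are not tied to a grid of FFT bins.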





Custom control of frequencies

It is planned to use an XML text file to define custom settings for the frequencies that you want to control:
   - Shift ratio (for pitch shifts).
   - Shift threshold (no shift below the threshold, in Hz).
   - Shift offset (for addition, in Hz).
   - Width of a band of frequencies (for composition, in Hz).
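
The first three settings can be sketched as a single function. This is only an illustration: the name `shift_frequency`, its parameter names and the order of operations (ratio first, then offset) are assumptions, since the XML format and the exact rule are not specified here. The threshold behaviour follows the description above: frequencies below the threshold are left untouched.

```python
def shift_frequency(freq_hz, semitones=0.0, threshold_hz=0.0, offset_hz=0.0):
    """Return the shifted frequency in Hz (hypothetical helper).

    semitones: shift ratio expressed in semitones (negative lowers the pitch).
    threshold_hz: frequencies strictly below this value are not shifted.
    offset_hz: value added after the ratio is applied (assumed order).
    """
    if freq_hz < threshold_hz:
        return freq_hz
    ratio = 2.0 ** (semitones / 12.0)
    return freq_hz * ratio + offset_hz

# Threshold 16 kHz, -12 semitones: 16 kHz is moved to 8 kHz, 1 kHz is kept.
high = shift_frequency(16000.0, semitones=-12, threshold_hz=16000.0)
low = shift_frequency(1000.0, semitones=-12, threshold_hz=16000.0)
```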

Composition

With composition, the shift ratio and the offset are applied to a band of frequencies, and the shifted band is added to the original frequencies.
Set the width of the band to zero for no composition, or to a strictly positive value to enable composition.
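
A minimal sketch of composition, under stated assumptions: the band is taken to start at the threshold and to span the given width (as in a band from 2000 Hz to 3000 Hz for a threshold of 2000 Hz and a width of 1000 Hz), and the offset is added after the ratio. The function name `compose` and the `(frequency, magnitude)` tuple layout are hypothetical.

```python
def compose(tones, semitones, threshold_hz, offset_hz, width_hz):
    """Add a shifted copy of one band to the original tones (sketch only).

    tones: list of (frequency_hz, magnitude) tuples.
    A width of zero means no composition; the input is then returned as is
    and the ordinary shift is left out of this sketch.
    """
    if width_hz <= 0.0:
        return list(tones)
    ratio = 2.0 ** (semitones / 12.0)
    band = [(f, m) for f, m in tones
            if threshold_hz <= f <= threshold_hz + width_hz]
    shifted = [(f * ratio + offset_hz, m) for f, m in band]
    # The original tones are kept; the shifted band is superposed on them.
    return list(tones) + shifted

tones = [(1000.0, 1.0), (2400.0, 0.5), (3500.0, 0.2)]
result = compose(tones, semitones=-12, threshold_hz=2000.0,
                 offset_hz=-500.0, width_hz=1000.0)
```

Only the 2400 Hz tone lies inside the band, so a single shifted copy is appended to the three original tones.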

Noise Reduction

The bands of the background can also be completed and used to reduce the noise contained in a signal.

The pshift.exe utility

On Windows, starting with version 7 of the codecs, a utility (pshift.exe) is provided to test the pitch-shifting functions.
Given a WAVE file, it encodes and then decodes the audio with the selected codec. During decompression, it applies the selected shift settings. The play.exe utility or another WAVE file player can be used to play the output file.
The pshift.exe utility lets you find the parameters that match your own needs. Run the utility without any parameter to list all the available parameters.

Auto-Tune feature (tone control)

It is planned to add a parameter ("note") to the pshift.exe utility.
If note=0, the shift ratio parameter is active (default).
If note=-1, the shift ratio parameter is not used. The fundamental frequency of the frame (if any) is moved to the nearest musical note, and the ratio applied to the fundamental frequency is also applied to the other frequencies.
If note=value, the shift ratio parameter is not used. The fundamental frequency of the frame (if any) is moved to the musical note equal to "value", and the ratio applied to the fundamental frequency is also applied to the other frequencies.
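
The ratio used in the two auto-tune modes can be sketched as follows. This is an assumption-laden illustration: the encoding of "value" is not specified, so the sketch treats it as a MIDI note number in equal temperament with A4 = 440 Hz (note 69). The function names are hypothetical, and the note=0 mode (plain shift ratio) is left out.

```python
import math

A4_HZ = 440.0   # assumed reference tuning
A4_MIDI = 69    # assumed note numbering (MIDI)

def note_frequency(midi_note):
    """Frequency in Hz of a note in equal temperament."""
    return A4_HZ * 2.0 ** ((midi_note - A4_MIDI) / 12.0)

def nearest_note(f0_hz):
    """Note number closest to the fundamental frequency f0_hz."""
    return round(A4_MIDI + 12.0 * math.log2(f0_hz / A4_HZ))

def autotune_ratio(f0_hz, note=-1):
    """Ratio that moves f0_hz onto the target note.

    note=-1 selects the nearest note; any other value is taken as the target.
    The same ratio would then be applied to every other frequency of the frame.
    """
    target = nearest_note(f0_hz) if note == -1 else note
    return note_frequency(target) / f0_hz

ratio_nearest = autotune_ratio(445.0)       # pulls 445 Hz down to A4
ratio_forced = autotune_ratio(220.0, 69)    # moves A3 up to A4
```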

Notes

The tone generator can be used with all the codecs.
The amount of computation grows with the number of tones to generate and the size of the audio buffers. GPU programming can be used if necessary.
The tone generator can be enabled or disabled for a codec by changing the value of a single variable.
The decimation and point selection algorithms in the frequency domain significantly reduce the amount of computation.
With the tone generator, the properties tied to the iFFT (inverse FFT), such as fast convolutions or fast 3D, no longer apply. Furthermore, the HQ codecs (VLC HQ 16 and VLC HQ 48) are no longer quasi-lossless in energy under the LTAS (Long-Term Average Spectrum) definition.

Without GPU support (which is planned), the amount of computation increases substantially for codecs using sampling rates above 16 kHz (VLC 32, VLC 48, ...).
The iFFT is therefore used for those codecs. This causes a loss of precision for the modified frequencies, which become multiples of the FFT bin width.
However, this loss is insignificant at sufficiently high frequencies.
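
The effect of snapping a frequency to the FFT grid can be estimated with a short sketch. The FFT size of 1024 used below is an assumption chosen for illustration, not the size used by the codecs, and the helper names are hypothetical. The error is expressed in cents (hundredths of a semitone).

```python
import math

def quantize_to_bin(freq_hz, sample_rate, fft_size):
    """Snap a frequency to the nearest multiple of the bin width."""
    bin_width = sample_rate / fft_size
    return round(freq_hz / bin_width) * bin_width

def error_cents(freq_hz, sample_rate, fft_size):
    """Pitch error caused by the quantization, in cents.

    Assumes freq_hz is above half a bin width, so the result is not 0 Hz.
    """
    quantized = quantize_to_bin(freq_hz, sample_rate, fft_size)
    return 1200.0 * math.log2(quantized / freq_hz)

# Bin width is 48000 / 1024 = 46.875 Hz in this example.
low_error = error_cents(110.0, 48000, 1024)      # large relative error
high_error = error_cents(10010.0, 48000, 1024)   # a few cents at most
```

The absolute error is bounded by half a bin width, so the relative error shrinks as the frequency grows.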

The shift ratio parameter is expressed in semitones:
If the shift ratio is negative, the frequencies become lower.
If the shift ratio is positive, the frequencies become higher.
If the shift ratio is zero, there is no change.
The multiplicative ratio applied to the frequencies is given by the formula:
   r = 2 ^ (s / 12) where:
      s = number of semitones.
      ^ = power.
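
The formula translates directly into code. The function name `semitones_to_ratio` is chosen for this example only.

```python
def semitones_to_ratio(semitones):
    """Multiplicative frequency ratio r = 2 ^ (s / 12)."""
    return 2.0 ** (semitones / 12.0)

octave_down = semitones_to_ratio(-12)   # 0.5: frequencies are halved
unchanged = semitones_to_ratio(0)       # 1.0: no change
octave_up = semitones_to_ratio(12)      # 2.0: frequencies are doubled
third_down = semitones_to_ratio(-4)     # about 0.794
```

Twelve semitones correspond to one octave, so a shift of -12 halves every frequency and a shift of +12 doubles it.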

The VLC HQ 48 codec seeks the greatest points up to 24 kHz. Even for listeners without hearing problems, frequencies become less and less audible above 16 kHz.
Pitch shifting makes the higher frequencies as audible as possible. For example, one can choose a threshold between 12 and 16 kHz and a shift ratio between -1 and -12 semitones.
For a threshold of 16 kHz and a shift ratio of -12 semitones (r = 0.5):
   - 16 kHz becomes 8 kHz.
   - 20 kHz becomes 10 kHz.
   - 24 kHz becomes 12 kHz.

The points retained for the compression form the spectral envelope, which modulates the default magnitudes of the shifted points.

The threshold should be greater than 1000-1500 Hz to avoid undesirable distortions.
Set threshold=0 to disable the modulation of the magnitudes by the spectral envelope and to use more sophisticated algorithms (for non-voice sounds or for audio effects, at the cost of slight buffering).

The pshift.exe utility also supports the uncompressed PCM codecs (pcm8, pcm16, pcm32 and pcm48, for the 8, 16, 32 and 48 kHz sampling rates).
For processing, the FFT and inverse FFT transforms are performed on the decoder side.





Listening Section

   
Each sample is available in WAV and MP3 format.

Male and female voices (each item below is provided for both voices)
   - Original voice, 16 kHz sampling rate.
   - Original voice, 8 kHz sampling rate.
   - After compression and decompression by the VLC codec at 8000 bps.
   - After compression and decompression by the VLC 8 codec at 12250 bps.

For each of the following settings, the voice samples are given after compression and decompression by the VLC codec at 8000 bps and by the VLC 8 codec at 12250 bps:
   - Shift of frequencies above 1500 Hz with a ratio of -2 semitones.
   - Shift of frequencies above 1500 Hz with a ratio of -4 semitones.
   - Shift of frequencies above 1500 Hz with ratio=-8 semitones and offset=-100 Hz.
   - Composition: superposition of frequencies above 2000 Hz and below 3000 Hz with ratio=-3 semitones and offset=-500 Hz (composition width=1000 Hz).

Bird sound, and female voice with piano sound (48 kHz sampling rate)
   - Original bird sound.
   - Bird sound after compression and decompression by the VLC HQ 48 codec at 96000 bps.
   - Original female voice and piano sound.
   - Female voice and piano sound after compression and decompression by the VLC HQ 48 codec at 96000 bps.
   - Shift of frequencies above 1500 Hz with ratio=-8 semitones and offset=0 Hz, after compression and decompression by the VLC HQ 48 codec at 96000 bps.
   - Shift of frequencies above 2000 Hz with ratio=-8 semitones and offset=0 Hz, after compression and decompression by the VLC HQ 48 codec at 96000 bps.

Electric guitar sounds 1 and 2 (48 kHz sampling rate, each item below is provided for both sounds)
   - Original sound.
   - After compression and decompression by the VLC HQ 48 codec at 96000 bps.
   - Shift of frequencies above 0 Hz with ratio=-4 semitones and offset=0 Hz, after compression and decompression by the VLC HQ 48 codec at 96000 bps.
   - Shift of frequencies above 0 Hz with ratio=+4 semitones and offset=0 Hz, after compression and decompression by the VLC HQ 48 codec at 96000 bps.

