Technical Program

Paper Detail

Paper: SP-L7.6
Session: Quantization Techniques in Speech Coding
Time: Thursday, May 20, 17:10 - 17:30
Presentation: Lecture
Topic: Speech Processing: Speech Coding
Title: WAVEFORM QUANTIZATION OF SPEECH USING GAUSSIAN MIXTURE MODELS
Authors: Jonas Samuelsson; Royal Institute of Technology (KTH) 
Abstract: Waveform quantization of speech using Gaussian mixture models (GMMs) is proposed. GMMs are trained directly on the speech waveform, and high-dimensional vector quantizers (VQs) that efficiently exploit the redundancy are constructed based on the GMM parameters. Two types of GMMs are studied. The complexity of the scheme is independent of the rate, and the rate can be changed without retraining the VQ. A shape-gain structure improves performance and robustness. Pre- and post-processing using spectral amplitude warping further improves perceptual quality. A 32-dimensional VQ operating at 2 bits/sample reproduces speech sampled at 8 kHz with a PESQ score of 4.2.
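
As a rough illustration of the figures quoted in the abstract, the sketch below works out the bit budget implied by a 32-dimensional VQ at 2 bits/sample and 8 kHz sampling, and shows a generic shape-gain split of one frame. The abstract does not describe how the paper's shape-gain structure is defined, so the decomposition here (gain as the Euclidean norm, shape as the unit-norm remainder) is the common textbook form and only an assumption. The function names `bit_budget`, `shape_gain_split`, and `shape_gain_merge` are hypothetical.

```python
import math


def bit_budget(dim, bits_per_sample, sample_rate_hz):
    """Return (bits per vector, bit rate in bit/s) for a fixed-rate VQ."""
    bits_per_vector = dim * bits_per_sample
    bit_rate = bits_per_sample * sample_rate_hz
    return bits_per_vector, bit_rate


def shape_gain_split(frame):
    """Split a frame into a scalar gain and a unit-norm shape vector.

    Assumption: gain is the Euclidean norm of the frame. The paper may
    define its gain differently.
    """
    gain = math.sqrt(sum(s * s for s in frame))
    if gain == 0.0:
        # An all-zero frame has no direction; return a zero shape.
        return 0.0, [0.0] * len(frame)
    return gain, [s / gain for s in frame]


def shape_gain_merge(gain, shape):
    """Rebuild the frame from its gain and shape."""
    return [gain * s for s in shape]


# Figures taken from the abstract: 32 dimensions, 2 bits/sample, 8 kHz.
bits_per_vector, bit_rate = bit_budget(dim=32, bits_per_sample=2, sample_rate_hz=8000)
print(bits_per_vector, bit_rate)  # 64 16000
```

Under these assumptions each 32-sample vector is coded with 64 bits, for a total of 16 kbit/s. In a shape-gain scheme the gain and the shape are typically quantized separately, which lets the shape quantizer operate independently of the signal level.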
 

©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004