Paper: SP-L7.6
Session: Quantization Techniques in Speech Coding
Time: Thursday, May 20, 17:10 - 17:30
Presentation: Lecture
Topic: Speech Processing: Speech Coding
Title: WAVEFORM QUANTIZATION OF SPEECH USING GAUSSIAN MIXTURE MODELS
Authors: Jonas Samuelsson; Royal Institute of Technology (KTH)
Abstract: Waveform quantization of speech using Gaussian mixture models (GMMs) is proposed. GMMs are trained directly on the speech waveform, and high-dimensional vector quantizers (VQs) that efficiently exploit the redundancy are constructed based on the GMM parameters. Two types of GMMs are studied. The complexity of the scheme is independent of the rate, and the rate can be changed without retraining the VQ. A shape-gain structure improves performance and robustness. Pre- and post-processing using spectral amplitude warping further improves perceptual quality. A 32-dimensional VQ operating at 2 bits/sample reproduces speech sampled at 8 kHz with a PESQ score of 4.2.
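The operating point quoted in the abstract (32 dimensions, 2 bits/sample, 8 kHz) can be turned into concrete numbers, and the shape-gain idea can be illustrated in a few lines. The following is a minimal sketch, not the paper's implementation: the function names are hypothetical, and taking the gain to be the Euclidean norm of the block is an assumption, since the abstract does not say how the gain is defined.

```python
import math


def rate_summary(dim, bits_per_sample, sample_rate_hz):
    """Bit budget implied by a block (vector) quantizer operating point."""
    return {
        # Every block of `dim` samples is coded with this many bits.
        "bits_per_vector": dim * bits_per_sample,
        # Non-overlapping blocks are assumed here.
        "vectors_per_second": sample_rate_hz / dim,
        "bit_rate_bps": bits_per_sample * sample_rate_hz,
    }


def shape_gain_split(block, eps=1e-12):
    """Split a block into a scalar gain and a unit-norm shape vector.

    Assumption: gain = Euclidean norm. An all-zero (silent) block is
    returned as gain 0 with an all-zero shape to avoid dividing by zero.
    """
    gain = math.sqrt(sum(x * x for x in block))
    if gain < eps:
        return 0.0, [0.0] * len(block)
    return gain, [x / gain for x in block]


def shape_gain_merge(gain, shape):
    """Inverse of shape_gain_split: rescale the shape by the gain."""
    return [gain * s for s in shape]


if __name__ == "__main__":
    print(rate_summary(dim=32, bits_per_sample=2, sample_rate_hz=8000))
    gain, shape = shape_gain_split([3.0, 4.0])
    print(gain, shape)
```

At the quoted operating point each 32-sample block receives 64 bits, giving 250 blocks per second and 16 kbit/s. An unstructured codebook at that rate would need 2^64 entries, which is far too many to store or search exhaustively, so a model-based construction such as the GMM-derived VQ described above is presumably what makes this dimension practical. In a shape-gain scheme the gain and the shape are quantized separately, so the shape quantizer does not have to cover the wide range of signal levels found in speech.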