Technical Program

Paper Detail

Session: Pitch and Tone Based Speech Analysis
Time: Thursday, May 20, 09:30 - 09:50
Presentation: Lecture
Topic: Speech Processing: Speech Analysis
Authors: Xu Shao; University of East Anglia 
 Ben Milner; University of East Anglia 
Abstract: This work proposes a technique for reconstructing an acoustic speech signal solely from a stream of mel-frequency cepstral coefficients (MFCCs). Previous speech reconstruction methods have required an additional pitch element, but this work proposes two maximum a posteriori (MAP) methods for predicting pitch from the MFCC vectors themselves. The first method is based on a Gaussian mixture model (GMM), while the second utilises the temporal correlation available from a hidden Markov model (HMM) framework. Formal measurements of frame classification accuracy and RMS pitch error show that an HMM-based scheme with 5 clusters per state correctly classifies over 94% of frames and has an RMS pitch error of 3.1 Hz relative to a reference pitch. Informal listening tests and analysis of spectrograms reveal that speech reconstructed solely from the MFCC vectors is almost indistinguishable from speech reconstructed using the reference pitch.
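
The two evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not say exactly how the authors compute them, so the conventions here are assumptions: each pitch track holds one value per frame, 0.0 marks an unvoiced frame, a frame counts as correctly classified when the reference and the estimate agree on voiced versus unvoiced, and the RMS error is taken only over frames that both tracks mark as voiced. The function name and the sample values are hypothetical.

```python
import math


def evaluate_pitch(reference_hz, estimated_hz):
    """Return (frame classification accuracy, RMS pitch error in Hz).

    Assumed conventions (not taken from the paper): one pitch value per
    frame, and 0.0 marks an unvoiced frame.
    """
    if len(reference_hz) != len(estimated_hz):
        raise ValueError("pitch tracks must have the same number of frames")
    if not reference_hz:
        raise ValueError("pitch tracks must not be empty")

    correct = 0
    squared_errors = []
    for ref, est in zip(reference_hz, estimated_hz):
        ref_voiced = ref > 0.0
        est_voiced = est > 0.0
        # A frame is correctly classified when both tracks agree on voicing.
        if ref_voiced == est_voiced:
            correct += 1
        # Pitch error is only defined where both tracks are voiced.
        if ref_voiced and est_voiced:
            squared_errors.append((est - ref) ** 2)

    accuracy = correct / len(reference_hz)
    if squared_errors:
        rms_error = math.sqrt(sum(squared_errors) / len(squared_errors))
    else:
        rms_error = float("nan")  # no frame is voiced in both tracks
    return accuracy, rms_error


# Made-up example tracks, purely for illustration.
reference = [0.0, 120.0, 122.0, 0.0, 118.0]
estimate = [0.0, 123.0, 118.0, 110.0, 118.0]
accuracy, rms_error = evaluate_pitch(reference, estimate)
print(f"accuracy={accuracy:.2f}, rms={rms_error:.2f} Hz")
```

Restricting the RMS error to frames voiced in both tracks keeps voicing mistakes from inflating the pitch error; they are counted by the classification accuracy instead.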


©2015 Conference Management Services, Inc. -||- Last updated Wednesday, April 07, 2004