Technical Program

Paper Detail

Paper: SP-P16.4
Session: Speech Modeling for Robust Speech Recognition
Time: Friday, May 21, 15:30 - 17:30
Presentation: Poster
Topic: Speech Processing: Robust Speech Recognition
Title: BAYESIAN DURATION MODELING AND LEARNING FOR SPEECH RECOGNITION
Authors: Jen-Tzung Chien; National Cheng-Kung University 
 Chih-Hsien Huang; National Cheng-Kung University 
Abstract: We present Bayesian duration modeling and learning for speech recognition under nonstationary speaking rates and noise conditions. Gaussian, Poisson, and gamma distributions are investigated as duration models. A maximum a posteriori (MAP) estimate of the gamma duration model is developed. To support sequential learning, we adopt a Poisson duration model with a gamma prior density, which belongs to the conjugate prior family. When adaptation data are observed sequentially, the resulting gamma posterior density offers two advantages. First, it determines the optimal quasi-Bayes (QB) duration parameter, which can be merged into HMMs for speech recognition. Second, it provides the updating mechanism for the gamma prior statistics used in sequential learning. The EM algorithm is applied for parameter estimation. In experiments, the proposed Bayesian approaches significantly improve speech recognition performance on Mandarin broadcast news. Batch learning is investigated for the MAP duration model and sequential learning for the QB duration model.
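The conjugacy mentioned in the abstract can be illustrated with a minimal sketch. It assumes the state durations are directly observed, which is a simplification: the paper applies the EM algorithm because durations inside an HMM are hidden. The class name GammaPrior, its method names, and the sample durations are hypothetical and chosen only for illustration. With a Gamma(shape, rate) prior on the Poisson rate, observing n durations that sum to S gives a Gamma(shape + S, rate + n) posterior, so the posterior after one batch can serve as the prior for the next.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class GammaPrior:
    """Gamma(shape, rate) belief over the rate of a Poisson duration model.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    shape: float
    rate: float

    def update(self, durations: Iterable[int]) -> "GammaPrior":
        """Return the posterior after observing a batch of durations (in frames).

        Standard Poisson-gamma conjugacy: the shape grows by the sum of the
        observations and the rate grows by the number of observations.
        """
        observed = list(durations)
        return GammaPrior(self.shape + sum(observed), self.rate + len(observed))

    def mode(self) -> float:
        """Posterior mode of the Poisson rate (defined for shape >= 1)."""
        if self.shape < 1.0:
            raise ValueError("mode is only defined here for shape >= 1")
        return (self.shape - 1.0) / self.rate

    def mean(self) -> float:
        """Posterior mean of the Poisson rate."""
        return self.shape / self.rate


if __name__ == "__main__":
    prior = GammaPrior(shape=2.0, rate=1.0)  # assumed initial hyperparameters

    # Sequential learning: each posterior becomes the prior for the next batch.
    sequential = prior.update([3, 4, 5]).update([6, 2])

    # Batch learning: all adaptation data processed at once.
    batch = prior.update([3, 4, 5, 6, 2])

    print(sequential == batch)  # True: both give Gamma(22, 6)
    print(sequential.mode())    # 3.5
```

In this fully observed setting the sequential and batch updates give the same hyperparameters, which is what makes a conjugate prior convenient for sequential learning. In an HMM the observed counts would be replaced by expected sufficient statistics, so the match would only be approximate.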
 

©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004