Technical Program

Paper Detail

Paper: SP-P11.13
Session: Topics in Large Vocabulary Continuous Speech Recognition
Time: Thursday, May 20, 09:30 - 11:30
Presentation: Poster
Topic: Speech Processing: Large Vocabulary Recognition/Search
Title: AN EVALUATION OF A NONLINEAR FEATURE TRANSFORMATION FOR CONVERSATIONAL SPEECH RECOGNITION
Authors: Mohamed Omar; University of Illinois at Urbana-Champaign 
 Brian Kingsbury; IBM T. J. Watson Research Center 
Abstract: We test the nonlinear symplectic maximum-likelihood transformation (SMLT) on two large-vocabulary, conversational speech recognition tasks: IBM's Superhuman test and the DARPA 2003 Rich Transcription (RT03) test. Features in these tests are computed via linear discriminant analysis (LDA) on spliced MFCC features and subsequent transformation of the projected features using either a maximum-likelihood linear transformation (MLLT), an SMLT, or both. In contrast to previous tests of the SMLT on TIMIT phone recognition with static and delta MFCCs, these tests use a more difficult task and very different features. The four results of this work are that both LDA+MLLT and LDA+SMLT systems outperform an LDA-only system; the LDA+MLLT system outperforms the LDA+SMLT system (but the MLLT has 20 times more parameters than the SMLT); small improvements over an LDA+MLLT system are obtained with an LDA+MLLT+SMLT system on well-matched material; and no improvements are obtained using two class-dependent SMLTs in an LDA+MLLT+SMLT system.
 

©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004