Technical Program

Paper Detail

Paper:	SP-P11.5
Session:	Topics in Large Vocabulary Continuous Speech Recognition
Time:	Thursday, May 20, 09:30 - 11:30
Presentation:	Poster
Topic:	Speech Processing: Large Vocabulary Recognition/Search
Title:	GENERATING AND EVALUATING SEGMENTATIONS FOR AUTOMATIC SPEECH RECOGNITION OF CONVERSATIONAL TELEPHONE SPEECH
Authors:	Sue Tranter; Cambridge University
	Kai Yu; Cambridge University
	Gunnar Evermann; Cambridge University
	Phil Woodland; Cambridge University
Abstract:	Speech recognition systems for conversational telephone speech require the audio data to be automatically divided into regions of speech and non-speech. The quality of this audio segmentation affects the recognition accuracy. This paper describes several approaches to segmentation and compares the resulting recogniser performance. It is shown that using Gaussian Mixture Models outperforms an energy-detection method, and that using the output from the speech recogniser itself increases performance further. An upper bound on possible performance was obtained by deriving a segmentation from a forced alignment of the reference words, and this outperformed using manually marked word times. Finally, the correlation between an appropriately defined segmentation score and WER is shown to be over 0.95 across three data sets, suggesting that segmentations can be evaluated directly without the need for full decoding runs.
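
For readers unfamiliar with the energy-detection baseline mentioned in the abstract, the general idea can be sketched as follows. This is a minimal illustration only, not the method used in the paper: the function names `frame_energies` and `energy_segments`, the non-overlapping framing, and the fixed threshold are all assumptions made for the example.

```python
def frame_energies(samples, frame_len):
    """Mean-square energy of consecutive non-overlapping frames.

    Trailing samples that do not fill a whole frame are ignored.
    """
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def energy_segments(samples, frame_len, threshold):
    """Label frames as speech when their energy exceeds `threshold`.

    Returns a list of (start_frame, end_frame) pairs, end exclusive.
    The fixed threshold is a hypothetical simplification.
    """
    energies = frame_energies(samples, frame_len)
    segments = []
    start = None
    for idx, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = idx                      # a speech region opens here
        elif energy <= threshold and start is not None:
            segments.append((start, idx))    # the region closes before this frame
            start = None
    if start is not None:                    # audio ended inside a speech region
        segments.append((start, len(energies)))
    return segments


# Toy signal: 20 silent samples, 30 loud samples, 10 silent samples.
toy = [0.0] * 20 + [1.0] * 30 + [0.0] * 10
segments = energy_segments(toy, frame_len=10, threshold=0.5)
```

On this toy signal the loud stretch spans frames 2 to 4, so a single segment covering those frames is returned. A practical segmenter would typically add smoothing and padding around each region, which this sketch omits.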


©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004