Paper: SP-P11.10
Session: Topics in Large Vocabulary Continuous Speech Recognition
Time: Thursday, May 20, 09:30 - 11:30
Presentation: Poster
Topic: Speech Processing: Large Vocabulary Recognition/Search
Title: THE 2003 ISL RICH TRANSCRIPTION SYSTEM FOR CONVERSATIONAL TELEPHONY SPEECH
Authors: Hagen Soltau; Interactive Systems Labs
Hua Yu; Interactive Systems Labs
Florian Metze; Interactive Systems Labs
Christian Fügen; Interactive Systems Labs
Qin Jin; Interactive Systems Labs
Szu-Chen Jou; Interactive Systems Labs
Abstract: This paper describes the ISL large vocabulary conversational telephony speech recognition system, which was tested in NIST's RT-03S ("Switchboard") evaluation. We present our experiments on improving preprocessing, acoustic modelling, and language modelling. The system features phone-dependent semi-tied full covariances, semi-tied clustering of septa-phones, clustering across phones, feature-adaptive training, robust estimation of VTLN and MLLR, as well as context-dependent interpolation of language models. We present detailed results for each stage of our multi-pass transcription scheme. System development started in 2002 with a word error rate (WER) of 35.1% on our internal 1-hour development set. The final system achieved a WER of 21.8% on the same set, a 38% relative improvement. The WER on the RT-03 CTS evaluation set is 23.4%.
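The 38% figure quoted in the abstract is a relative reduction in error rate, which can be checked from the two development-set numbers. The short sketch below is only an illustration of that arithmetic; the function name `relative_reduction` is hypothetical and not taken from the paper.

```python
def relative_reduction(baseline: float, final: float) -> float:
    """Relative reduction, in percent, from a baseline error rate to a final one.

    Hypothetical helper for illustration; both rates are given in percent.
    """
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - final) / baseline * 100.0


# Development-set error rates quoted in the abstract (percent).
dev_start = 35.1
dev_final = 21.8

improvement = relative_reduction(dev_start, dev_final)
print(f"Relative improvement: {improvement:.1f}%")
```

The result rounds to the 38% stated in the abstract. The 23.4% figure is measured on a different set (the RT-03 CTS evaluation set), so it is not directly comparable to the 35.1% starting point.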