Paper: SP-P6.9
Session: Feature Analysis for ASR, TTS, and Verification
Time: Wednesday, May 19, 09:30 - 11:30
Presentation: Poster
Topic: Speech Processing: Feature Extraction
Title: TRAPPING CONVERSATIONAL SPEECH: EXTENDING TRAP/TANDEM APPROACHES TO CONVERSATIONAL TELEPHONE SPEECH RECOGNITION
Authors: Nelson Morgan; International Computer Science Institute / University of California, Berkeley
Barry Chen; International Computer Science Institute / University of California, Berkeley
Qifeng Zhu; International Computer Science Institute
Andreas Stolcke; International Computer Science Institute / SRI International
Abstract: TempoRAl Patterns (TRAPs) and Tandem MLP/HMM approaches incorporate feature streams computed from longer time intervals than the conventional short-time analysis. These methods have been used for challenging small- and medium-vocabulary recognition tasks, such as Aurora and SPINE. Conversational telephone speech recognition is a difficult large-vocabulary task, with current systems giving incorrect output for 20-40% of the words, depending on the system complexity and test set. Training and test times for this problem also tend to be relatively long, making rapid development quite difficult. In this paper we report experiments with a reduced conversational speech task that led to the adoption of a number of engineering decisions for the design of an acoustic front end. We then describe our results with this front end on a full-vocabulary conversational telephone speech task. In both cases the front end yielded significant improvements over the baseline.