Paper: SP-L8.5
Session: Acoustic Modeling: New Search Features and Supervised Training
Time: Friday, May 21, 10:50 - 11:10
Presentation: Lecture
Topic: Speech Processing: Acoustic Modeling for Speech Recognition
Title: LIGHT SUPERVISION IN ACOUSTIC MODEL TRAINING
Authors: Long Nguyen, BBN Technologies; Bing Xiang, BBN Technologies
Abstract: In this paper, we present a new light supervision method to automatically derive additional acoustic training data for broadcast news transcription systems. In this method, a subset of the TDT corpus, which consists of broadcast audio with corresponding closed-caption (CC) transcripts, is identified by aligning the CC transcripts with the hypotheses generated by lightly supervised decoding. Phrases of three or more contiguous words on which the CC transcripts and the decoder's hypotheses agree are selected. The selection yields 702 hours, or 72% of the captioned data. When adding 700 hours of selected data to the baseline 141-hour broadcast news training data set, we achieved a 13% relative word error rate reduction. The key to the effectiveness of this light supervision method is the use of a biased language model (LM) in the lightly supervised decoding. The biased LM, in which the CC transcripts are added with a heavy weight, helps select words that the recognizer might have misrecognized had it used a fair LM.
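The selection step described in the abstract can be illustrated with a minimal sketch. It assumes both the CC transcript and the decoder hypothesis are already tokenized into word lists, and it uses Python's standard `difflib.SequenceMatcher` as a stand-in for the alignment; the abstract does not say which alignment procedure the authors used. The function name `select_agreed_phrases` and the `min_len` parameter are hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher


def select_agreed_phrases(cc_words, hyp_words, min_len=3):
    """Return runs of contiguous words on which both sequences agree.

    Each result is (cc_start, hyp_start, words). Only runs of at least
    `min_len` words are kept, mirroring the "three or more contiguous
    words" criterion in the abstract. The alignment itself is difflib's,
    which is an assumption and not the authors' aligner.
    """
    matcher = SequenceMatcher(a=cc_words, b=hyp_words, autojunk=False)
    phrases = []
    for block in matcher.get_matching_blocks():
        # The final block is a zero-length sentinel; the size check skips it.
        if block.size >= min_len:
            words = cc_words[block.a:block.a + block.size]
            phrases.append((block.a, block.b, words))
    return phrases


# Toy example (invented sentences, not data from the paper).
cc = "the president said today that the economy is growing".split()
hyp = "the president said to day that the economy was growing".split()
for cc_start, hyp_start, words in select_agreed_phrases(cc, hyp):
    print(cc_start, hyp_start, " ".join(words))
```

On this toy input the sketch keeps "the president said" and "that the economy" and drops the isolated match on "growing", since a single agreeing word falls below the three-word threshold. Cutting the corresponding audio would additionally require word-level time stamps from the decoder, which this sketch does not model.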