Paper: SP-P6.11
Session: Feature Analysis for ASR, TTS, and Verification
Time: Wednesday, May 19, 09:30 - 11:30
Presentation: Poster
Topic: Speech Processing: Feature Extraction
Title: FEATURE GENERATION BASED ON MAXIMUM NORMALIZED ACOUSTIC LIKELIHOOD FOR IMPROVED SPEECH RECOGNITION
Authors: Xiang Li, Carnegie Mellon University; Richard Stern, Carnegie Mellon University
Abstract: Feature representation strongly affects the performance of speech recognition systems. In this paper we focus on feature generation by linear transformation of an original log-spectral representation. While conventional linear feature generation methods generally use objective functions that are not closely related to recognition accuracy, our method seeks a transformation matrix that maximizes the normalized acoustic likelihood of the most likely state sequence for the training data, a measure that is directly related to the classification error rate in speech recognition. The transformation matrix is found by gradient ascent, with the normalized acoustic likelihood of the most likely state sequence as the objective function. Experimental results on the DARPA Resource Management (RM) corpus show that the proposed method consistently yields lower word error rates than conventional linear feature generation methods.
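The idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not give the acoustic model, the exact normalization, or the optimizer settings, so everything specific below is an assumption. The sketch assumes one Gaussian per state with identity covariance in the transformed space, equal state priors, frame-level state labels that stand in for the most likely state sequence, and synthetic data in place of log-spectral frames. The "normalized acoustic likelihood" is taken here to be the likelihood of the labeled state divided by the sum over all states. The names `objective_and_grad` and `fit_transform` are hypothetical.

```python
import numpy as np

def objective_and_grad(W, X, labels, state_means):
    """Mean log normalized likelihood of the labeled state, and its gradient in W.

    Assumption: each state is a unit-covariance Gaussian in the transformed
    space, centered at W @ state_means[s], with equal priors.
    """
    T = X.shape[0]
    diffs = X[:, None, :] - state_means[None, :, :]   # (T, S, D): x_t - m_s
    proj = diffs @ W.T                                # (T, S, K): W (x_t - m_s)
    logits = -0.5 * np.sum(proj ** 2, axis=2)         # log-likelihood up to a constant
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_post = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    J = log_post[np.arange(T), labels].mean()
    # d log_post[t, s_t] / d logits[t, s] = 1[s == s_t] - p[t, s]
    coeff = -np.exp(log_post)
    coeff[np.arange(T), labels] += 1.0
    # d logits[t, s] / dW = -(W d_ts) d_ts^T
    grad = -np.einsum('ts,tsk,tsd->kd', coeff, proj, diffs) / T
    return J, grad

def fit_transform(X, labels, n_states, n_out, steps=100, lr=0.05, seed=0):
    """Gradient ascent on W with step halving whenever a step does not improve J."""
    rng = np.random.default_rng(seed)
    state_means = np.stack([X[labels == s].mean(axis=0) for s in range(n_states)])
    W = 0.1 * rng.standard_normal((n_out, X.shape[1]))
    J, grad = objective_and_grad(W, X, labels, state_means)
    history = [J]
    for _ in range(steps):
        W_new = W + lr * grad
        J_new, grad_new = objective_and_grad(W_new, X, labels, state_means)
        if J_new > J:
            W, J, grad = W_new, J_new, grad_new   # accept the ascent step
        else:
            lr *= 0.5                             # reject and shrink the step
        history.append(J)
    return W, history

# Synthetic stand-in for log-spectral frames: 3 states, 6 input dimensions.
rng = np.random.default_rng(1)
true_means = rng.normal(scale=2.0, size=(3, 6))
labels = rng.integers(0, 3, size=300)
X = true_means[labels] + rng.normal(size=(300, 6))

W, history = fit_transform(X, labels, n_states=3, n_out=2)
print("objective before/after:", history[0], history[-1])
```

Under these assumptions the objective has no upper limit on the scale of `W`, so a practical system would likely need a constraint or a re-estimated acoustic model per iteration; the sketch simply runs a fixed number of steps.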