Paper: | SP-L4.2 | ||
Session: | Higher-Level Knowledge in Speaker Recognition | ||
Time: | Wednesday, May 19, 15:50 - 16:10 | ||
Presentation: | Lecture | ||
Topic: | Speech Processing: Speaker Recognition | ||
Title: | USING HAAR TRANSFORMED VOCAL SOURCE INFORMATION FOR AUTOMATIC SPEAKER RECOGNITION | ||
Authors: | Nengheng Zheng; Chinese University of Hong Kong | ||
P. C. Ching; Chinese University of Hong Kong | |||
Abstract: | This paper investigates the effectiveness of incorporating vocal source information to improve automatic speaker recognition accuracy. We propose a new method to extract discriminative features, closely related to the glottal excitation of the individual speaker, from the linear prediction (LP) residual signal. A parameter set called Haar Octave Coefficients of Residue (HOCOR), complementary to the commonly used linear predictive cepstral coefficients (LPCC), is obtained by applying the Haar transform to the LP residual. This additional feature vector retains the spectro-temporal characteristics of the source excitation sequences that are related to the fundamental frequency and harmonics, as well as their phases. Experimental evaluation on the YOHO corpus demonstrates the high speaker discriminative power and high inter-speaker variability of HOCOR. Speaker recognition tests that use both the vocal tract features (LPCC) and the vocal source information (HOCOR) outperform the conventional approach of using LPCC only. | ||
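The processing chain named in the abstract (LP analysis, LP residual, Haar transform of the residual) can be illustrated with a minimal sketch. This is not the authors' HOCOR definition, which the abstract does not give: the function names, the autocorrelation-method LP with Levinson-Durbin recursion, the orthonormal Haar scaling, and the per-octave energy summary are all assumptions chosen for illustration.

```python
import math


def lp_coefficients(frame, order):
    """Autocorrelation-method LP via Levinson-Durbin (an assumed choice).

    Returns a[0..order-1] such that s[n] is predicted by
    sum_k a[k] * s[n-1-k].
    """
    n = len(frame)
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [0.0] * (order + 1)  # a[0] is unused; a[1..order] are predictor taps
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predicted frame
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(frame, a):
    """Prediction error e[n] = s[n] - sum_k a[k] * s[n-1-k]."""
    residual = []
    for n in range(len(frame)):
        pred = sum(a[k] * frame[n - 1 - k]
                   for k in range(len(a)) if n - 1 - k >= 0)
        residual.append(frame[n] - pred)
    return residual


def haar_transform(x):
    """Orthonormal Haar transform of a power-of-two-length sequence.

    Returns (approximation, details), where details[0] holds the finest
    octave and details[-1] the coarsest.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in x]
    details = []
    scale = math.sqrt(2.0)
    while len(approx) > 1:
        half = len(approx) // 2
        d = [(approx[2 * i] - approx[2 * i + 1]) / scale for i in range(half)]
        s = [(approx[2 * i] + approx[2 * i + 1]) / scale for i in range(half)]
        details.append(d)
        approx = s
    return approx[0], details


def octave_energies(residual):
    """Hypothetical per-octave summary: energy of the Haar detail
    coefficients at each octave (illustrative, not the paper's HOCOR)."""
    _, details = haar_transform(residual)
    return [sum(c * c for c in level) for level in details]
```

A typical use would compute `lp_residual(frame, lp_coefficients(frame, order))` on a short voiced frame whose length is a power of two and pass the result to `octave_energies`. How the coefficients are grouped and normalized within each octave is a design choice left open here.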