Technical Program

Paper Detail

Paper:	SP-P2.4
Session:	Speaker Adaptation
Time:	Tuesday, May 18, 13:00 - 15:00
Presentation:	Poster
Topic:	Speech Processing: Adaptation/Normalization
Title:	A STUDY OF VARIOUS COMPOSITE KERNELS FOR KERNEL EIGENVOICE SPEAKER ADAPTATION
Authors:	Brian Mak; Hong Kong University of Science and Technology
	James Kwok; Hong Kong University of Science and Technology
	Simon Ho; Hong Kong University of Science and Technology
Abstract:	Eigenvoice-based methods have been shown to be effective for fast speaker adaptation when the amount of adaptation data is small, say, less than 10 seconds. In traditional eigenvoice (EV) speaker adaptation, linear principal component analysis (PCA) is used to derive the eigenvoices. Recently, we proposed that eigenvoices found by nonlinear kernel PCA could be more effective, and the eigenvoices thus derived were called kernel eigenvoices (KEV). One of our novelties is the use of a composite kernel, which makes it possible to compute state observation likelihoods via kernel functions. In this paper, we investigate two different composite kernels for KEV adaptation: the direct sum kernel and the tensor product kernel. In an evaluation on the TIDIGITS task, KEV speaker adaptation is found to be equally effective with either form of composite kernel, and both outperform a speaker-independent model and the adapted models from EV, MAP, or MLLR adaptation using 2.1s and 4.1s of speech. For example, with 2.1s of adaptation data, KEV adaptation outperforms the speaker-independent model by 27.5%, whereas EV, MAP, and MLLR adaptation are not effective at all.
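
The two composite kernels named in the abstract can be illustrated with a minimal sketch. It assumes that each speaker is represented by a list of constituent vectors (for example, one per HMM state), that the same base kernel is applied to every constituent, and that the base kernel is a Gaussian RBF. The function names, the beta parameter, and the toy data are hypothetical and are not taken from the paper.

```python
import math

def rbf_kernel(x, y, beta=1.0):
    """Gaussian RBF base kernel (an assumed choice; beta is a hypothetical width)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-beta * sq_dist)

def direct_sum_kernel(xs, ys, base=rbf_kernel):
    """Direct sum composite kernel: sum of base kernels over constituents."""
    return sum(base(x, y) for x, y in zip(xs, ys))

def tensor_product_kernel(xs, ys, base=rbf_kernel):
    """Tensor product composite kernel: product of base kernels over constituents."""
    return math.prod(base(x, y) for x, y in zip(xs, ys))

# Toy data: two "speakers", each split into two constituent vectors.
speaker_a = [[0.0, 1.0], [2.0, 3.0]]
speaker_b = [[0.5, 1.0], [2.0, 2.0]]

k_sum = direct_sum_kernel(speaker_a, speaker_b)
k_prod = tensor_product_kernel(speaker_a, speaker_b)
print(k_sum, k_prod)
```

Both forms keep every constituent visible as a separate kernel term, in contrast to a single kernel evaluated on the concatenated vector. With an RBF base kernel, the tensor product form happens to equal an RBF on the concatenated vector, while the direct sum form does not.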


©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004