Technical Program

Paper Detail

Paper:	SP-P9.7
Session:	Topics in Speech Synthesis
Time:	Wednesday, May 19, 15:30 - 17:30
Presentation:	Poster
Topic:	Speech Processing: Speech Synthesis (including TTS)
Title:	A REAL-TIME CANTONESE TEXT-TO-AUDIOVISUAL SPEECH SYNTHESIZER
Authors:	Jian-Qing Wang; Chinese University of Hong Kong
	Ka-Ho Wong; Chinese University of Hong Kong
	Pheng-Ann Heng; Chinese University of Hong Kong
	Helen Meng; Chinese University of Hong Kong
	Tien-Tsin Wong; Chinese University of Hong Kong
Abstract:	This paper describes the design and development of a Cantonese text-to-audiovisual speech (TTVS) synthesizer that generates highly natural synthetic speech precisely time-synchronized with a real-time 3D face rendering. The synthesizer is built on CU VOCAL, a homegrown Cantonese syllable-based concatenative text-to-speech system. We extend CU VOCAL to output syllable labels and durations that correspond to the output acoustic wave file. Each syllable is decomposed into its initial and final, which are mapped to the nearest IPA symbols, which in turn correspond to static viseme models. We have authored sixteen static viseme models together with two emotion-based face models. To render the 3D face, we have designed and implemented a blending technique that computes linear combinations of the static face models to produce smooth transitions between them. We demonstrate that this design and implementation of a TTVS synthesizer achieves real-time performance.
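
The blending step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy two-vertex face models, and the choice to place each viseme keyframe at the midpoint of its segment are all assumptions made for illustration. The abstract states only that the face is rendered from linear combinations of static face models, timed by the syllable labels and durations that CU VOCAL outputs.

```python
from bisect import bisect_right

# Hypothetical static face models: each is a list of (x, y, z) vertices.
# Real viseme models would have many more vertices; two suffice here.
MODELS = {
    "closed": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "open": [(0.0, -1.0, 0.0), (1.0, -1.0, 0.0)],
}

# Hypothetical synthesizer output: (viseme label, duration in seconds).
# Durations are assumed to be positive.
SEGMENTS = [("closed", 0.5), ("open", 0.5)]


def blend(model_a, model_b, weight):
    """Return the linear combination (1 - weight) * model_a + weight * model_b."""
    return [
        tuple((1.0 - weight) * a + weight * b for a, b in zip(va, vb))
        for va, vb in zip(model_a, model_b)
    ]


def keyframe_times(segments):
    """Place one keyframe at the midpoint of each segment (an assumption)."""
    times, start = [], 0.0
    for _, duration in segments:
        times.append(start + duration / 2.0)
        start += duration
    return times


def face_at(t, segments, models):
    """Return the blended vertex list for time t, in seconds from utterance start."""
    names = [name for name, _ in segments]
    times = keyframe_times(segments)
    # Before the first keyframe or after the last one, hold the static model.
    if t <= times[0]:
        return models[names[0]]
    if t >= times[-1]:
        return models[names[-1]]
    # Find the pair of keyframes surrounding t and interpolate between them.
    i = bisect_right(times, t) - 1
    weight = (t - times[i]) / (times[i + 1] - times[i])
    return blend(models[names[i]], models[names[i + 1]], weight)


# Halfway between the two keyframes the mouth is half open.
halfway_face = face_at(0.5, SEGMENTS, MODELS)
```

Because each frame is a weighted sum over a fixed set of vertex arrays, the cost per frame is linear in the number of vertices, which is consistent with the real-time goal stated in the abstract. A two-model blend is the simplest case; weighting in an emotion-based face model as well would add a third term to the same sum.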


©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004