Technical Program

Paper Detail

Paper: SP-P8.2
Session: Voice Activity Detection and Speech Segmentation
Time: Wednesday, May 19, 13:00 - 15:00
Presentation: Poster
Topic: Speech Processing: Feature Extraction
Title: SPEECH DISCRIMINATION BASED ON MULTISCALE SPECTRO-TEMPORAL MODULATIONS
Authors: Nima Mesgarani; University of Maryland, College Park 
 Shihab Shamma; University of Maryland, College Park 
 Malcolm Slaney; IBM Almaden Research Center 
Abstract: A novel approach to content-based audio classification is presented, based on multiscale spectro-temporal modulation features extracted using a model of the auditory cortex. The task is to discriminate speech from non-speech, which consists of animal vocalizations, music, and environmental sounds. Generalization of the system to signals with high levels of additive noise and reverberation is evaluated and compared to two existing approaches. The results demonstrate the advantages of the auditory model over the other two systems, especially at low SNRs and high reverberation.
 

©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004