A Robust Dynamic Mouth Feature Based on Visemic LDA for Audio Visual Speech Recognition
Xie Lei①; Fu Zhong-hua①; Jiang Dong-mei①; Zhao Rong-chun①; Werner Verhelst②; Hichem Sahli②; Jan Conlenis②
①School of Computer Science, Northwestern Polytechnical Univ., Xi’an 710072, China; ②Dept. of ETRO, Free University Brussels, Pleinlaan 2, B-1050 Brussels, Belgium
Abstract This paper presents a robust visual feature based on Visemic LDA for audio-visual speech recognition. The feature captures dynamic lip contour information and reflects the viseme classes of visual speech. The paper also introduces a method that automatically labels the LDA training data from speech recognition results, avoiding tedious manual labeling and the errors it introduces. Experimental results show that an audio-visual speech recognition system using the proposed visual feature greatly increases the recognition rate in noisy conditions. Combined with a multi-stream HMM, the visual feature yields a recognition rate of over 80% at an SNR of 10 dB.
Received: 11 July 2003