Visual Speech Synthesis Based on Chinese Visual Triphones

doi:10.3724/SP.J.1146.2008.01634

Abstract: To synthesize realistic video sequences, a visual speech synthesis algorithm based on Chinese visual triphones is proposed. Drawing on the principles of Chinese pronunciation and the relationship between phonemes and visemes, the concept of the 'visual triphone' is introduced. Hidden Markov Models (HMMs) are built on visual triphones. In the training stage, combined features comprising visual and audio features are used. In the synthesis stage, a sentence HMM is constructed by concatenating triphone HMMs, and the feature parameters are extracted from it. Subjective and objective evaluations indicate that the synthesized video is realistic and satisfactory.
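As a rough illustration of the concatenation step described in the abstract, the following sketch expands a unit sequence into triphone labels and chains toy left-to-right HMMs into one sentence-level transition matrix. It is not the authors' implementation: the `l-c+r` label format, the `sil` boundary symbol, the three-state topology, the self-loop values, and the function names `to_triphones` and `concatenate` are all assumptions made for illustration, and emission (feature) models are omitted entirely.

```python
from typing import Dict, List, Tuple


def to_triphones(units: List[str], boundary: str = "sil") -> List[str]:
    """Expand a unit sequence into context-dependent labels 'left-center+right'.

    The label format and the boundary symbol are illustrative assumptions.
    """
    padded = [boundary] + units + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


def concatenate(
    models: Dict[str, List[float]], labels: List[str]
) -> Tuple[List[List[float]], List[str]]:
    """Chain left-to-right HMMs into one sentence-level transition matrix.

    Each model is given only by its per-state self-loop probabilities
    (a hypothetical, simplified representation). State i stays with
    probability p and advances to state i + 1 with probability 1 - p.
    The last column of the matrix is a non-emitting exit state.
    """
    self_loops: List[float] = []
    state_names: List[str] = []
    for label in labels:
        for k, p in enumerate(models[label]):
            self_loops.append(p)
            state_names.append(f"{label}[{k}]")

    n = len(self_loops)
    trans = [[0.0] * (n + 1) for _ in range(n)]
    for i, p in enumerate(self_loops):
        trans[i][i] = p            # stay in the current state
        trans[i][i + 1] = 1.0 - p  # advance; crosses model boundaries too
    return trans, state_names


# Toy usage with placeholder units and made-up self-loop values.
labels = to_triphones(["b", "a"])
toy_models = {label: [0.5, 0.6, 0.7] for label in labels}
sentence_trans, sentence_states = concatenate(toy_models, labels)
```

The design point is that the exit transition of one triphone model becomes the entry transition of the next, so the sentence model remains strictly left-to-right and a feature trajectory can be read off along a single pass through the states.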

Key words: Visual speech synthesis; Visual triphone; Hidden Markov Model (HMM); Combined features

Received: 05 December 2008

PACS:

TP391.42

	Zhao Hui
	Tang Chao-jing

Cite this article:

Zhao Hui, Tang Chao-jing. Visual Speech Synthesis Algorithm Based on Chinese Visual Triphone[J]. , 2009, 31(12): 3010-3014.

URL:

http://jeit.ie.ac.cn/EN/10.3724/SP.J.1146.2008.01634 OR http://jeit.ie.ac.cn/EN/Y2009/V31/I12/3010