Abstract: This paper proposes a voice conversion algorithm based on mixtures of linear transformations that avoids the need for the parallel training corpus required by conventional approaches. Within a maximum likelihood framework, the EM algorithm is used to estimate the parameters of the transfer function. The chirp Z-transform is then used to enhance the spectral envelope, which is smoothed by the linear weighted averaging. The proposed voice conversion system is evaluated using both objective and subjective measures. The experimental results show that the proposed approach effectively transforms speaker identity and achieves results comparable to those of conventional methods that require a parallel corpus.
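The abstract does not give the form of the transfer function. As a rough illustration only, the sketch below assumes the form commonly used in mixture-based voice conversion: each source frame is passed through several linear transformations, and the outputs are averaged with weights equal to the posterior probabilities of the mixture components. All names (`posteriors`, `convert`, `weights`, `means`, `variances`, `A`, `b`) are hypothetical, the Gaussians are assumed diagonal, and the EM training step and the chirp Z-transform enhancement are not shown.

```python
import numpy as np

def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given one frame x.

    Assumes diagonal-covariance Gaussians (an assumption, not from the paper).
    x: (D,), weights: (M,), means: (M, D), variances: (M, D)
    """
    diff = x - means
    log_pdf = -0.5 * np.sum(diff ** 2 / variances
                            + np.log(2.0 * np.pi * variances), axis=1)
    log_post = np.log(weights) + log_pdf
    log_post -= log_post.max()          # subtract the max for numerical stability
    p = np.exp(log_post)
    return p / p.sum()

def convert(x, weights, means, variances, A, b):
    """Posterior-weighted mixture of linear transformations.

    A: (M, D, D) one matrix per component, b: (M, D) one offset per component.
    Returns sum_i p(i | x) * (A[i] @ x + b[i]).
    """
    p = posteriors(x, weights, means, variances)
    per_component = A @ x + b           # shape (M, D)
    return p @ per_component            # weighted average, shape (D,)
```

Because the output is a weighted average over components, fine spectral detail tends to be smoothed out, which is consistent with the smoothing effect the abstract attributes to the linear weighted averaging.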
简志华; 杨震. 基于混合线性变换的语声转换算法[J]. 电子与信息学报, 2007, 29(7): 1700-1702 .
Jian Zhi-hua; Yang Zhen. An Algorithm for Voice Conversion Based on Mixtures of Linear Transformation[J]. Journal of Electronics & Information Technology, 2007, 29(7): 1700-1702.