Speaker Identification Based on Classify Feature Sub-space Gaussian Mixture Model and Neural Net Fusion

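Abstract: This paper proposes a speaker identification method based on the fusion of classified Gaussian mixture models and a neural network (FS-GMM/NN). Cluster analysis of the feature vectors divides each speaker's training speech into several classes. A classified Gaussian mixture model is then trained for each class, with the number of mixture components chosen according to how many feature vectors the class contains. A neural network fuses the outputs of the classified Gaussian mixture models. In a text-independent speaker identification experiment with 100 male speakers, the method outperforms the unclassified GMM system in both identification performance and noise robustness. It also trains models more efficiently and effectively reduces both the number of mixtures per speaker model and the length of test speech required.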

Keywords: speaker identification, classified feature subspace, Gaussian mixture model, neural network fusion

Abstract: This paper proposes a speaker identification system based on classified Feature Sub-space Gaussian Mixture Models and Neural Net fusion (FS-GMM/NN). Clustering analysis of the feature vectors divides each speaker's training feature vectors into several subsets, and a classified Gaussian Mixture Model (GMM) is trained for each subset, with the number of mixtures chosen according to the number of feature vectors in that subset. The outputs of the classified GMMs are then fused by a Neural Net (NN). In a text-independent speaker identification experiment with 100 male speakers, the FS-GMM/NN system outperforms the Baseline Gaussian Mixture Model (B-GMM) in identification performance and noise robustness while using fewer mixtures and shorter test speech. Training FS-GMM/NN is also more efficient.
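The abstract says that each subset's GMM gets a number of mixtures that depends on how many feature vectors the subset holds, but it does not state the rule. The sketch below assumes a simple proportional split of a fixed mixture budget. The function name `allocate_mixtures` and its parameters are hypothetical and are not taken from the paper.

```python
def allocate_mixtures(cluster_sizes, total_mixtures, min_mixtures=1):
    """Split a mixture budget across clusters in proportion to cluster size.

    Assumption: proportional allocation. The paper's actual rule is not
    given in the abstract.
    """
    n_clusters = len(cluster_sizes)
    total_vectors = sum(cluster_sizes)
    if total_vectors <= 0:
        raise ValueError("clusters must contain at least one feature vector")
    if total_mixtures < min_mixtures * n_clusters:
        raise ValueError("budget too small for the per-cluster minimum")

    # Give every cluster the minimum, then share the rest proportionally.
    spare = total_mixtures - min_mixtures * n_clusters
    shares = [spare * size / total_vectors for size in cluster_sizes]
    counts = [min_mixtures + int(share) for share in shares]

    # Rounding down leaves a few mixtures unassigned; hand them to the
    # clusters with the largest fractional remainders.
    leftover = total_mixtures - sum(counts)
    by_remainder = sorted(
        range(n_clusters),
        key=lambda i: shares[i] - int(shares[i]),
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Three clusters holding 600, 300 and 100 feature vectors, 16 mixtures total.
    print(allocate_mixtures([600, 300, 100], 16))
```

Under this assumed rule, larger subsets receive more mixtures and every subset keeps at least `min_mixtures`, so the per-speaker total stays fixed at the chosen budget.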

Key words: Speaker identification, Classified feature-subspace GMM, Neural Net (NN) fusion

Received: 2003-05-16

PACS:

TP391.42

Cite this article:

Huang Wei (黄伟); Dai Bei-qian (戴蓓蒨); Li Hui (李辉). Speaker Identification Based on Classify Feature Sub-space Gaussian Mixture Model and Neural Net Fusion [J]. Journal of Electronics & Information Technology (电子与信息学报), 2004, 26(10): 1607-1612.

Link to this article:

http://jeit.ie.ac.cn/CN/ or http://jeit.ie.ac.cn/CN/Y2004/V26/I10/1607