The main drawback of sparse-representation-based Single-Channel Blind Source Separation (SCBSS) is interference between sub-dictionaries. To alleviate this drawback, an extra sub-dictionary, called the common sub-dictionary, is added to the traditional union dictionary. Each source is reconstructed by linearly combining the sparsely activated atoms of its own sub-dictionary and of the common sub-dictionary. Because the information shared by different sources is gathered into the common sub-dictionary, the discriminative information in each source's specific sub-dictionary is purified. Optimization of the objective function involves three steps: sparse representation, dictionary updating, and weight-coefficient optimization; the three steps are iterated a specified number of times or until convergence. In the test stage, each source is separated by combining the atoms of its corresponding sub-dictionary and the common sub-dictionary, weighted by the sparse coefficients of the mixed signal over the union dictionary. Experimental results on a speech dataset show that, compared with traditional and state-of-the-art algorithms, the proposed algorithm improves separation performance by up to 1 dB.
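The test-stage separation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dictionaries are random stand-ins rather than learned ones, the sparse code is obtained with a simple Orthogonal Matching Pursuit, and, as a simplifying assumption, the common sub-dictionary's contribution is attributed in full to each source (the paper instead splits it via learned weight coefficients).

```python
import numpy as np

def omp(D, y, k):
    """Greedy Orthogonal Matching Pursuit: select up to k atoms of D to explain y."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))  # atom most correlated with residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)  # refit on support
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x

def normalize(A):
    return A / np.linalg.norm(A, axis=0)  # unit-norm atoms (columns)

rng = np.random.default_rng(0)
m, n = 64, 40                                  # signal dimension, atoms per sub-dictionary
D1 = normalize(rng.standard_normal((m, n)))    # source-1 sub-dictionary (stand-in)
D2 = normalize(rng.standard_normal((m, n)))    # source-2 sub-dictionary (stand-in)
Dc = normalize(rng.standard_normal((m, n // 2)))  # common sub-dictionary (stand-in)
D = np.hstack([D1, D2, Dc])                    # union dictionary

y = rng.standard_normal(m)                     # mixed signal (stand-in)
x = omp(D, y, k=10)                            # sparse code over the union dictionary

# Split the code by sub-dictionary and reconstruct each source from its own
# atoms plus the common atoms.
x1, x2, xc = x[:n], x[n:2 * n], x[2 * n:]
s1_hat = D1 @ x1 + Dc @ xc
s2_hat = D2 @ x2 + Dc @ xc
```

By construction, `s1_hat + s2_hat` double-counts the common part: their sum minus `Dc @ xc` equals the overall sparse approximation `D @ x` of the mixture.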
[1] VANEPH A, MCNEIL E, RIGAUD F, et al. An automated source separation technology and its practical applications[C]. Audio Engineering Society Convention 140, Paris, France, 2016: 181-182.
[2] DU Jian, GONG Kexian, and GE Lindong. Low complexity algorithm on blind separation of paired carrier multiple access signals based on single way timing accuracy[J]. Journal of Electronics & Information Technology, 2014, 36(8): 1872-1877. doi: 10.3724/SP.J.1146.2013.01459.
[3] LOPEZ A R, ONO N, REMES U, et al. Designing multichannel source separation based on single-channel source separation[C]. 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, 2015: 469-473. doi: 10.1109/ICASSP.2015.7178013.
[4] WU Di, TAO Rui, ZHANG Xiaojun, et al. Perception auditory scene analysis for speaker recognition[J]. Acta Acustica, 2016, 41(2): 260-272. doi: 10.15949/j.cnki.0371-0025.2016.02.015.
[5] YANG Lidong, WANG Jing, XIE Xiang, et al. Low rank tensor completion for recovering missing data in multi-channel audio signal[J]. Journal of Electronics & Information Technology, 2016, 38(2): 394-399. doi: 10.11999/JEIT150589.
[6] JANG G J, LEE T W, and OH Y H. Single-channel signal separation using time-domain basis functions[J]. IEEE Signal Processing Letters, 2003, 10(6): 168-171. doi: 10.1109/LSP.2003.811630.
[7] WANG Gang and SUN Bin. Research on blind signal separation technology and algorithm[J]. Aerospace Electronic Warfare, 2015, 31(4): 53-56. doi: 10.16328/j.htdz8511.2015.04.015.
[8] SCHMIDT M N and OLSSON R K. Single-channel speech separation using sparse non-negative matrix factorization[C]. ISCA International Conference on Spoken Language Processing (INTERSPEECH), Pittsburgh, Pennsylvania, 2006: 2614-2617.
[9] KING B J and ATLAS L. Single-channel source separation using complex matrix factorization[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(8): 2591-2597. doi: 10.1109/TASL.2011.2156786.
[10] GRAIS E M and ERDOGAN H. Single channel speech music separation using nonnegative matrix factorization with sliding window and spectral masks[C]. Annual Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, 2011: 1773-1776.
[11] GRAIS E M and ERDOGAN H. Discriminative nonnegative dictionary learning using cross-coherence penalties for single channel source separation[C]. Annual Conference of the International Speech Communication Association (INTERSPEECH), Lyon, France, 2013: 808-812.
[12] WENINGER F, LE ROUX J, HERSHEY J R, et al. Discriminative NMF and its application to single-channel source separation[C]. Annual Conference of the International Speech Communication Association (INTERSPEECH), Singapore, 2014: 865-869.
[13] BAO G, XU Y, and YE Z. Learning a discriminative dictionary for single-channel speech separation[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014, 22(7): 1130-1138. doi: 10.1109/TASLP.2014.2320575.
[14] WANG Z and SHA F. Discriminative non-negative matrix factorization for single-channel speech separation[C]. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 2014: 3749-3753. doi: 10.1109/ICASSP.2014.6854302.
[15] ZHANG Chunmei, YIN Zhongke, and XIAO Mingxia. Signal over-complete representation and sparse decomposition based on redundant dictionary[J]. Chinese Science Bulletin, 2006, 51(6): 628-633.
[16] AHARON M, ELAD M, and BRUCKSTEIN A. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation[J]. IEEE Transactions on Signal Processing, 2006, 54(11): 4311-4322. doi: 10.1109/TSP.2006.881199.
[17] COOKE M, BARKER J, CUNNINGHAM S, et al. An audio-visual corpus for speech perception and automatic speech recognition[J]. The Journal of the Acoustical Society of America, 2006, 120(5): 2421-2424. doi: 10.1121/1.2229005.
[18] VINCENT E, GRIBONVAL R, and FEVOTTE C. Performance measurement in blind audio source separation[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(4): 1462-1469. doi: 10.1109/TSA.2005.858005.
[19] THOMAS S, SAON G, KUO H, et al. The IBM BOLT speech transcription system[C]. Annual Conference of the International Speech Communication Association (INTERSPEECH), Dresden, Germany, 2015: 3150-3153.
[20] NORRIS D, MCQUEEN J M, and CUTLER A. Prediction, Bayesian inference and feedback in speech recognition[J]. Language, Cognition and Neuroscience, 2016, 31(1): 4-18. doi: 10.1080/23273798.2015.1081703.