Novel Single Channel Blind Source Separation Algorithm Based on Sparse Representation
TIAN Yuanrong, WANG Xing, ZHOU Yipeng
(Institute of Aeronautics and Astronautics Engineering, Air Force Engineering University, Xi’an 710038, China)
Abstract The main drawback of sparse-representation-based Single Channel Blind Source Separation (SCBSS) is interference between sub-dictionaries. To alleviate it, an extra sub-dictionary, called the common sub-dictionary, is added to the traditional union dictionary. Each source is reconstructed as a linear combination of the sparsely activated atoms of its own sub-dictionary and of the common sub-dictionary. Because the information shared by the different sources is gathered in the common sub-dictionary, each source-specific sub-dictionary retains only discriminative information. Optimizing the objective function involves three steps: sparse representation, dictionary updating, and weight coefficient optimization. These steps are repeated for a specified number of iterations or until convergence. In the test stage, each source is separated by combining the atoms of its sub-dictionary and of the common sub-dictionary with the sparse coefficients of the single mixed signal over the union dictionary. Experimental results on a speech dataset show that the proposed algorithm improves performance by up to 1 dB over traditional and state-of-the-art algorithms.
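The test stage described in the abstract can be illustrated with a minimal sketch. It assumes already-trained dictionaries and uses a plain Orthogonal Matching Pursuit as the sparse coder; the abstract names neither the coder nor the rule for sharing the common component, so `omp`, `separate`, and the `weights` argument are hypothetical stand-ins rather than the authors' method, and the dictionaries below are random placeholders.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Greedy OMP: sparse code of x over the (unit-norm) columns of D."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = None
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:  # no new atom helps; stop early
            break
        support.append(k)
        # Least-squares refit on all atoms selected so far
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def separate(x, sub_dicts, D_common, weights, n_nonzero):
    """Estimate each source from one mixed frame x.

    ASSUMPTION: the common-dictionary contribution is split between the
    sources by fixed `weights`; the abstract does not specify this rule.
    """
    D = np.hstack(list(sub_dicts) + [D_common])  # union dictionary
    a = omp(D, x, n_nonzero)                     # code of the mixture
    edges = np.cumsum([0] + [d.shape[1] for d in sub_dicts])
    common_part = D_common @ a[edges[-1]:]
    estimates = []
    for i, D_i in enumerate(sub_dicts):
        a_i = a[edges[i]:edges[i + 1]]
        estimates.append(D_i @ a_i + weights[i] * common_part)
    return estimates

def unit_columns(M):
    """Normalize every column to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=0, keepdims=True)

# Placeholder data: random dictionaries stand in for trained ones.
rng = np.random.default_rng(0)
D1 = unit_columns(rng.standard_normal((16, 8)))
D2 = unit_columns(rng.standard_normal((16, 8)))
Dc = unit_columns(rng.standard_normal((16, 4)))
x = rng.standard_normal(16)  # one mixed-signal frame

estimates = separate(x, [D1, D2], Dc, weights=[0.5, 0.5], n_nonzero=5)
```

With weights that sum to one, the source estimates add up to the sparse approximation of the mixture over the union dictionary, so no energy from the common atoms is lost or counted twice.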
Received: 02 September 2016
Published: 21 March 2017
Fund: The National Natural Science Foundation of China (61372167), The Aviation Science Foundation of China (20152096019)
Corresponding Author: TIAN Yuanrong, E-mail: yrtian_mail@126.com