A New Voice and Noise Activity Detection Algorithm and Its Application to Dual Microphone Noise Suppression System for Handset
ZHANG Luofei, ZHANG Ming, LI Chen
(School of Physics and Technology, Nanjing Normal University, Nanjing 210000, China)
Abstract Existing dual-microphone Voice Activity Detection (VAD) algorithms normally use a fixed threshold, which cannot provide accurate detection across varying noise environments. The resulting misjudgments degrade voice quality, particularly in handset applications. This paper proposes a new VAD algorithm based on a Neural Network (NN) that uses both the sub-band power level difference and the inter-microphone cross correlation as features. The NN-based VAD is then combined with the inter-microphone signal power ratio method to obtain a new voice and noise activity detection algorithm. The algorithm is further applied to noise suppression in handsets to avoid the performance degradation caused by VAD misjudgment. Experimental results show that the proposed method provides better noise suppression and lower speech distortion than the existing method.
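The two features named in the abstract can be illustrated with a minimal sketch. The abstract does not give their definitions, so everything below is an assumption made for illustration: the normalized form of the power level difference, the equal-width band split, the zero-lag cross correlation, the function names `frame_features` and `fixed_threshold_vad`, and the threshold value. The fixed-threshold decision stands in for the kind of baseline the abstract criticizes; it is not the NN-based detector proposed in the paper.

```python
import numpy as np

def frame_features(primary, secondary, n_bands=4, eps=1e-12):
    """Return (sub-band PLD vector, zero-lag normalized cross correlation)
    for one frame of samples from the two microphones.
    All definitions here are illustrative assumptions, not the paper's."""
    primary = np.asarray(primary, dtype=float)
    secondary = np.asarray(secondary, dtype=float)
    window = np.hanning(len(primary))
    # Power spectra of the windowed frames.
    p_spec = np.abs(np.fft.rfft(primary * window)) ** 2
    s_spec = np.abs(np.fft.rfft(secondary * window)) ** 2
    # Sum power in equal-width bands (hypothetical band layout).
    p_bands = np.array([b.sum() for b in np.array_split(p_spec, n_bands)])
    s_bands = np.array([b.sum() for b in np.array_split(s_spec, n_bands)])
    # Normalized power level difference per band, bounded to [-1, 1].
    pld = (p_bands - s_bands) / (p_bands + s_bands + eps)
    # Normalized inter-microphone cross correlation at zero lag.
    denom = np.linalg.norm(primary) * np.linalg.norm(secondary) + eps
    xcorr = float(np.dot(primary, secondary) / denom)
    return pld, xcorr

def fixed_threshold_vad(pld, threshold=0.3):
    """Baseline decision: speech if the mean sub-band PLD exceeds a fixed
    threshold (the threshold value is an arbitrary assumption)."""
    return bool(np.mean(pld) > threshold)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.standard_normal(256)
    # Near-field talker: the secondary microphone receives an attenuated copy.
    pld, xcorr = frame_features(source, 0.3 * source)
    print("PLD per band:", pld, "cross correlation:", xcorr)
    print("speech detected:", fixed_threshold_vad(pld))
```

In a learned detector of the kind the abstract describes, the PLD vector and the cross-correlation value would be passed to a classifier as inputs rather than compared against a single hand-set threshold.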
Received: 23 November 2015
Published: 31 May 2016
Fund: Program of Natural Science Research of Jiangsu Higher Education Institutions of China; Program of Science and Technology of Jiangsu (BE2014139)
Corresponding Authors:
ZHANG Luofei
E-mail: lincover@126.com