Abstract: This paper builds a speaker diarization and localization system based on Multiple Distant Microphones (MDM) for the NIST evaluation, and proposes a modified clustering algorithm based on time delay estimation that reduces the complexity of speaker diarization and improves accuracy while keeping performance stable. A new time delay matrix structure is also proposed, from which the direction angles of multiple speakers can be obtained. The algorithms are validated on real speech data collected in a standard meeting environment. The accuracy of the proposed speaker diarization algorithm is comparable to that of existing speaker diarization systems, and the direction angle error of the localization algorithm is less than 3°. The results show that, under appropriate conditions, an MDM system can serve as a better input device for multi-party dialogue scenarios.
杨毅, 宋辉, 刘加. 基于NIST评测的说话人分类及定位技术研究[J]. 电子与信息学报, 2011, 33(5): 1234-1237.
Yang Yi, Song Hui, Liu Jia. Speaker Diarization and Localization Technology Research Based on NIST Evaluation. Journal of Electronics & Information Technology, 2011, 33(5): 1234-1237.