Journal of Data Acquisition and Processing (数据采集与处理), 2017, Vol. 32, Issue 1: 54-61, 8. DOI: 10.16337/j.1004-9037.2017.01.006
Improved Speaker Clustering Algorithm Based on Variational Bayes
Abstract
Speaker clustering is an important stage of speaker diarization, yet traditional hierarchical agglomerative clustering (HAC) with a distance measure based on the Bayesian information criterion (BIC) can propagate clustering errors. To address this problem, a stepwise algorithm is proposed. Hierarchical clustering stops when the minimum BIC distance between segments exceeds a predefined threshold, or when the number of clusters reaches a preset value. The current clustering result then serves as the initial class centers, and a variational Bayesian method iteratively reassigns speaker segments among the clusters. Finally, the number of speakers is determined according to a probabilistic linear discriminant analysis (PLDA) score threshold. Experiments on the National Institute of Standards and Technology (NIST) 08 summed test set show that the method improves both "class purity" and "speaker purity" compared with conventional algorithms. Moreover, speaker diarization performance improves by 27.6% in relative terms.
Keywords
speaker clustering / Bayesian information criterion / probabilistic linear discriminant analysis / variational Bayes
Classification
Information technology and security science
Cite this article
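Li Jingyang, Li Rui, Wang Li, Wang Xiaodi. Improved speaker clustering algorithm based on variational Bayes [J]. Journal of Data Acquisition and Processing, 2017, 32(1): 54-61, 8.
Funding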
Supported by the Applied Innovation Program of the Ministry of Public Security (2014YYCXGAES048).
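
Illustrative sketch

The first stage described in the abstract (HAC driven by a BIC distance, stopped by a distance threshold or a cluster count) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one-dimensional features and a single Gaussian per cluster for brevity, whereas practical diarization systems use multivariate acoustic features. The names `delta_bic`, `hac_bic`, `threshold`, `min_clusters`, and `lam` are hypothetical. The variational Bayesian refinement and the PLDA-based speaker counting are not shown.

```python
import math
from statistics import pvariance


def _log_var(values):
    # Floor the variance so constant segments do not produce log(0).
    return math.log(max(pvariance(values), 1e-9))


def delta_bic(x, y, lam=1.0):
    """Delta-BIC for merging two 1-D segments, one Gaussian per segment.

    Negative values favour a single shared Gaussian (same speaker).
    Assumption: d = 1, so the extra model costs 2 parameters (mean, variance).
    """
    z = x + y
    n_x, n_y, n = len(x), len(y), len(z)
    # Log-likelihood gain of modelling x and y separately instead of jointly.
    gain = 0.5 * (n * _log_var(z) - n_x * _log_var(x) - n_y * _log_var(y))
    # BIC penalty for the 2 extra parameters, scaled by the tuning weight lam.
    penalty = 0.5 * lam * 2 * math.log(n)
    return gain - penalty


def hac_bic(segments, threshold=0.0, min_clusters=2):
    """Greedy agglomerative clustering with the two stopping rules.

    Stops when the smallest pairwise delta-BIC exceeds `threshold`
    or when only `min_clusters` clusters remain.
    """
    clusters = [[float(v) for v in seg] for seg in segments]
    while len(clusters) > min_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dist = delta_bic(clusters[i], clusters[j])
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        if best[0] > threshold:
            break  # closest pair is already too dissimilar
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

In a pipeline of the kind the abstract outlines, the clusters returned here would only initialize the later refinement stage, so `threshold` and `min_clusters` would typically be chosen to stop early rather than to produce the final speaker count.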