Journal of Hubei University of Automotive Technology, 2023, Vol. 37, Issue 4: 42-47, 6. DOI: 10.3969/j.issn.1008-5483.2023.04.009
Learnable Cross-domain Robust Speaker Recognition Method
Abstract
A learnable cross-domain robust speaker recognition method is proposed. Starting from the Mel frequency cepstral coefficient (MFCC) acoustic feature extractor, the logarithmic operation is replaced with learnable per-channel energy normalization (PCEN), yielding the Mel-learnable-PCENs acoustic feature extractor, which is trained jointly with the ECAPA-TDNN neural network so that its parameters are optimized automatically. The method was trained and tested on the public VoxCeleb1-dev, VoxCeleb-O, and VoxMovies datasets. Compared with MFCC-SV, Mel-learnable-PCENs-SV reduces the equal error rate and the minimum detection cost by 8.35% and 15.23%, respectively, on VoxCeleb-O, and reduces the equal error rate by 8.42% on VoxMovies. The effectiveness of Mel-learnable-PCENs was also verified on Jetson AGX Xavier, the onboard hardware platform of Sharing-VAN 2.0.
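The substitution described in the abstract, PCEN in place of the logarithm after the Mel filterbank, can be illustrated with a minimal sketch of the standard PCEN formulation. This is not the authors' implementation: the function name `pcen`, the frame-by-channel list layout, and the default values of `s`, `alpha`, `delta`, `r`, and `eps` are assumptions chosen for illustration. In a learnable variant, `alpha`, `delta`, and `r` would presumably be trainable per-channel parameters updated by backpropagation together with the speaker network; here they are fixed scalars.

```python
from typing import List, Sequence


def pcen(
    mel_energy: Sequence[Sequence[float]],
    s: float = 0.025,      # smoothing coefficient (assumed value)
    alpha: float = 0.98,   # gain-control strength (assumed value)
    delta: float = 2.0,    # bias before root compression (assumed value)
    r: float = 0.5,        # root-compression exponent (assumed value)
    eps: float = 1e-6,     # small constant to avoid division by zero
) -> List[List[float]]:
    """Apply PCEN to a [frames][channels] matrix of non-negative Mel energies."""
    if not mel_energy:
        return []
    # Initialise the smoother with the first frame (one value per channel).
    smoothed = list(mel_energy[0])
    out: List[List[float]] = []
    for frame in mel_energy:
        row: List[float] = []
        for ch, energy in enumerate(frame):
            # First-order IIR smoother: M[t] = (1 - s) * M[t-1] + s * E[t]
            smoothed[ch] = (1.0 - s) * smoothed[ch] + s * energy
            # Adaptive gain control: divide by the smoothed energy.
            agc = energy / (eps + smoothed[ch]) ** alpha
            # Root compression, used where MFCC would apply log().
            row.append((agc + delta) ** r - delta ** r)
        out.append(row)
    return out
```

With `alpha = 1.0` the gain-control step divides out a constant channel gain almost entirely, which is one intuition for why PCEN may help when recording conditions differ between training and test domains.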
Key words
speaker recognition / onboard / acoustic feature extractor / cross-domain / per-channel energy normalization (PCEN)
Classification
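Information technology and safety science
Cite this article
郑靓, 张友兵, 周奎, 付瑞. Learnable cross-domain robust speaker recognition method [J]. Journal of Hubei University of Automotive Technology, 2023, 37(4): 42-47, 6.
Funding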
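National Key Research and Development Program of China (2017YFB0102605)
Hubei Province Technology Innovation Special Project, International Science and Technology Cooperation Category (2AHB060)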