Multi-Scale End-to-End Speaker Recognition System Based on Improved Res2Net

Deng Lihong, Deng Fei, Zhang Gexiang, Yang Qiang

Computer Engineering and Applications, 2023, Vol. 59, Issue 24: 110-120, 11. DOI: 10.3778/j.issn.1002-8331.2208-0085

Multi-Scale End-to-End Speaker Recognition System Based on Improved Res2Net

Deng Lihong¹, Deng Fei¹, Zhang Gexiang², Yang Qiang³

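Author information

1. College of Computer Science and Cyber Security (Oxford Brookes College), Chengdu University of Technology, Chengdu 610059
2. Artificial Intelligence Research Center, Chengdu University of Technology, Chengdu 610059; School of Control Engineering, Chengdu University of Information Technology, Chengdu 610059
3. School of Control Engineering, Chengdu University of Information Technology, Chengdu 610059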

Abstract

Lightweight convolutional neural networks in speaker recognition systems have weak feature extraction ability and poor recognition performance. To improve feature extraction, many methods adopt deeper, wider, and more complex network structures, which makes the number of parameters and the inference time increase exponentially. This paper introduces Res2Net from the object detection task into the speaker recognition task and verifies its effectiveness and robustness there. An improved variant, FullRes2Net, is then proposed; it has stronger multi-scale feature extraction capability without increasing the number of parameters and achieves a 17% performance improvement over Res2Net. Meanwhile, to address the problems of existing attention methods, mitigate the shortcomings of convolution itself, and further enhance the feature extraction ability of convolutional neural networks, mixed time-frequency channel attention is proposed. Experiments on the VoxCeleb dataset show that the proposed method effectively improves the feature extraction and generalization ability of the system, with a 34% performance improvement over Res2Net, and outperforms advanced speaker recognition systems that use complex structures. The resulting system is an end-to-end structure with fewer parameters and higher efficiency, suitable for real-world applications.

Keywords

speaker recognition; end-to-end; attention mechanisms

Classification

Information Technology and Security Science

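Cite this article

Deng Lihong, Deng Fei, Zhang Gexiang, Yang Qiang. Multi-Scale End-to-End Speaker Recognition System Based on Improved Res2Net[J]. Computer Engineering and Applications, 2023, 59(24): 110-120, 11.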

Funding

National Natural Science Foundation of China (61972324)

Sichuan Science and Technology Program (2021YFS0313, 2021YFG0133)

Computer Engineering and Applications

OA / PKU Core / CSCD / CSTPCD

ISSN: 1002-8331
