
Multi-Scale End-to-End Speaker Recognition System Based on Improved Res2Net

Deng Lihong, Deng Fei, Zhang Gexiang, Yang Qiang

Computer Engineering and Applications, 2023, Vol. 59, Issue 24: 110-120, 11. DOI: 10.3778/j.issn.1002-8331.2208-0085

Deng Lihong 1, Deng Fei 1, Zhang Gexiang 2, Yang Qiang 3

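Author information

  • 1. College of Computer Science and Cyber Security (Oxford Brookes College), Chengdu University of Technology, Chengdu 610059
  • 2. Artificial Intelligence Research Center, Chengdu University of Technology, Chengdu 610059 || School of Control Engineering, Chengdu University of Information Technology, Chengdu 610059
  • 3. School of Control Engineering, Chengdu University of Information Technology, Chengdu 610059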

Abstract

Lightweight convolutional neural networks in speaker recognition systems have weak feature extraction ability and poor recognition performance. To improve feature extraction, many methods adopt deeper, wider, and more complex network structures, which causes the number of parameters and the inference time to increase exponentially. This paper introduces Res2Net, originally used in object detection, to the speaker recognition task and verifies its effectiveness and robustness there. An improved variant, FullRes2Net, is then proposed; it has stronger multi-scale feature extraction capability without increasing the number of parameters and achieves a 17% performance improvement over Res2Net. Meanwhile, to address the limitations of existing attention methods, compensate for the shortcomings of convolution itself, and further enhance the feature extraction ability of convolutional neural networks, mixed time-frequency channel attention is proposed. Experiments on the VoxCeleb dataset show that the proposed method effectively improves the feature extraction ability and generalization ability of the system, with a 34% performance improvement over Res2Net, and that it outperforms advanced speaker recognition systems that use complex structures. The system is end-to-end, has fewer parameters and higher efficiency, and is suitable for real-world applications.
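As a rough illustration of the multi-scale idea the abstract builds on, the sketch below shows the hierarchical channel-group connection of the original Res2Net block, not the paper's FullRes2Net, whose details are not given here. It is a toy in pure Python: the names `smooth3` and `res2net_split` are hypothetical, and a fixed 3-tap moving average stands in for the learned convolution.

```python
from typing import Callable, List

Group = List[float]  # one channel group, flattened to a 1-D feature vector


def smooth3(x: Group) -> Group:
    """Stand-in for a learned convolution (assumption): 3-tap moving average
    with edge replication, so the output has the same length as the input."""
    padded = [x[0]] + x + [x[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0 for i in range(len(x))]


def res2net_split(groups: List[Group],
                  op: Callable[[Group], Group] = smooth3) -> List[Group]:
    """Hierarchical connections across channel groups, as in the original Res2Net:

        y_1 = x_1
        y_2 = op(x_2)
        y_i = op(x_i + y_{i-1})   for i > 2

    Each later group sees the output of the previous one, so its effective
    receptive field grows with the group index.
    """
    outputs: List[Group] = []
    for i, x in enumerate(groups):
        if i == 0:
            y = list(x)                       # first group passes through unchanged
        elif i == 1:
            y = op(x)                         # second group: filter only
        else:
            prev = outputs[-1]
            y = op([a + b for a, b in zip(x, prev)])  # add previous output, then filter
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    # Four channel groups of an impulse-like feature; later groups spread it wider.
    feature = [0.0, 0.0, 3.0, 0.0, 0.0]
    for idx, y in enumerate(res2net_split([list(feature) for _ in range(4)]), start=1):
        print(idx, [round(v, 3) for v in y])
```

In a real block the group outputs would be concatenated and passed through a 1x1 convolution; that step is omitted here to keep the sketch short.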

Key words

speaker recognition / end-to-end / attention mechanisms

Classification

Information technology and security science

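Cite this article

Deng Lihong, Deng Fei, Zhang Gexiang, Yang Qiang. Multi-Scale End-to-End Speaker Recognition System Based on Improved Res2Net[J]. Computer Engineering and Applications, 2023, 59(24): 110-120, 11.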

Funding

National Natural Science Foundation of China (61972324)

Sichuan Science and Technology Program (2021YFS0313, 2021YFG0133)

Computer Engineering and Applications

OA / Peking University Core / CSCD / CSTPCD

ISSN 1002-8331
