
Acoustic Scene Classification Method Based on Reducing High-Frequency Reverberation and RF-DRSN-EMA

CAO Yi (曹毅) 1, WANG Yanwen (王彦雯) 1, LI Jie (李杰) 1, ZHENG Zhi (郑植) 1, SUN Hao (孙浩) 1

Journal of South China University of Technology (Natural Science Edition), 2025, Vol. 53, Issue 7: 70-79, 10. DOI: 10.12141/j.issn.1000-565X.240508

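Author information

  • 1. School of Mechanical Engineering / Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Jiangnan University, Wuxi 214122, Jiangsu, China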

Abstract

To address the low classification accuracy and poor generalization of existing acoustic scene classification methods, this paper proposes a novel method based on reducing high-frequency reverberation and a frequency-domain residual shrinkage network with multi-scale attention, named RF-DRSN-EMA. First, following the principle of sound reverberation reduction, the paper introduces a high-frequency reverberation reduction method that attenuates only the high-frequency reverberation while preserving essential frequency information in other bands. As a result, speech intelligibility is enhanced and the impact of speech distortion is minimized. Second, building on the deep residual shrinkage network, the proposed RF-DRSN-EMA integrates an improved frequency-domain self-calibration mechanism and a multi-scale attention module. The network uses an RF self-calibration module with a long-short residual structure to mitigate feature collapse, enabling efficient extraction of frequency-domain information. A multi-scale attention module is then applied at the output of each unit to highlight relevant information, further enhancing the model's representation capacity. Finally, the proposed method is evaluated on three benchmark datasets: ESC-10, UrbanSound8K, and DCASE2020 Task 1A. The results show that the proposed high-frequency reverberation reduction method effectively suppresses high-frequency reverberation and background noise while eliminating redundant features, with minimal degradation of speech quality. The RF-DRSN-EMA network achieves efficient frequency-domain denoising and feature extraction, reaching classification accuracies of 98.00%, 93.42%, and 72.80% on the three datasets, respectively. These results confirm the effectiveness and generalizability of the proposed method.

Key words

acoustic scene classification / reducing high-frequency reverberation / frequency-domain residual shrinkage network / multi-scale attention / speech enhancement

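Classification

Information Technology and Security Science

Cite this article

CAO Yi, WANG Yanwen, LI Jie, ZHENG Zhi, SUN Hao. Acoustic Scene Classification Method Based on Reducing High-Frequency Reverberation and RF-DRSN-EMA[J]. Journal of South China University of Technology (Natural Science Edition), 2025, 53(7): 70-79, 10.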

Funding

Supported by the National Natural Science Foundation of China (52175234) and the Programme of Introducing Talents of Discipline to Universities (B18027).

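Journal of South China University of Technology (Natural Science Edition)

Open access; Peking University Core Journal

ISSN 1000-565X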
