Radio Engineering (无线电工程), 2024, Vol. 54, Issue (8): 1862-1870, 9. DOI: 10.3969/j.issn.1003-3106.2024.08.004
Environmental Sound Classification Based on Multiple Groups of Multi-resolution Features and Wavelet Channel Attention
Abstract
Current Environmental Sound Classification (ESC) methods capture too little time-frequency information during audio feature extraction. To address this, a classification method based on multiple groups of multi-resolution features and wavelet channel attention is proposed. Multiple groups of multi-resolution features serve as the network inputs: multiple groups of filters and multiple frequency resolutions provide data augmentation in both the time and frequency dimensions while also making the features complementary. To better measure the importance of each channel, a wavelet channel attention module is designed for one-dimensional audio image features. The Discrete Wavelet Transform (DWT) combines the low-frequency and high-frequency subbands of the signal to obtain the channel scalars, and Gram-Schmidt orthogonalization diversifies the information extracted by the network during the compression stage of channel attention. A Long Short-Term Memory (LSTM) network retains information over long time spans and improves the long-term reliability of learning. Experimental results show that the method reaches classification accuracies of 98.7% on ESC-10 and 93.6% on ESC-50, providing a new research approach for audio feature processing.
Keywords
Environmental Sound Classification (ESC) / multiple groups of multi-resolution features / wavelet channel attention / Long Short-Term Memory (LSTM) network
Classification
Information Technology and Security Science
Cite this article
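李军, 王子壬, 向彦伯, 钮焱. Environmental Sound Classification Based on Multiple Groups of Multi-resolution Features and Wavelet Channel Attention [J]. Radio Engineering (无线电工程), 2024, 54(8): 1862-1870, 9.
Funding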
National Natural Science Foundation of China (61902116)
Hubei Provincial Teaching and Research Project (2020454)
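Illustrative sketch
The abstract names two building blocks of the channel attention module: a DWT that combines low-frequency and high-frequency subbands into one scalar per channel, and Gram-Schmidt orthogonalization. The abstract does not specify the module's actual design, so the following is only a minimal sketch under stated assumptions: a Haar wavelet, a hypothetical rule that adds the mean of the final low-frequency subband to the root of the accumulated high-frequency energy, and a plain sigmoid gate. All function names are illustrative and are not taken from the paper.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT; returns (low, high) subbands."""
    x = list(signal)
    if len(x) % 2:  # pad odd-length input by repeating the last sample
        x.append(x[-1])
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high


def channel_scalar(channel, levels=2):
    """Squeeze one non-empty channel to a scalar (hypothetical combination rule)."""
    low = list(channel)
    high_energy = 0.0
    for _ in range(levels):
        if len(low) < 2:
            break
        low, high = haar_dwt(low)
        high_energy += sum(h * h for h in high) / len(high)
    low_mean = sum(low) / len(low)
    # Assumption: low-frequency mean plus root of the high-frequency energy.
    return low_mean + math.sqrt(high_energy)


def gram_schmidt(vectors, eps=1e-12):
    """Return an orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            proj = sum(x * y for x, y in zip(w, b))
            w = [x - proj * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > eps:  # drop vectors that are linearly dependent
            basis.append([x / norm for x in w])
    return basis


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def wavelet_channel_attention(channels, levels=2):
    """Reweight each channel by a gate computed from its DWT scalar."""
    weights = [sigmoid(channel_scalar(c, levels)) for c in channels]
    scaled = [[w * x for x in c] for w, c in zip(weights, channels)]
    return scaled, weights


if __name__ == "__main__":
    silent = [0.0] * 8
    noisy = [1.0, -1.0] * 4
    _, weights = wavelet_channel_attention([silent, noisy])
    print(weights)
    print(gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]))
```

Under this toy rule, a channel with strong high-frequency content receives a larger gate value than a silent channel. How the paper's module combines the subbands, and where Gram-Schmidt orthogonalization enters the compression stage, may differ from this sketch.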