Modern Electronics Technique, 2025, Vol. 48, Issue 19: 115-121, 7. DOI: 10.16652/j.issn.1004-373x.2025.19.019
Urban environmental sound detection based on dual random augmentation and hierarchical Transformer
Abstract
The complexity and diversity of urban acoustic scenes challenge traditional sound recognition methods and highlight the need to balance detection capability against computational complexity. To address this, this paper proposes a novel urban environmental sound detection method that aims to strengthen the model's urban environmental sound classification ability while reducing its reliance on computational resources. First, a dual random combination data augmentation strategy is introduced: it generates diverse audio samples by randomly combining different augmentation techniques, enriching the training data and improving the model's generalization ability. Next, an innovative hierarchical audio Transformer is proposed that incorporates a window attention mechanism and a coupled simple attention semantic module; this design effectively improves sound classification performance. Experimental results indicate that the proposed method requires only 32% of the parameters and 15% of the training time of the previous Transformer, and achieves an accuracy of 91.2% on UrbanSound8K, a mean average precision (mAP) of 0.476 on AudioSet, and an accuracy of 97.2% on ESC-50, significantly improving urban environmental sound detection performance.
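The abstract names a dual random combination augmentation strategy but does not spell out its details. The following is a minimal sketch of one plausible reading, not the authors' implementation: the first random draw picks how many and which augmentations to apply, and the second draws each augmentation's parameters. The four augmentations, their parameter ranges, and every function name here (`add_noise`, `random_gain`, `time_shift`, `time_mask`, `dual_random_augment`) are assumptions made for illustration.

```python
import numpy as np

# All augmentations and parameter ranges below are illustrative assumptions,
# not values taken from the paper.

def add_noise(x, rng):
    """Add Gaussian noise at a random SNR between 10 and 30 dB."""
    snr_db = rng.uniform(10.0, 30.0)
    signal_power = np.mean(x ** 2) + 1e-12
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)

def random_gain(x, rng):
    """Scale the waveform by a random gain between -6 and +6 dB."""
    gain_db = rng.uniform(-6.0, 6.0)
    return x * (10.0 ** (gain_db / 20.0))

def time_shift(x, rng):
    """Circularly shift the waveform by up to 10% of its length."""
    max_shift = max(1, len(x) // 10)
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(x, shift)

def time_mask(x, rng):
    """Zero out one random contiguous segment (under 10% of the clip)."""
    width = int(rng.integers(1, max(2, len(x) // 10)))
    start = int(rng.integers(0, len(x) - width + 1))
    y = x.copy()
    y[start:start + width] = 0.0
    return y

AUGMENTATIONS = (add_noise, random_gain, time_shift, time_mask)

def dual_random_augment(x, rng, max_ops=3):
    """Apply a random subset of augmentations, each with random parameters."""
    # First random choice: how many operations, and which ones, in what order.
    upper = min(max_ops, len(AUGMENTATIONS))
    n_ops = int(rng.integers(1, upper + 1))
    chosen = rng.choice(len(AUGMENTATIONS), size=n_ops, replace=False)
    y = np.asarray(x, dtype=np.float64)
    applied = []
    for idx in chosen:
        op = AUGMENTATIONS[int(idx)]
        # Second random choice: each operation samples its own parameters.
        y = op(y, rng)
        applied.append(op.__name__)
    return y, applied

# Usage: augment a one-second 440 Hz tone sampled at 16 kHz.
rng = np.random.default_rng(0)
clip = np.sin(2 * np.pi * 440.0 * np.arange(16000) / 16000.0)
augmented, applied = dual_random_augment(clip, rng)
```

Sampling the subset without replacement keeps any one augmentation from being stacked on itself. Whether the paper imposes the same constraint is not stated in the abstract.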
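Key words
urban environmental sound detection/sound classification/deep learning/Transformer/data augmentation/attention mechanism
Classification
Information technology and security science
Cite this article
Fu Yuzhe, Wang Mei, Kan Ruixiang, Qiu Hongbing. Urban environmental sound detection based on dual random augmentation and hierarchical Transformer [J]. Modern Electronics Technique, 2025, 48(19): 115-121, 7.
Funding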
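National Natural Science Foundation of China (62071135)
National Natural Science Foundation of China (61961010)
Guilin University of Electronic Technology Graduate Innovation Project (2023YCXB05)
Guangxi Science and Technology Major Project (桂科AA23062035)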