Journal of Hunan University (Natural Sciences), 2023, Vol. 50, Issue 12: 59-68, 10. DOI: 10.16339/j.cnki.hdxbzkb.2023298
Neighborhood Adaptive Attention Based Cross-domain Fusion Network for Speech Enhancement (original title: 基于邻域自适应注意力的跨域融合语音增强)
Abstract
Deep learning (DL) based speech enhancement methods can be divided into time-domain and frequency-domain methods, each with its own strengths. To exploit the advantages of both, a cross-domain speech enhancement model based on a neighborhood adaptive attention mechanism is proposed. The model enhances the input waveform and spectrum simultaneously, and the final result is obtained by cross-domain fusion of the time-domain and frequency-domain enhancement results. To exploit the complementary information in the two enhanced results, an information communication module is proposed to exchange information between them. To improve the feature extraction ability of the time-domain and frequency-domain enhancement models, and to make full use of the signal characteristics of both domains, a neighborhood adaptive attention module is proposed. This module adaptively aggregates local self-attention with different neighborhood sizes according to the input and thereby models stationary features at different scales. Experimental results show that adding the neighborhood adaptive attention module and the cross-domain information exchange and fusion modules allows the complementary characteristics of waveform and spectrum to be used effectively, further improving enhancement performance.
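The aggregation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain dot-product attention without learned query/key/value projections, a gate computed from the mean input frame, and hypothetical names (`local_self_attention`, `neighborhood_adaptive_attention`, `radii`, `gate_weights`) and toy values chosen only for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def local_self_attention(frames, radius):
    """Each frame t attends only to frames within `radius` steps of t."""
    dim = len(frames[0])
    out = []
    for t, query in enumerate(frames):
        lo, hi = max(0, t - radius), min(len(frames), t + radius + 1)
        window = frames[lo:hi]
        # Scaled dot-product scores between the query and its neighbors.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in window]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, window))
                    for j in range(dim)])
    return out


def neighborhood_adaptive_attention(frames, radii, gate_weights):
    """Blend local attention outputs of several radii with an input-dependent gate.

    gate_weights holds one hypothetical weight vector per radius; the gate is a
    softmax over their dot products with the mean input frame (an assumption).
    """
    dim = len(frames[0])
    mean = [sum(f[j] for f in frames) / len(frames) for j in range(dim)]
    gate = softmax([sum(m * w for m, w in zip(mean, vec)) for vec in gate_weights])
    branches = [local_self_attention(frames, r) for r in radii]
    fused = [[sum(g * b[t][j] for g, b in zip(gate, branches))
              for j in range(dim)]
             for t in range(len(frames))]
    return fused, gate


# Toy input: 8 frames of 2-dimensional features.
frames = [[math.sin(0.3 * t), math.cos(0.3 * t)] for t in range(8)]
radii = (1, 2, 4)
gate_weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]  # illustrative values only
fused, gate = neighborhood_adaptive_attention(frames, radii, gate_weights)
print(len(fused), len(fused[0]), gate)
```

Because the gate depends on the input, different signals weight the small and large neighborhoods differently, which is the sense in which the aggregation is adaptive in this sketch.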
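Keywords
speech enhancement / self-attention / cross-domain fusion
Classification
Computers and Automation
Cite this article
YUE Huanjing, DUO Wenxin, YANG Jingyu. Neighborhood Adaptive Attention Based Cross-domain Fusion Network for Speech Enhancement [J]. Journal of Hunan University (Natural Sciences), 2023, 50(12): 59-68, 10.
Funding
National Natural Science Foundation of China (62072331)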