Computer and Modernization, Issue (3): 22-28, 37, 8. DOI: 10.3969/j.issn.1006-2475.2025.03.004
Speech Enhancement Algorithm Based on Parallel Cascaded Time-frequency Conformer Generative Adversarial Network
Abstract
Generative adversarial networks continuously improve their mapping capability through adversarial training, which gives them strong noise-reduction ability, and they are widely used in speech enhancement. Existing generative adversarial network speech enhancement methods do not fully exploit the time-frequency correlation and global correlation in the speech feature sequence, and their denoising performance is poor. To address this, the paper proposes a parallel cascaded time-frequency Conformer generative adversarial network for single-channel speech enhancement. First, the parallel cascaded time-frequency Conformer models the sequential features of time and frequency in the speech spectrogram, extracting local and global dependencies in the time domain and the frequency domain for the generator to learn. Then, two decoder paths learn, respectively, the amplitude mask of the noisy speech spectrogram and the spectrogram of the clean speech, and the outputs of the two paths are fused to obtain the generated speech. Finally, an indicator discriminator scores the speech produced by the generator on the relevant evaluation metrics, and adversarial training improves the quality of the generated speech. The method is verified on the public VoiceBank+Demand dataset.
Keywords
speech enhancement/generative adversarial network/time-frequency Conformer/indicator discriminator/adversarial training
Classification
Information Technology and Security Science
Cite this article
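Wang Zeyu (王泽宇), Han Jianning (韩建宁), Hao Guodong (郝国栋), Yang Run (杨润). Speech Enhancement Algorithm Based on Parallel Cascaded Time-frequency Conformer Generative Adversarial Network [J]. Computer and Modernization, 2025, (3): 22-28, 37, 8.
Funding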
Shanxi Province Research Funding Project for Returned Overseas Scholars (2023-127)
Shanxi Provincial Natural Science Foundation, General Program (202103021224201)