An Attention-Based Model for Time-Frequency Domain Speech Enhancement

林攀 (Lin Pan), 何儒汉 (He Ruhan)

Software Guide (软件导刊), 2024, Vol. 23, Issue 1: 63-68 (6 pages). DOI: 10.11907/rjdk.222325

An Attention-Based Model for Time-Domain Frequency Speech Enhancement

林攀 (Lin Pan)¹, 何儒汉 (He Ruhan)¹

Author information

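1. School of Computer Science and Artificial Intelligence, Wuhan Textile University; Hubei Province Engineering Technology Research Center for Clothing Informatization, Wuhan, Hubei 430200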

Abstract

To address the phase mismatch problem of single-channel speech enhancement in the frequency domain, a joint time-domain and frequency-domain speech enhancement algorithm is proposed that jointly optimises the learning targets of both domains during training. An attention mechanism, which simulates human auditory perception, is added to strengthen the model's ability to suppress noise. Dilated convolution is used to widen the receptive field, so that more input-layer information is fused and local features are extracted effectively in both the time and frequency domains. To improve enhancement performance, the time-domain and frequency-domain loss functions are optimised jointly. To validate the proposed method, extensive experiments are conducted on the VoiceBank dataset with a residual temporal convolutional network (TCN) as the baseline model; the experimental results show better enhancement than a single baseline model operating only in the time domain or only in the frequency domain. After denoising, the perceptual evaluation of speech quality (PESQ) score was 3.06 and the scale-invariant signal-to-distortion ratio (SI-SDR) was 20.00.
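
To make the joint objective and the SI-SDR metric concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the loss formulas, so the L1 form of both terms, the mixing weight `weight`, the frame length and hop size, and every function name below are assumptions chosen for illustration. The SI-SDR function follows the commonly used zero-mean definition.

```python
import cmath
import math


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB (zero-mean variant)."""
    n = len(reference)
    mean_e = sum(estimate) / n
    mean_r = sum(reference) / n
    e = [x - mean_e for x in estimate]
    r = [x - mean_r for x in reference]
    # Project the estimate onto the reference to get the scaled target.
    alpha = sum(a * b for a, b in zip(e, r)) / (sum(b * b for b in r) + eps)
    target = [alpha * b for b in r]
    residual = [a - t for a, t in zip(e, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(x * x for x in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


def frame_magnitudes(signal, frame_len=16, hop=8):
    """Magnitude spectra of Hann-windowed frames, computed with a naive DFT."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * k / frame_len)
              for k in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + k] * window[k] for k in range(frame_len)]
        mags = []
        for b in range(frame_len // 2 + 1):
            acc = sum(seg[k] * cmath.exp(-2j * math.pi * b * k / frame_len)
                      for k in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def joint_loss(estimate, reference, weight=0.5):
    """Assumed joint objective: weighted sum of a waveform L1 term and a
    spectral-magnitude L1 term. Signals must be at least one frame long."""
    time_term = sum(abs(a - b) for a, b in zip(estimate, reference)) / len(reference)
    est_mag = frame_magnitudes(estimate)
    ref_mag = frame_magnitudes(reference)
    diffs = [abs(a - b)
             for fe, fr in zip(est_mag, ref_mag)
             for a, b in zip(fe, fr)]
    freq_term = sum(diffs) / len(diffs)
    return weight * time_term + (1.0 - weight) * freq_term


if __name__ == "__main__":
    # Toy signals: a clean sinusoid and a copy with a small additive disturbance.
    clean = [math.sin(2.0 * math.pi * 4 * i / 64) for i in range(64)]
    degraded = [c + 0.1 * math.sin(1.7 * i) for i, c in enumerate(clean)]
    print("joint loss:", joint_loss(degraded, clean))
    print("SI-SDR (dB):", si_sdr(degraded, clean))
```

In this sketch the magnitude term ignores phase entirely, so only the waveform term penalises phase errors. That is one plausible reading of why a time-domain target is paired with a frequency-domain one; a practical model would compute both terms on tensors with an automatic-differentiation framework rather than on Python lists.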

Key words

speech enhancement/time domain/frequency domain/TCN/attention module

Classification

Information Technology and Security Science

Cite this article

林攀 (Lin Pan), 何儒汉 (He Ruhan). An Attention-Based Model for Time-Domain Frequency Speech Enhancement [J]. Software Guide (软件导刊), 2024, 23(1): 63-68.

Software Guide (软件导刊)

ISSN: 1672-7800
