Speech Enhancement Based on the Logarithmic Processing and Time Frequency Masking Estimation

Wang Xianyun (王显云)¹, Dou Shanshan (窦姗姗)¹, Cheng Chuhao (程楚皓)¹

Electronic Science and Technology (电子科技), 2025, Vol. 38, Issue 1: 45-51, 72, 8. DOI: 10.16180/j.cnki.issn1007-7820.2025.01.007

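Author information

1. Acoustics Department, The Third Research Institute of China Electronics Technology Group Corporation, Beijing 100015, China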

Abstract

To address inaccurate speech estimation by time-spectrum models, this study proposes a model transformation method to obtain the logarithmic probability density functions of noise and speech. Using the logarithmic relationship among noisy speech, clean speech, and noise, together with MMSE (Minimum Mean Square Error) estimation theory, a time-frequency mask for estimating the log-spectrum of speech is derived. A soft mask is also derived from the logarithmic probability distributions of speech and noise; it weights the logarithmic subbands of noisy speech to reduce noise and improve the accuracy of speech estimation. Simulation results show that, compared with unprocessed noisy speech, the proposed method improves noise suppression by more than 3 dB. The MMSE-based time-frequency mask and the soft mask improve auditory perception by 27.7% and 29.4% on average, and intelligibility by 12.7% and 14.3% on average, respectively.
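The abstract does not give the derived masks themselves, so the following is only a generic illustration of how a soft mask can be computed from log-domain probability distributions of speech and noise. It is a minimal sketch that swaps in the well-known log-max (MixMax) approximation, in which the noisy log-spectrum is treated as the maximum of the speech and noise log-spectra, and it assumes a single Gaussian log-spectral model per frequency bin. The function name `logmax_soft_mask` and the parameters `mu_s`, `sigma_s`, `mu_n`, `sigma_n` are hypothetical and do not come from the paper.

```python
import math
import numpy as np

# Element-wise error function without a SciPy dependency.
_erf = np.vectorize(math.erf, otypes=[float])


def _norm_pdf(y, mu, sigma):
    """Gaussian density N(y; mu, sigma^2), evaluated element-wise."""
    z = (y - mu) / sigma
    return np.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def _norm_cdf(y, mu, sigma):
    """Gaussian cumulative distribution P(v <= y), evaluated element-wise."""
    return 0.5 * (1.0 + _erf((y - mu) / (sigma * math.sqrt(2.0))))


def logmax_soft_mask(log_noisy, mu_s, sigma_s, mu_n, sigma_n, eps=1e-12):
    """Posterior probability that speech dominates each time-frequency bin.

    Assumption (not from the paper): log_noisy ~= max(log_speech, log_noise),
    with independent Gaussian models for the speech and noise log-spectra.
    """
    # Speech equals the observation while noise lies at or below it.
    speech_dom = _norm_pdf(log_noisy, mu_s, sigma_s) * _norm_cdf(log_noisy, mu_n, sigma_n)
    # Noise equals the observation while speech lies at or below it.
    noise_dom = _norm_pdf(log_noisy, mu_n, sigma_n) * _norm_cdf(log_noisy, mu_s, sigma_s)
    return speech_dom / (speech_dom + noise_dom + eps)


if __name__ == "__main__":
    # Hypothetical log-power values for three bins of one frame.
    y = np.array([0.0, -1.0, -4.0])
    mask = logmax_soft_mask(y, mu_s=0.0, sigma_s=2.0, mu_n=-5.0, sigma_n=1.0)
    print(mask)  # values in [0, 1]; larger means speech more likely dominates
```

A mask of this kind lies between 0 and 1 and can serve as a per-bin weight. How the paper applies its soft mask to the logarithmic subbands, and how its MMSE-based mask differs, is not specified in the abstract and is not reproduced here.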

Key words

speech processing; speech enhancement; logarithmic probability density function; time-frequency mask; noise suppression; unsupervised learning; soft mask; logarithmic spectrum

Classification

Information Technology and Security Science

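Cite this article

Wang Xianyun, Dou Shanshan, Cheng Chuhao. Speech Enhancement Based on the Logarithmic Processing and Time Frequency Masking Estimation [J]. Electronic Science and Technology, 2025, 38(1): 45-51, 72, 8.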

Funding

Fund of the Science and Technology on Near-Surface Detection Laboratory (6142414210405)

Electronic Science and Technology (电子科技)

ISSN: 1007-7820
