Multi-channel Linear Prediction for Speech Dereverberation Using Cross-Band Filters and Sparse Priors

Indexing: Open Access; Peking University Core Journals (北大核心); CSTPCD

Abstract

Multi-channel linear prediction (MCLP) is one of the most popular speech dereverberation methods. Most existing studies adopt the band-to-band spectral subtraction model to obtain the desired speech signal independently in each frequency band, which neglects the interaction between different frequency bands. This paper proposes an MCLP-based speech dereverberation method that uses the cross-band spectral subtraction model instead. The proposed model employs cross-band filters to account for the interactions between different frequency bands. The desired signal is modeled with the complex generalized Gaussian (CGG) distribution. Compared with the commonly used Gaussian distribution, the CGG distribution can capture the sparse nature of speech signals through its adjustable shape parameter. Within the maximum likelihood estimation framework, speech dereverberation is formulated as an optimization problem over the band-to-band and cross-band filters, and an optimization algorithm with guaranteed convergence is derived using the majorization-minimization method. A series of speech dereverberation experiments under various reverberation times, channel numbers, and source-to-microphone distances demonstrates that the proposed method significantly outperforms traditional dereverberation algorithms.
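The general idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the function name `mclp_cgg`, its parameters (`delay`, `taps`, `cross`, `p`, `iters`, `eps`), and the way neighbouring bands are stacked into the regressor are all assumptions made for illustration. The sketch predicts the late reverberation in band k from delayed frames of band k and its `cross` neighbouring bands, and handles a CGG-type sparse prior with shape parameter `p` by the standard majorization-minimization step, which reduces each iteration to a weighted least-squares problem.

```python
import numpy as np

def mclp_cgg(Y, delay=2, taps=4, cross=1, p=0.5, iters=5, eps=1e-8):
    """Illustrative MCLP with cross-band regressors and a sparse prior.

    Y: complex STFT of shape (K bands, M mics, T frames).
    Returns the estimated desired signal at mic 0, shape (K, T).
    All parameter names are hypothetical, not taken from the paper.
    """
    K, M, T = Y.shape
    D = np.empty((K, T), dtype=complex)
    for k in range(K):
        # Band k plus up to `cross` neighbours on each side (assumed layout).
        bands = range(max(0, k - cross), min(K, k + cross + 1))
        cols = []
        for b in bands:
            for m in range(M):
                for tau in range(delay, delay + taps):
                    col = np.zeros(T, dtype=complex)
                    col[tau:] = Y[b, m, :T - tau]   # frame t - tau, zero-padded
                    cols.append(col)
        X = np.stack(cols, axis=1)                  # (T, L) regressor matrix
        y = Y[k, 0]                                 # reference microphone
        d = y.copy()                                # start from zero filters
        for _ in range(iters):
            # MM step: |d|^p is majorized by a weighted quadratic in d.
            w = (np.abs(d) ** 2 + eps) ** (p / 2 - 1)
            Xw = X.conj().T * w                     # X^H W, shape (L, T)
            R = Xw @ X + eps * np.eye(X.shape[1])   # small ridge for stability
            g = np.linalg.solve(R, Xw @ y)          # weighted least squares
            d = y - X @ g                           # prediction residual
        D[k] = d
    return D
```

Setting `cross=0` reduces the sketch to band-to-band prediction in each band independently, and `p=2` gives uniform weights, i.e. ordinary least squares under a Gaussian prior. Smaller values of `p` put more weight on low-energy frames, which is how the sparse prior enters.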

KANG Yao (康瑶); KANG Fang (康坊); YANG Feiran (杨飞然)

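Department of Digitalization, The Open University of China, Beijing 100039
Center for Machine Vision and Signal Analysis, University of Oulu, Oulu 90570
Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190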

Electronic and Information Engineering

speech dereverberation; multi-channel linear prediction; cross-band filter; complex generalized Gaussian distribution; majorization-minimization

Journal of Data Acquisition and Processing (数据采集与处理), 2024, No. 5

pp. 1135-1146 (12 pages)

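General Program of the National Natural Science Foundation of China (62171438); Beijing Natural Science Foundation (4242013); "Frontier Exploration" self-deployed project of the Institute of Acoustics, Chinese Academy of Sciences (QYTS202111); 2023 Key Research Project of The Open University of China (Z23C0007).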

10.16337/j.1004-9037.2024.05.007
