Speech segmentation based on preprocessing DOA estimation and fundamental frequency dual input

WANG Mei (王玫), CHENG Jiali (成家礼)

Journal of Guilin University of Electronic Technology, 2024, Vol. 44, Issue 4: 348-354, 7.

DOI: 10.16725/j.1673-808X.202333

Speech segmentation based on preprocessing DOA estimation and fundamental frequency dual input

WANG Mei¹, CHENG Jiali²

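Author information

1. Key Laboratory of Cognitive Radio and Information Processing (Ministry of Education), Guilin University of Electronic Technology, Guilin, Guangxi 541004; College of Physics and Electronic Information Engineering, Guilin University of Technology, Guilin, Guangxi 541006
2. Key Laboratory of Cognitive Radio and Information Processing (Ministry of Education), Guilin University of Electronic Technology, Guilin, Guangxi 541004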
Abstract

Speech segmentation is an important component of speech separation systems and plays a key role in applications such as source estimation and automatic speech recognition in multi-speaker environments and multi-source target tracking. Segmenting overlapping speech has long been the focus of work in this area. In real rooms, the speech signals captured by microphones usually contain reverberation and noise, which degrade the quality of the received signal and reduce the accuracy of the estimated direction-of-arrival (DOA) features, in turn degrading segmentation performance on multi-source overlapping speech. To address the poor robustness of existing multi-source segmentation methods to noise and reverberation, a method is proposed that removes evidently abnormal noise and reverberation from the speech signal through preprocessing. The method processes the original speech signal with a generalized sidelobe canceller combined with a post-filter implemented as a Wiener filter, removing reverberation and noise; this improves speech quality and, in turn, yields more accurate DOA feature estimates. Segmentation is then performed by simultaneously tracking each speaker's fundamental frequency features and DOA features with multiple hypothesis tracking. Sixteen meeting recordings from the AMI corpus containing multi-source overlapping speech are analyzed statistically, and the results show that the average hit rate (HIT) improves by 2.10% compared with the method without preprocessing.
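
The abstract names a Wiener post-filter but does not give its equations. As a rough illustration only, and not the authors' implementation, the sketch below computes a textbook per-bin Wiener gain G = SNR / (1 + SNR) for one frame of a beamformer output spectrum. The function names, the spectral-subtraction SNR estimate, and the `gain_floor` parameter are all assumptions introduced here for the example.

```python
def wiener_gain(noisy_psd, noise_psd, gain_floor=0.05):
    """Per-bin Wiener gain G = SNR / (1 + SNR).

    Assumption: the SNR is estimated by power spectral subtraction;
    gain_floor is a hypothetical lower bound that limits musical noise.
    """
    gains = []
    for noisy, noise in zip(noisy_psd, noise_psd):
        # Estimated clean-speech power, clipped at zero where noise dominates.
        speech = max(noisy - noise, 0.0)
        # Guard against division by zero in silent bins.
        snr = speech / max(noise, 1e-12)
        gains.append(max(snr / (1.0 + snr), gain_floor))
    return gains


def apply_post_filter(spectrum, noise_psd, gain_floor=0.05):
    """Scale each complex bin of one spectral frame by its Wiener gain."""
    noisy_psd = [abs(x) ** 2 for x in spectrum]
    gains = wiener_gain(noisy_psd, noise_psd, gain_floor)
    return [g * x for g, x in zip(gains, spectrum)]


# Bin 0 has 6 dB more power than the noise estimate; bin 1 is pure noise.
print(wiener_gain([4.0, 1.0], [1.0, 1.0]))  # [0.75, 0.05]
```

In this sketch a bin dominated by noise is attenuated down to the floor rather than zeroed, which is a common design choice in speech enhancement; how the paper estimates the noise spectrum and whether it applies a floor are not stated in the abstract.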

Keywords

speech segmentation; generalized sidelobe canceller; Wiener filter; direction of arrival; multiple hypothesis tracking; fundamental frequency

Classification

Information technology and security science

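Cite this article

WANG Mei, CHENG Jiali. Speech segmentation based on preprocessing DOA estimation and fundamental frequency dual input [J]. Journal of Guilin University of Electronic Technology, 2024, 44(4): 348-354, 7.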

Funding

National Natural Science Foundation of China (62071135)

Guangxi Natural Science Foundation (2019GXNSFBA245103)

Innovation Project of Graduate Education, Guilin University of Electronic Technology (2021YCXS037)

Journal of Guilin University of Electronic Technology

ISSN: 1673-808X
