信号处理 (Journal of Signal Processing), 2025, Vol. 41, Issue 9: 1581-1590, 10. DOI: 10.12466/xhcl.2025.09.011
基于生成对抗网络的到达时间差估计器
Time Difference of Arrival Estimator Based on Generative Adversarial Networks
Abstract
The time difference of arrival (TDOA) is a crucial acoustic spatial characteristic that is widely used in multichannel audio signal processing. Traditional TDOA estimators, such as the generalized cross-correlation with phase transform (GCC-PHAT) method, perform well under ideal acoustic conditions, but their accuracy deteriorates significantly under low signal-to-noise ratio (SNR) and strong reverberation. Recent advances in deep learning have spurred the development of data-driven TDOA estimators that achieve high estimation accuracy yet remain insufficiently robust under severe noise and high reverberation. To address these limitations, this paper proposes a generative adversarial network (GAN)-based TDOA estimator that improves robustness in low-SNR and highly reverberant environments through adversarial training. This study is the first to propose a GAN-based TDOA estimation framework, in which adversarial training between the generator and the discriminator significantly improves model generalization. The generator employs gated recurrent units (GRUs) for dimensional expansion of the raw audio signals and extracts GCC-PHAT-based cross-correlation features to enhance the model's sensitivity to time-delay information. The convolutional neural network (CNN)-based discriminator uses multilayer convolutional structures to extract high-dimensional features, which are then fused with either the ground-truth or the predicted TDOA values to obtain confidence scores. The generator is optimized with a joint loss function that combines cross-entropy and adversarial losses, while the discriminator becomes better at distinguishing real from generated TDOA estimates. The design draws on principles from Wasserstein GANs (WGANs) by integrating the discriminator's output confidence scores into the generator's loss function. This approach not only substantially stabilizes model training but also effectively resolves mode collapse, so that performance exceeds what conventional single-loss-function training schemes achieve. To validate the proposed method, we conducted comparative experiments on public datasets against the classical GCC-PHAT method and state-of-the-art deep learning-based TDOA estimators. The experimental results demonstrate that our method achieves superior performance in acoustic environments characterized by low SNRs and strong reverberation, statistically outperforming all baseline methods in TDOA estimation accuracy.
Keywords
time difference of arrival / sound source localization / generative adversarial networks
Classification
Information Technology and Security Science
Cite this article
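代浩阳 (Dai Haoyang), 呼德 (Hu De). 基于生成对抗网络的到达时间差估计器 [Time Difference of Arrival Estimator Based on Generative Adversarial Networks][J]. 信号处理, 2025, 41(9): 1581-1590, 10.
Funding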
National Natural Science Foundation of China (62201297, 62361045)
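
For readers unfamiliar with the GCC-PHAT baseline named in the abstract, the following is a minimal sketch of the classical estimator only. It is not the GAN-based estimator the paper proposes, and it is not taken from the paper. The function name `gcc_phat`, its parameters, and the small regularization constant are illustrative assumptions.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref`, in seconds, with GCC-PHAT.

    A positive result means `sig` lags `ref`. Names and defaults here are
    illustrative assumptions, not taken from the paper.
    """
    # Zero-pad to the combined length so the circular correlation computed
    # via the FFT equals the linear cross-correlation (no wrap-around).
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, then PHAT weighting: discard the magnitude and
    # keep only the phase. The small constant avoids division by zero.
    cross = SIG * np.conj(REF)
    cross = cross / (np.abs(cross) + 1e-12)

    # Back to the lag domain: the generalized cross-correlation.
    cc = np.fft.irfft(cross, n=n)

    # Optionally restrict the search to physically plausible lags.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / float(fs)

if __name__ == "__main__":
    # Synthetic check: white noise delayed by 5 samples at 16 kHz.
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(4096)
    sig = np.concatenate((np.zeros(5), ref))
    print(gcc_phat(sig, ref, fs=16000) * 16000)  # estimated delay in samples
```

The PHAT weighting whitens the cross-power spectrum, which sharpens the correlation peak in clean conditions. It also gives noise-dominated frequency bins the same weight as signal-dominated ones, which is one common explanation for the degradation at low SNR that the abstract describes.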