Journal on Communications (通信学报), 2026, Vol. 47, Issue (3): 170-183, 14. DOI: 10.11959/j.issn.1000-436x.2026054
Synthetic spoofing speech detection method using temporal Gaussian synergistic model
(Original title: 采用时域高斯协同模型的合成伪造语音检测方法)
Abstract
To mitigate the adverse impact of inconsistent silence distribution on synthetic spoofing speech detection performance and to improve detection accuracy and generalization capability, a synthetic spoofing speech detection method based on a temporal-Gaussian synergistic model (TGSM) was proposed. A unified Gaussian mixture model (GMM) was constructed to extract log Gaussian posterior (LGP) features from both bona fide and spoofed speech, thereby avoiding the information fragmentation and parameter redundancy caused by traditional independent modeling strategies and enhancing the discriminability of the feature space. To address silence-related interference, a statistical energy-based threshold suppression mechanism was introduced, which adaptively suppressed low-energy speech segments while preserving potentially discriminative information. Furthermore, two-dimensional convolution was employed to jointly model the temporal-Gaussian component structure, followed by graph construction in both the temporal and component domains. A heterogeneous graph attention network was then used to learn cross-domain feature interactions. Experimental results on the ASVspoof 2021 dataset demonstrate that the proposed method reduces the equal error rate (EER) by 12.07% and the tandem detection cost function (t-DCF) by 12.14% compared with the baseline model, validating the effectiveness and generalization capability of the proposed method.
Key words
synthetic spoofing speech detection / log Gaussian posterior feature / unified modeling / threshold suppression
Classification
Information technology and security science
Cite this article
简志华, 梁承涵, 朱峰满. Synthetic spoofing speech detection method using temporal Gaussian synergistic model[J]. Journal on Communications, 2026, 47(3): 170-183, 14.
Funding
The National Natural Science Foundation of China (No.61772166)
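Illustrative sketch
The log Gaussian posterior (LGP) features and the energy-based threshold suppression described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the diagonal-covariance GMM, the threshold rule (mean minus k standard deviations of the frame log-energies), and the parameters `k` and `floor` are all assumptions made for illustration; the record above does not state the exact formulas.

```python
import math
import statistics


def log_gaussian_posteriors(frame, weights, means, variances):
    """Log posterior of each GMM component given one feature frame.

    Assumes a diagonal-covariance GMM (an illustrative choice).
    frame: list of floats; weights: one float per component;
    means / variances: one list of floats per component.
    """
    log_joint = []
    for w, mu, var in zip(weights, means, variances):
        # log w_k + log N(frame | mu_k, diag(var_k))
        ll = math.log(w)
        for x, m, v in zip(frame, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        log_joint.append(ll)
    # Log-sum-exp normalisation for numerical stability.
    peak = max(log_joint)
    log_evidence = peak + math.log(sum(math.exp(v - peak) for v in log_joint))
    return [v - log_evidence for v in log_joint]


def suppression_gains(log_energies, k=1.0, floor=0.1):
    """Per-frame gains: frames below a statistical energy threshold are
    attenuated to `floor` instead of being removed outright.

    The threshold mean - k * std is a hypothetical rule, not taken
    from the paper.
    """
    threshold = statistics.mean(log_energies) - k * statistics.pstdev(log_energies)
    return [floor if e < threshold else 1.0 for e in log_energies]


# Toy usage: a two-component, one-dimensional GMM and four frames,
# the last of which is much quieter than the others.
lgp = log_gaussian_posteriors([0.0], [0.5, 0.5], [[0.0], [4.0]], [[1.0], [1.0]])
gains = suppression_gains([-2.0, -1.8, -2.2, -9.0])
print(lgp, gains)
```

Attenuating low-energy frames rather than deleting them is one way to reduce the influence of silence while keeping whatever discriminative information those frames may carry, which is the goal the abstract states for the suppression step.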