Spoken term detection and fine-grained localization method based on multi-scale distance matrices

Li Xiangrui (李祥瑞), Mao Qirong (毛启容)

Application Research of Computers, 2024, Vol. 41, Issue 11: 3370-3375, 6. DOI: 10.19734/j.issn.1001-3695.2024.03.0097

Spoken term detection and fine-grained localization method based on multi-scale distance matrices

Li Xiangrui 1, Mao Qirong 2

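Author information

  • 1. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013
  • 2. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013 || Jiangsu Engineering Research Center of Big Data Ubiquitous Perception and Intelligent Agriculture Applications, Zhenjiang, Jiangsu 212013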

Abstract

To address the low localization accuracy of existing spoken term detection methods, this paper proposes a spoken term detection and fine-grained localization method based on multi-scale distance matrices (MF-STD). The method first uses a residual convolutional network to extract features and constructs a distance matrix to model the correlation between the inputs. It then learns localization information at different scales through multi-scale segmentation and decoupled heads. Finally, the model is optimized with a multi-scale weighted localization loss, a confidence loss, and a classification loss, which enables fine-grained prediction of keyword existence and time-domain boundaries. Experimental results on the LibriSpeech dataset show that, for in-vocabulary detection, precision and intersection over union (IoU) reach 97.1% and 88.6%, respectively; for out-of-vocabulary detection, they reach 96.7% and 88.2%, respectively. Compared with existing spoken term detection and localization methods, MF-STD significantly improves detection accuracy and localization precision. These results demonstrate the superiority of the proposed method and the effectiveness of multi-scale feature modeling and fine-grained localization constraints in spoken term detection tasks.
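Two quantities named in the abstract, the distance matrix between the inputs and the IoU of predicted time boundaries, can be illustrated with a minimal sketch. The abstract does not state which distance metric MF-STD uses or how its features are shaped, so cosine distance over per-frame feature vectors is an assumption here, and the function names `cosine_distance_matrix` and `temporal_iou` are hypothetical, not the authors' code.

```python
import math


def cosine_distance_matrix(query, utterance):
    """Return D with D[i][j] = 1 - cos(query[i], utterance[j]).

    Assumption: cosine distance over per-frame feature vectors; the
    abstract does not specify the metric used by MF-STD.
    """
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    matrix = []
    for q in query:
        row = []
        for u in utterance:
            denom = norm(q) * norm(u)
            dot = sum(a * b for a, b in zip(q, u))
            # Treat zero vectors as maximally distant instead of dividing by zero.
            row.append(1.0 - dot / denom if denom > 0 else 1.0)
        matrix.append(row)
    return matrix


def temporal_iou(pred, target):
    """IoU of two time intervals, each given as (start, end) in seconds."""
    inter = max(0.0, min(pred[1], target[1]) - max(pred[0], target[0]))
    union = (pred[1] - pred[0]) + (target[1] - target[0]) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Toy 2-D "features": one query frame against two utterance frames.
    print(cosine_distance_matrix([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]))
    # Predicted keyword span vs. reference span, in seconds.
    print(temporal_iou((1.0, 2.0), (1.5, 2.5)))
```

Low values in the matrix mark frame pairs whose features are similar, which is the kind of correlation pattern a localization head could read boundaries from; the IoU function mirrors the localization metric the abstract reports.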

Key words

spoken term detection/speech fine-grained localization/multi-scale detection/convolutional residual network

Classification

Information technology and security science

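Cite this article

Li Xiangrui, Mao Qirong. Spoken term detection and fine-grained localization method based on multi-scale distance matrices[J]. Application Research of Computers, 2024, 41(11): 3370-3375, 6.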

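Funding

Jiangsu Provincial Key Research and Development Program (BE2020036)

Special Scientific Research Project of the School of Emergency Management, Jiangsu University (KY-A-01)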

Application Research of Computers

OA / Peking University Core Journal / CSTPCD

ISSN 1001-3695
