Survey on Video-Text Cross-Modal Retrieval

Open Access (OA); Peking University Core Journals (北大核心); CSTPCD

Modalities are the specific forms in which data exist. The rapid growth of data in different modalities has drawn wide attention to multimodal learning. As an important branch of multimodal learning, cross-modal retrieval has advanced significantly for image-text tasks. Compared with images, however, videos carry data from more modalities and contain a broader range of information, which meets users' demands for comprehensive and flexible information retrieval. Consequently, video-text cross-modal retrieval has become a research hotspot in recent years. To provide a thorough understanding of video-text cross-modal retrieval and its state-of-the-art developments, this paper reviews and summarizes existing representative methods. First, current deep learning-based unidirectional and bidirectional video-text cross-modal retrieval methods are categorized and analyzed, with a detailed examination of the classic works in each category and their strengths and weaknesses. Next, from an experimental perspective, benchmark datasets and evaluation metrics for video-text cross-modal retrieval are introduced, and the performance of several typical methods is compared on multiple commonly used benchmark datasets. Finally, the application prospects, open problems, and future research challenges of video-text cross-modal retrieval are discussed.

Chen Lei (陈磊); Xi Yimeng (习怡萌); Liu Libo (刘立波)

School of Information Engineering, Ningxia University, Yinchuan 750021, China

Computer Science and Automation

multi-modality; cross-modal retrieval; deep learning; feature extraction

Computer Engineering and Applications (《计算机工程与应用》), 2024 (004)

pp. 1-20 (20 pages)

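National Natural Science Foundation of China (62262053); Ningxia Science and Technology Innovation Leading Talent Program (2022GKLRLX03); Ningxia University Graduate Innovation Project (CXXM202357).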

10.3778/j.issn.1002-8331.2306-0382
