Modern Information Technology (现代信息科技), 2026, Vol. 10, Issue 1: 41-46, 6. DOI: 10.19850/j.cnki.2096-4706.2026.01.009
Research on Attention Region Optimization Method for Image-text Cross-modal Retrieval Based on Dual Filtering and Dynamic Completion
Abstract
Current image-text cross-modal retrieval faces two main bottlenecks. First, the traditional attention mechanism often includes a large number of redundant regions, which introduces irrelevant semantic noise. Second, excessive screening leaves too few effective regions and causes the loss of key visual information. Both situations significantly reduce the matching accuracy and robustness of the model. To address these problems, this paper proposes a dual optimization strategy. It first adaptively retains high-response regions through a dual filtering mechanism to suppress redundant noise. It then introduces a Top-K dynamic completion method that automatically supplements key semantic regions when feature deficiency is detected. Experimental verification shows that the method avoids the loss of key information while maintaining the accuracy of feature selection, and that it significantly improves the cross-modal matching performance of the model in complex scenes.
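The filter-then-complete idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the two filters or how feature deficiency is detected, so everything below is an assumption made for illustration: the function name `select_regions`, the absolute threshold `abs_threshold`, the peak-relative ratio `rel_ratio`, and the use of "fewer than `k` surviving regions" as the deficiency test are all hypothetical and are not taken from the paper.

```python
from typing import List


def select_regions(scores: List[float],
                   abs_threshold: float = 0.2,
                   rel_ratio: float = 0.5,
                   k: int = 3) -> List[int]:
    """Return the sorted indices of image regions kept for matching.

    scores: one attention response per image region (hypothetical input).
    """
    if not scores:
        return []

    peak = max(scores)

    # Assumed dual filter: a region survives only if it passes BOTH an
    # absolute threshold and a threshold relative to the strongest response.
    kept = [i for i, s in enumerate(scores)
            if s >= abs_threshold and s >= rel_ratio * peak]

    # Assumed deficiency test: fewer than k regions survived the filters.
    # Top-K completion then re-admits the highest-scoring rejected regions
    # until k regions are kept (or no rejected regions remain).
    if len(kept) < k:
        kept_set = set(kept)
        rejected = sorted((i for i in range(len(scores)) if i not in kept_set),
                          key=lambda i: scores[i], reverse=True)
        kept.extend(rejected[:k - len(kept)])

    return sorted(kept)


# Enough strong regions: filtering alone decides, no completion is needed.
rich = select_regions([0.9, 0.8, 0.7, 0.6, 0.05, 0.1])

# One dominant region: filtering keeps only index 0, so completion
# re-admits the two best rejected regions.
sparse = select_regions([0.9, 0.1, 0.3, 0.05, 0.2])
```

In this sketch the completion step only runs when the filters are too aggressive, so it does not reintroduce low-response regions when enough high-response regions already survive.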
Keywords
cross-modal retrieval / image-text retrieval / feature alignment / threshold filtering / attention optimization
Classification
Information Technology and Security Science
Cite this article
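孟凡奇, 田凯迪, 田研. Research on Attention Region Optimization Method for Image-text Cross-modal Retrieval Based on Dual Filtering and Dynamic Completion [J]. Modern Information Technology, 2026, 10(1): 41-46, 6.
Funding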
Natural Science Foundation of Jilin Province (20230101242JC)