
Question-augmented Knowledge Retrieval Network for Knowledge-based Visual Question Answering

Zhao Yongchao (赵永超), Yang Zhenguo (杨振国)

Computer and Modernization (计算机与现代化), Issue (2): 32-38, 7. DOI: 10.3969/j.issn.1006-2475.2026.02.004

Question-augmented Knowledge Retrieval for Knowledge-based Visual Question Answering

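Zhao Yongchao 1, Yang Zhenguo 1

Author information

  • 1. School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong 510006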

Abstract

Knowledge-Based Visual Question Answering (KB-VQA) requires answering questions using external knowledge in addition to the content of images. Many recent works transform all inputs into the textual space and retrieve knowledge with a text-space retriever. This paradigm has two major limitations for KB-VQA: 1) the query obtained via image-to-text transformation can be inaccurate and redundant because it is built without the question; 2) relevance between queries and supporting knowledge is computed from their semantic similarity, which can be insufficient for question answering. To this end, this paper proposes a Question-augmented Knowledge Retrieval Network (QKRN) for knowledge-based visual question answering, which consists of a Question-augmented Query Construction (QQC) module and a Reverse Inference-based Re-ranking Retriever (RIR) module. More specifically, the QQC module uses a cross-attention mechanism to localize question-related visual regions and construct question-augmented queries. The RIR module then re-ranks the retrieved knowledge by computing the likelihood of generating the question conditioned on the knowledge. Extensive experiments conducted on the OK-VQA and FVQA datasets verify the superior performance of the proposed QKRN.
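
The two modules named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. Every function name, the toy feature vectors, and the region captions are hypothetical. The add-one smoothed unigram model is only a stand-in for a trained generator that would score the question given a knowledge passage.

```python
import math
from collections import Counter


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_regions(question_vec, region_vecs, k=2):
    """Cross-attention with the question as query and regions as keys.

    Returns the indices of the k regions with the highest attention
    weight, together with all attention weights.
    """
    scale = math.sqrt(len(question_vec))
    scores = [sum(q * r for q, r in zip(question_vec, vec)) / scale
              for vec in region_vecs]
    weights = softmax(scores)
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return ranked[:k], weights


def build_query(question, region_captions, selected):
    """Question-augmented query: question text plus selected region captions."""
    return question + " " + " ".join(region_captions[i] for i in selected)


def question_log_likelihood(question, passage):
    """Toy stand-in for log P(question | passage).

    Assumption: an add-one smoothed unigram model built from the passage
    replaces the neural generator a real system would use.
    """
    passage_tokens = passage.lower().split()
    question_tokens = question.lower().split()
    counts = Counter(passage_tokens)
    vocab = set(passage_tokens) | set(question_tokens)
    denom = len(passage_tokens) + len(vocab)
    return sum(math.log((counts[t] + 1) / denom) for t in question_tokens)


def rerank(question, passages):
    """Order passages by how likely they are to generate the question."""
    return sorted(passages,
                  key=lambda p: question_log_likelihood(question, p),
                  reverse=True)


# Hypothetical toy inputs, for illustration only.
question = "what sport uses this ball"
region_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
region_captions = ["a tennis ball", "a wooden bench", "a racket"]
passages = [
    "a bench is a long seat made of wood",
    "tennis is a sport that uses a ball and a racket",
]

selected, weights = select_regions([1.0, 0.0], region_vecs, k=2)
query = build_query(question, region_captions, selected)
ranked_passages = rerank(question, passages)
print(query)
print(ranked_passages[0])
```

In this sketch the first stage keeps only the regions that the question attends to most, so the query is built with the question rather than without it. The second stage scores each candidate by how well it explains the question, instead of relying on query-passage similarity alone.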

Key words

artificial intelligence/neural network models/deep learning/knowledge-based VQA

Classification

Information technology and security science

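Cite this article

Zhao Yongchao, Yang Zhenguo. Question-augmented Knowledge Retrieval Network for Knowledge-based Visual Question Answering[J]. Computer and Modernization, 2026, (2): 32-38, 7.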

Funding

General Program of the Natural Science Foundation of Guangdong Province (2024A1515010237)

Computer and Modernization, ISSN 1006-2475
