A Visual Commonsense Reasoning Method Based on Graph Attention Networks

Zhang Wenqi, Gao Yongchao, Qian Heng, Lü Hongli

Computer Applications and Software, 2025, Vol. 42, Issue 10: 191-197, 238, 8. DOI: 10.3969/j.issn.1000-386x.2025.10.026

A GRAPH ATTENTION NETWORK FOR VISUAL COMMONSENSE REASONING

Zhang Wenqi¹, Gao Yongchao², Qian Heng², Lü Hongli³

Author information

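  • 1. School of Computer Science and Technology, Qilu University of Technology, Jinan 250353, Shandong
  • 2. Shandong Computer Science Center (National Supercomputer Center in Jinan), Jinan 250014, Shandong
  • 3. School of Electrical and Computer Engineering, Lakehead University, Thunder Bay, Ontario P7B 5E1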

Abstract

Visual commonsense reasoning (VCR) is a challenging multimodal task proposed in recent years. To reason about the semantic relationships in images and improve performance on the VCR task, a graph attention network for visual commonsense reasoning is proposed. The method encodes the visual objects in each image as visual nodes and uses the graph attention network to model the features of each visual node and its adjacent nodes, thereby obtaining the internal associations between objects. In addition, the method effectively captures the dynamic interactions between visual objects, which further improves the understanding of image semantics. Experiments on the VCR dataset show that the method improves performance on all three VCR sub-tasks.
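
To illustrate the node-and-neighbour modelling the abstract describes, here is a minimal sketch of one single-head graph attention layer in the standard formulation (LeakyReLU-scored attention, normalised by a softmax over each node's neighbourhood). It is not the authors' implementation: the abstract gives no architectural details, so the function name `gat_layer`, the parameter names, the added self-loop, and the 0.2 LeakyReLU slope are all assumptions. In practice the projection matrix and attention vector would be learned rather than passed in by hand.

```python
import math


def leaky_relu(x, slope=0.2):
    # Slope 0.2 is a common default, assumed here for illustration.
    return x if x > 0 else slope * x


def gat_layer(features, adjacency, weight, attn):
    """Single-head graph attention layer (standard formulation, illustrative).

    features:  N x F list of node feature vectors (e.g. one per visual object)
    adjacency: N x N 0/1 matrix; adjacency[i][j] == 1 means j neighbours i
    weight:    F x D projection matrix (hypothetical; normally learned)
    attn:      length-2D attention vector (hypothetical; normally learned)
    """
    n = len(features)
    d = len(weight[0])
    # Project every node's features into the D-dimensional space.
    z = [[sum(h[k] * weight[k][c] for k in range(len(h))) for c in range(d)]
         for h in features]
    out = []
    for i in range(n):
        # Each node attends to itself and to its adjacent nodes.
        nbrs = [j for j in range(n) if j == i or adjacency[i][j]]
        # Unnormalised score: LeakyReLU(attn . [z_i || z_j]).
        scores = [leaky_relu(sum(a * v for a, v in zip(attn, z[i] + z[j])))
                  for j in nbrs]
        # Softmax over the neighbourhood, shifted for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # New node feature: attention-weighted sum of projected neighbours.
        out.append([sum(al * z[j][c] for al, j in zip(alphas, nbrs))
                    for c in range(d)])
    return out
```

With an identity projection and an all-zero attention vector, every neighbour receives equal weight, so each output row reduces to the plain average of the node and its neighbours. That gives a convenient sanity check before plugging in learned parameters.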

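Keywords (translated from the Chinese)

Visual commonsense reasoning / Multimodal / Graph attention network / Visual relationships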

Key words

Visual commonsense reasoning/Multimodal/Graph attention network/Cognitive reasoning

Classification

Computer Science and Automation

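Cite this article

Zhang Wenqi, Gao Yongchao, Qian Heng, Lü Hongli. A visual commonsense reasoning method based on graph attention networks [J]. Computer Applications and Software, 2025, 42(10): 191-197, 238, 8.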

Funding

National Key Research and Development Program of China (2021YFF0601603).

Computer Applications and Software

Open access (OA); Peking University Core Journal

ISSN 1000-386X
