Image description method based on object position relationship in scene

Journal of Guilin University of Electronic Technology, 2024, Vol. 44, Issue (6): 560-567, 8. DOI: 10.16725/j.1673-808X.202360

Yang Lu¹, Qian Yi¹, Wen Yimin¹

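Author information

1. School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi 541004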

Abstract

Image description aims to transform visual content into a language description and is an urgent and challenging multi-modal generation task. Because most image description methods pay little attention to implicit position information, they have difficulty accurately describing the positional relationships between objects in an image. To address this problem, a position relationship encoder-combine decoder (PRCO) structure is proposed, which focuses on and generates the positional relationships between objects. A novel position relationship encoder starts from the object relationship scene graph and uses its node features. Technically, a common sense dictionary and a reasoning module are created to calculate the degree of imbalance between objects, which is used to perform a secondary encoding of the object relationship nodes. Specifically, the combine decoder is designed to process the encoded information, with an erasing module and a bias gate to optimize the node features in the graph. Experiments are conducted on the MSCOCO and Visual Genome image description datasets, and the results are superior to those of state-of-the-art approaches. More remarkably, PRCO achieves improved CIDEr performance on the Visual Genome test set. Our code is publicly available on Gitee: https://gitee.com/ymw12345/PRCO.

Key words

image description/graph convolutional networks/long short-term memory/position relationship encoder/combine decoder

Classification

Information technology and security science

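Cite this article

Yang Lu, Qian Yi, Wen Yimin. Image description method based on object position relationship in scene [J]. Journal of Guilin University of Electronic Technology, 2024, 44(6): 560-567, 8.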

Funding

Guangxi Key Research and Development Program (Guike AB21220023)

National Natural Science Foundation of China (61866007)

Guangxi Key Laboratory of Image and Graphic Intelligent Processing Fund (GIIP2005)

Journal of Guilin University of Electronic Technology

ISSN: 1673-808X
