Multi-label Image Recognition Using Channel Pixel Attention

Ye Qingwen (叶庆文), Zhang Qiuju (张秋菊)

Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2024, Vol. 18, Issue 8: 2109-2117, 9. DOI: 10.3778/j.issn.1673-9418.2307087

Multi-label Image Recognition Using Channel Pixel Attention

Ye Qingwen 1, Zhang Qiuju 1

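Author information

  • 1. School of Mechanical Engineering, Jiangnan University, Wuxi, Jiangsu 214122 || Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Wuxi, Jiangsu 214122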
Abstract

Multi-label image recognition classifies images that carry labels for multiple object categories. To address small-object recognition and sample imbalance in multi-label image recognition, this paper proposes a simple and efficient channel pixel attention (CPA) module and a class-weighted cross-entropy loss, respectively. CPA computes channel attention and pixel attention scores to generate the corresponding pixel features for each channel, which increases the network's attention to small objects; the pooled and gained pixel features are then fed to a multi-layer perceptron for final classification. The distribution of positive sample counts in the dataset is introduced as the weight of the cross-entropy (CE) loss to strengthen attention to objects with few samples. Experiments are conducted on the public datasets VOC 2007 (PASCAL VOC Challenge 2007), MS-COCO 2014 (Microsoft Common Objects in Context) and VAW (visual attribute prediction in the wild). The results show that the proposed method improves the mean average precision (mAP) by 0.2, 0.7 and 0.9 percentage points, respectively, over existing advanced methods. On the MS-COCO 2014 and VAW datasets, the class-weighted cross-entropy loss improves mAP by 0.6 and 1.6 percentage points, respectively, over the commonly used cross-entropy loss without adding any computational cost, which verifies that the proposed method is both advanced and effective.
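The abstract does not give the exact form of the class-weighted cross-entropy loss, so the following is only a minimal sketch of the general idea: per-class weights derived from positive sample counts, applied to a sigmoid cross-entropy over labels. The inverse-frequency weighting, the normalization to a mean of 1, and the names `class_weights` and `weighted_multilabel_ce` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def class_weights(positive_counts):
    """Hypothetical weighting scheme (an assumption, not the paper's formula).

    Each class gets a weight proportional to the inverse of its positive
    sample count, rescaled so that the weights average to 1.
    """
    inverse = [1.0 / max(count, 1) for count in positive_counts]
    scale = len(inverse) / sum(inverse)
    return [w * scale for w in inverse]


def weighted_multilabel_ce(logits, targets, weights):
    """Class-weighted sigmoid cross-entropy for one image, averaged over classes.

    Uses the numerically stable form
        max(z, 0) - z * y + log(1 + exp(-|z|))
    which equals -[y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))].
    """
    total = 0.0
    for z, y, w in zip(logits, targets, weights):
        per_class = max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        total += w * per_class
    return total / len(logits)


if __name__ == "__main__":
    # Made-up positive counts for three classes: frequent, medium, rare.
    counts = [9000, 900, 100]
    weights = class_weights(counts)
    logits = [2.0, -1.0, -1.0]   # raw network outputs, one per class
    targets = [1, 0, 1]          # multi-hot ground-truth labels
    print("weights:", weights)
    print("loss:", weighted_multilabel_ce(logits, targets, weights))
```

Because the weights are fixed per class and computed once from dataset statistics, a scheme of this kind adds only one multiplication per class at training time, which is consistent with the abstract's remark that the loss adds no computational cost.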

Key words

deep learning/image recognition/attention mechanism/multi-label classification/loss function

Classification

Information Technology and Security Science

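Cite this article

Ye Qingwen, Zhang Qiuju. Multi-label Image Recognition Using Channel Pixel Attention [J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(8): 2109-2117, 9.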

Journal of Frontiers of Computer Science and Technology

OA; Peking University Core Journal (北大核心); CSTPCD

ISSN 1673-9418
