Computer Engineering & Science (计算机工程与科学), 2025, Vol. 47, Issue 5: 894-901 (8 pages). DOI: 10.3969/j.issn.1007-130X.2025.05.013
Cross-modal image emotion perception captioning based on a generative adversarial network
Abstract
Image captioning is a cross-modal task that aims to generate text consistent with image content from visual information. Although image captioning has made progress, there is still room for improvement in capturing fine-grained affective semantic features and in the emotional subtlety of the generated descriptions. To address this problem, a model based on a generative adversarial network is proposed to generate aspect-level emotional language descriptions. An encoder-decoder structure integrating a dual-modal attention mechanism serves as the generator, and a convolutional neural network serves as the discriminator; together they improve the model's accuracy in cross-modal emotion matching and the reliability of the generated emotional sentences. Transfer learning and the RMSProp optimization algorithm are introduced to improve the interpretability of the model. Finally, experiments on the MSCOCO and SentiCap datasets show that the model exhibits excellent convergence performance and attains a high accuracy rate.
Key words
generative adversarial network/image emotion captioning/RMSProp optimization algorithm
Classification
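Information Technology and Security Science
Cite this article
YANG Chunmiao, WANG Yang, HAN Liying, SUN Hebin. Cross-modal image emotion perception captioning based on a generative adversarial network[J]. Computer Engineering & Science, 2025, 47(5): 894-901, 8.
Funding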
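National Natural Science Foundation of China (62241103)
Key Project Funded by the Education Department of Hebei Province (ZD2020304)
Hebei Province Funding Project for Introduced Overseas-Educated Personnel (C20220316)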