网络安全与数据治理 (Cyber Security and Data Governance), 2024, Vol. 43, Issue 2: 49-53, 5. DOI: 10.19358/j.issn.2097-1788.2024.02.008
基于深度注意力的融合全局和语义特征的图像描述模型
Deep attention-based image caption model with fusion of global and semantic feature
Abstract
Existing image caption generation models are limited in their use of global features by a fixed receptive field size, while object region-based image features lack background information. To address both problems, an image caption model (DFGS) is proposed. A multi-feature fusion module is designed to fuse global and semantic features, allowing the model to focus on both key objects and background information in the image. A deep attention-based decoding module is designed to align visual and textual features, improving the quality of the generated image descriptions. Experimental results on the MSCOCO dataset show that the proposed model can produce more accurate captions and is competitive with other advanced models.
Keywords
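image caption / global feature / semantic feature / feature fusion
Classification
Information technology and security science
Cite this article
及昕浩 (Ji Xinhao), 彭玉青 (Peng Yuqing). Deep attention-based image caption model with fusion of global and semantic feature [J]. 网络安全与数据治理 (Cyber Security and Data Governance), 2024, 43(2): 49-53, 5.
Funding
Hebei Province Graduate Innovation Project (220056)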