Journal of Harbin University of Commerce (Natural Sciences Edition), 2025, Vol. 41, Issue 1: 3-9, 7.
局部注意力与Mogrifier-LSTM的图像描述生成方法
Image caption generation method based on local attention and Mogrifier-LSTM
Abstract
In complex public scenes, the intricate relationships between people and objects make it harder for the encoder to capture image semantics. A public-scene image description method, LAM-LSTM, based on a local attention mechanism and Mogrifier-LSTM, was proposed. Local attention was introduced to focus on regions across the scene, and the key information it captured was fused with text feature vectors and incorporated into a natural language description, improving the captions generated by Mogrifier-LSTM, a long short-term memory network. LAM-LSTM was validated experimentally on the public MSCOCO and Flickr30K datasets using evaluation metrics such as BLEU, METEOR, and CIDEr. The results showed that the method improved on the baseline model to varying degrees, supporting its validity.
Keywords
public scene image understanding / attention mechanism / text features / natural language description / image semantics
Classification
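Information Technology and Security Science
Cite this article
丁云霞, 时义舒, 胡鹏, 胡锐, 李德权. Image caption generation method based on local attention and Mogrifier-LSTM [J]. Journal of Harbin University of Commerce (Natural Sciences Edition), 2025, 41(1): 3-9, 7.
Funding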
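Anhui University of Science and Technology University-level Key Project (QNZD2021-02)
Huainan Science and Technology Plan Project (2020165, 2021005)
Natural Science Research Project of Anhui Universities (2022AH050801)
Anhui University of Science and Technology Introduced Talent Fund (13210679)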
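Note on the method

The abstract names Mogrifier-LSTM as the decoder but does not give its update rule. As a rough illustration only, the sketch below shows the mutual gating step generally associated with Mogrifier-style LSTMs: before the usual LSTM update, the input and the previous hidden state rescale each other for a few rounds. The function names, vector sizes, number of rounds, and random weights are all assumptions made for this example. It is not the authors' LAM-LSTM implementation and omits the local attention and feature fusion steps.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(m, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * a for w, a in zip(row, v)) for row in m]


def mogrify(x, h, Q, R):
    """Alternately rescale x and h (Mogrifier-style gating, illustrative).

    Q: list of matrices mapping h -> gate for x (len(x) rows, len(h) columns).
    R: list of matrices mapping x -> gate for h (len(h) rows, len(x) columns).
    Each gate is 2 * sigmoid(.), so it lies in (0, 2) and equals 1 when the
    pre-activation is zero.
    """
    for q, r in zip(Q, R):
        x = [2.0 * sigmoid(g) * xi for g, xi in zip(matvec(q, h), x)]  # h gates x
        h = [2.0 * sigmoid(g) * hi for g, hi in zip(matvec(r, x), h)]  # x gates h
    return x, h


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: an 8-dim input (e.g. a fused feature vector) and a 16-dim state.
rng = random.Random(0)
dim_x, dim_h, rounds = 8, 16, 2
x = [rng.gauss(0.0, 1.0) for _ in range(dim_x)]
h = [rng.gauss(0.0, 1.0) for _ in range(dim_h)]
Q = [random_matrix(dim_x, dim_h, rng) for _ in range(rounds)]
R = [random_matrix(dim_h, dim_x, rng) for _ in range(rounds)]

x_mod, h_mod = mogrify(x, h, Q, R)
```

The modulated `x_mod` and `h_mod` would then be passed to an ordinary LSTM cell in place of the raw input and previous hidden state. With all-zero gate matrices every gate equals 1, so the step reduces to a standard LSTM input.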