Big Data Research (大数据), 2025, Vol. 11, Issue 6: 108-122 (15 pages). DOI: 10.11959/j.issn.2096-0271.2025076
Visual storytelling based on image classification planning learning
Abstract
Most existing visual storytelling approaches process image information directly. Although methods such as topic analysis and knowledge graphs have been introduced, the handling of image information still suffers from a single perspective, a weakened story generation process, and a lack of structured design. To address these problems, this paper proposes a visual story generation model based on image classification planning learning. The model classifies images into seven types (people, animals, food, natural landscapes, architecture, indoor scenes, and others), sets corresponding questions for each type, and uses a pre-trained visual question answering language model to generate answers, which form the planning design that guides visual story generation. The model has four stages: the first extracts visual information from the images; the second uses the pre-trained language model to classify the images and guide the generation of planning information; the third updates the vocabulary information of the dataset; and the fourth integrates the visual and planning information produced in the previous stages to complete the visual storytelling task. Compared with the existing COVS, the proposed model improves BLEU-1, BLEU-2, CIDEr, Distinct-3, Distinct-4, and TTR by 2.07%, 4.29%, 0.44%, 1.78%, 0.91%, and 1.07%, respectively.
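The diversity metrics reported in the abstract, Distinct-n and TTR (type-token ratio), are commonly defined as the ratio of unique n-grams (or tokens) to the total number of n-grams (or tokens) in the generated text. The following is a minimal sketch of that common definition, not the paper's evaluation code: it assumes lowercase whitespace tokenization and pools counts over all stories, and the function names `ngrams`, `distinct_n`, and `type_token_ratio` are illustrative. The abstract does not state which variant the authors used.

```python
from typing import List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(stories: Sequence[str], n: int) -> float:
    """Unique n-grams divided by total n-grams, pooled over all stories."""
    pooled: List[Tuple[str, ...]] = []
    for story in stories:
        # Assumption: lowercase whitespace tokenization.
        pooled.extend(ngrams(story.lower().split(), n))
    # Return 0.0 when no n-gram exists (e.g. every story is shorter than n).
    return len(set(pooled)) / len(pooled) if pooled else 0.0


def type_token_ratio(stories: Sequence[str]) -> float:
    """Unique tokens divided by total tokens (corpus-level Distinct-1)."""
    return distinct_n(stories, 1)
```

Higher values indicate less repetition in the generated stories. Pooling over the whole corpus, as done here, penalizes phrases that are repeated across stories as well as within one story; a per-story average is another common variant and would give different numbers.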
Key words
visual storytelling / image classification / planning learning / visual question answering
Classification
Computer science and automation
Cite this article
王元龙 (Wang Yuanlong), 张宁倩 (Zhang Ningqian), 张虎 (Zhang Hu). Visual storytelling based on image classification planning learning [J]. Big Data Research (大数据), 2025, 11(6): 108-122.
Funding
The National Natural Science Foundation of China (No. 62176145, No. 62476161)