Computer Engineering and Applications (计算机工程与应用), 2024, Vol. 60, Issue 24: 44-64 (21 pages). DOI: 10.3778/j.issn.1002-8331.2405-0048
Comparative Review of Text-to-Image Generation Techniques Based on Diffusion Models
Abstract
With the continuous development of deep learning, artificial intelligence generated content has become a hot topic. Diffusion models in particular, an emerging class of generative models, have made significant progress in text-to-image generation. This article comprehensively describes the application of diffusion models to text-to-image generation tasks and compares them with generative adversarial networks and autoregressive models, revealing the advantages and limitations of diffusion models. It also examines the specific methods diffusion models use to improve image quality, optimize model efficiency, and generate images from multilingual text prompts. Experimental analyses on the CUB, COCO, and T2I-CompBench datasets not only validate the zero-shot generation capability of diffusion models but also highlight their ability to generate high-quality images from complex text prompts. The paper then introduces promising applications of diffusion models in fields such as text-guided image editing, 3D generation, video generation, and medical image generation. Finally, it summarizes the challenges diffusion models face in text-to-image generation and their future development trends, aiming to facilitate further research in this domain.
Keywords
text-to-image generation / diffusion models / generative adversarial networks / autoregressive models
Classification
Information Technology and Security Science
Cite this article
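GAO Xinyu, DU Fang, SONG Lijuan (高欣宇, 杜方, 宋丽娟). Comparative Review of Text-to-Image Generation Techniques Based on Diffusion Models [J]. Computer Engineering and Applications, 2024, 60(24): 44-64.
Funding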
National Natural Science Foundation of China (62062058)
Ningxia Key Research and Development Program (2023BEG02009)