
Fashion Clothing Pattern Generation Based on Improved Stable Diffusion


计算机与现代化 (Computer and Modernization), 2024(12): 15-23, 115. DOI: 10.3969/j.issn.1006-2475.2024.12.003


赵晨阳 1, 薛涛 1, 刘俊华 1

Author Information

  • 1. School of Computer Science, Xi'an Polytechnic University, Xi'an 710048, Shaanxi, China


Abstract

Dress patterns are a window through which people display their personality and sense of fashion. In recent years, with the continuous development of multimodal technology, text-based dress pattern generation has been studied extensively. However, existing methods have not been widely applied because of poor text-image semantic alignment and low resolution. Since the large-scale language-image pre-training model CLIP was proposed, pre-trained diffusion models combined with CLIP for text-to-image generation have become the mainstream approach in this field. However, the original pre-trained models generalize poorly to downstream tasks: relying solely on a pre-trained model does not allow flexible and accurate control over the color and structure of dress patterns, and the large number of parameters makes re-training from scratch impractical. To address these problems, this study designs FT-SDM-L (Fine Tuning-Stable Diffusion Model-Lion), an improved Stable Diffusion network that uses a dress image-text dataset to update the weights of the cross-attention modules in the original model. Experimental results show that the ClipScore and HPS v2 scores of the fine-tuned model improve by 0.08 and 1.22 on average, validating the importance of this module in incorporating textual information. Subsequently, to further enhance the model's feature extraction and data mapping capabilities in the apparel domain, a lightweight adapter, Stable-Adapter, is added at the module's output to better perceive changes in the input prompts. With only 0.75% extra parameters, the adapter further improves the model's ClipScore and HPS v2 scores by 0.05 and 0.38. Good results are achieved in terms of fidelity and semantic consistency of clothing pattern generation.
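The abstract describes updating only the cross-attention weights of the pre-trained model and attaching a lightweight adapter at the cross-attention output, at a cost of roughly 0.75% extra parameters. The sketch below illustrates why a residual bottleneck adapter keeps that overhead small; all dimensions and the adapter layout are illustrative assumptions, not the paper's actual Stable-Adapter design:

```python
# Hedged sketch: parameter accounting for a hypothetical bottleneck adapter
# placed after a cross-attention block. Sizes are illustrative assumptions.

def attention_params(d_model: int, d_context: int) -> int:
    """Parameters of one cross-attention block: W_q, W_o (d_model x d_model)
    and W_k, W_v (d_context x d_model), no biases."""
    return d_model * d_model + 2 * (d_context * d_model) + d_model * d_model

def adapter_params(d_model: int, r: int) -> int:
    """Bottleneck adapter: down-projection (d_model -> r) and
    up-projection (r -> d_model), each with a bias."""
    return d_model * r + r + r * d_model + d_model

# Illustrative sizes (not from the paper): U-Net hidden width 768,
# CLIP text-embedding width 1024, adapter bottleneck 8.
d_model, d_context, r = 768, 1024, 8
base = attention_params(d_model, d_context)
extra = adapter_params(d_model, r)
print(f"adapter adds {extra} params, {extra / base:.2%} of the attention block")
```

Because the bottleneck width r is far smaller than d_model, the adapter's cost grows only linearly in d_model, which is why such modules can be trained on a downstream dataset while the bulk of the pre-trained network stays frozen.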


Key words

text image generation/diffusion model/cross-attention mechanism/image generation/computer vision

Classification

Information Technology and Security Science

Cite This Article

赵晨阳, 薛涛, 刘俊华. 基于改进Stable Diffusion的时尚服饰图案生成[J]. 计算机与现代化, 2024(12): 15-23, 115.

Funding

National Natural Science Foundation of China Youth Science Fund (62202366)

Shaanxi Province Technology Innovation Guidance Special Plan (2020CGXNG-012)

计算机与现代化 (Computer and Modernization)

OACSTPCD

ISSN 1006-2475
