Research on Face Image Generation Method Based on CLIP Model and Text Reconstruction

Li Yuanfan (李源凡), Zhang Lihong (张丽红)

Journal of Test and Measurement Technology (测试技术学报), 2024, Vol. 38, Issue (2): 154-160, 7. DOI: 10.3969/j.issn.1671-7449.2024.02.008

Research on Face Image Generation Method Based on CLIP Model and Text Reconstruction

Li Yuanfan¹, Zhang Lihong¹

Author information

1. College of Physics and Electronic Engineering, Shanxi University, Taiyuan, Shanxi 030006
Abstract

To address the inconsistency between generated images and their text descriptions, and the low image resolution, in existing text-to-face generation methods, this paper proposes a cross-modal network framework for generating face images from text. First, the pre-trained CLIP model extracts features from the text, and a conditional enhancement module strengthens the text semantic features to produce latent vectors. Next, a mapping network projects each latent vector into the latent space of the pre-trained StyleGAN model to obtain a disentangled latent vector, which is fed to the StyleGAN generator to produce a high-resolution face image. Finally, a text reconstruction module regenerates text from the face image, and the semantic alignment loss between the reconstructed text and the input text is computed and used as semantic supervision to guide network training. Training and testing are carried out on two datasets, Multi-Modal CelebA-HQ and CelebAText-HQ. The experimental results show that, compared with other methods, the proposed method generates high-resolution face images that are more consistent with the text description.
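
The abstract does not state the form of the semantic alignment loss. As a minimal sketch only, assuming both the input text and the reconstructed text have already been encoded as fixed-length embedding vectors, one common choice is the cosine distance between the two embeddings. The function names `cosine_similarity` and `semantic_alignment_loss` are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_u * norm_v)


def semantic_alignment_loss(input_text_emb, reconstructed_text_emb):
    """Assumed loss form (not confirmed by the paper): 1 - cosine similarity.

    The value is near 0 when the reconstructed text embedding points in the
    same direction as the input text embedding, and grows as they diverge.
    """
    return 1.0 - cosine_similarity(input_text_emb, reconstructed_text_emb)
```

Under this assumption, a reconstructed description whose embedding matches the input description yields a loss close to zero, so minimizing it pushes the generated face toward the semantics of the input text.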

Key words

text-to-face generation / cross-modality / CLIP pre-training / text reconstruction / text mapping

Classification

Information Technology and Security Science

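Cite this article

Li Yuanfan, Zhang Lihong. Research on Face Image Generation Method Based on CLIP Model and Text Reconstruction [J]. Journal of Test and Measurement Technology, 2024, 38(2): 154-160, 7.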

Funding

Teaching Reform and Innovation Project of Higher Education Institutions of Shanxi Province (J2021086)

Shanxi Province Graduate Innovation Project (2021Y154)

Journal of Test and Measurement Technology (测试技术学报)

OA, CSTPCD

ISSN: 1671-7449
