Semantic Spatial Awareness and Attention-based Text-to-Image Generation Method

Ouyang Anjie (欧阳安杰), Sun Dameng (孙大盟), He Liming (何立明)

Computer Technology and Development, 2025, Vol. 35, Issue 3: 109-116 (8 pages). DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0355

Author information

  • 1. School of Information Engineering, Chang'an University, Xi'an, Shaanxi 710018

Abstract

Text-to-image generation often suffers from two problems: generated images that do not match the text description, and poor image quality. To improve the match between text and generated images and to produce higher-quality images, a novel generative adversarial network model (WSA-GAN) is proposed. Word-level text embeddings are fused with the hidden image features through a cross-attention method and a confidence feature fusion method. In addition, the semantic spatial-aware convolution module (SSACN) is introduced and improved: depthwise separable convolution replaces ordinary convolution to reduce the number of model parameters and thereby the complexity of the model. Self-attention and convolution mixing (ACMix) is used to capture the relationships between pixels in the image features and to model long-range dependencies between features while keeping model complexity under control, so that the model captures a wider range of context information, improving both the alignment between the text and the generated image and the image quality. Experiments on the CUB-200-2011 dataset show that, compared with mainstream models, both generation quality and alignment with the text improve to some extent.
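The parameter saving from swapping ordinary convolution for depthwise separable convolution can be illustrated with a small calculation. This is only a sketch of the general technique, not the paper's SSACN configuration: the channel counts and kernel size below are hypothetical, and the helper names are introduced here for illustration.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Weight count of an ordinary k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration (not taken from the paper).
c_in, c_out, k = 256, 256, 3
standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
print(standard, separable, round(standard / separable, 2))  # 589824 67840 8.69
```

For a 3 x 3 kernel the reduction approaches a factor of 9 as the number of output channels grows, since the pointwise term dominates the separable count.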

Key words

generative adversarial networks / multimodal fusion / attention mechanism / text-to-image generation / deep learning

Classification

Computer science and automation

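Cite this article

Ouyang Anjie, Sun Dameng, He Liming. Semantic Spatial Awareness and Attention-based Text-to-Image Generation Method [J]. Computer Technology and Development, 2025, 35(3): 109-116.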

Funding

Key Research and Development Program of Shaanxi Province (2022GY-030, 2022GY-039)

Computer Technology and Development

ISSN 1673-629X
