Super-resolution Reconstruction Network for Text Images of Real Scenes (真实场景文本图像的超分辨率重建网络)

卫鸿婧 柴锐 赵鹏程 张美美 杨思燕 秦品乐

计算机技术与发展 (Computer Technology and Development), 2025, Vol. 35, Issue 11: 61-68. DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0318

卫鸿婧¹, 柴锐², 赵鹏程², 张美美², 杨思燕³, 秦品乐²

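Author information

  • 1. China Mobile Communications Group Shanxi Co., Ltd., Taiyuan, Shanxi 030000
  • 2. School of Computer Science and Technology, North University of China, Taiyuan, Shanxi 030051
  • 3. Informatization Office, Shaanxi Open University, Xi'an, Shaanxi 710119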

Abstract

Super-resolution (SR) reconstruction of text images in real scenes faces three key problems: text varies widely in spatial distribution and texture detail, reconstruction easily produces unnatural artifacts, and feature extraction is inefficient. To address them, we propose a text-enhanced super-resolution generative adversarial network (Text-ESRGAN). First, to extract multi-scale image information fully, we embed Dense Atrous Spatial Pyramid Pooling (DenseASPP) into the generator, which expands the network's receptive field without losing spatial resolution and significantly improves the discriminability of the features. Second, we combine dense residual blocks with a channel attention mechanism to design a feature extraction module, nRRCADB, and inject Gaussian noise into the generator. The module extracts key image features accurately, reduces the network's attention to irrelevant background features, and enriches local texture detail, helping the model generate high-resolution text images that are more consistent with real scenes. Finally, to suppress the unnatural artifacts caused by spatial deformation in text image SR reconstruction, we introduce a Spatial Transformer Network (STN) module based on thin-plate spline interpolation, which effectively improves the recognizability of the generated text. Comparative and ablation experiments on the TextZoom dataset show that the proposed model improves both the detail fidelity of text images and the recognizability of the text. The generated images reach an average Peak Signal-to-Noise Ratio (PSNR) of 22.46 dB and a Structural Similarity Index Measure (SSIM) of 0.76, outperforming existing mainstream methods.
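
For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of how PSNR is commonly computed, PSNR = 10·log10(MAX² / MSE), for a pair of grayscale images. It is not the authors' evaluation code: the function name `psnr`, the 8-bit peak value of 255, and the toy 3×3 images are assumptions made for illustration, and the abstract does not state which color channel or cropping the paper's evaluation uses.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Return the PSNR in dB between two equally sized grayscale images.

    Each image is a list of rows of pixel intensities. `max_value` is the
    peak intensity (255 is assumed here, i.e. 8-bit images).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("images must have the same height")

    squared_error = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        if len(ref_row) != len(rec_row):
            raise ValueError("images must have the same width")
        for ref_pixel, rec_pixel in zip(ref_row, rec_row):
            squared_error += (ref_pixel - rec_pixel) ** 2
            count += 1

    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical toy data: a high-resolution reference and an SR output.
hr = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
sr = [[54, 55, 60], [59, 75, 61], [76, 60, 59]]
print(f"PSNR: {psnr(hr, sr):.2f} dB")
```

A dataset-level figure such as the one reported is typically the mean of this per-image value over the test set. Higher is better, and a gain of a few tenths of a dB is usually considered meaningful in SR benchmarks.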

Key words

generative adversarial network/deep learning/multi-scale features/receptive field/text image super-resolution

Classification

Computer science and automation

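Cite this article

卫鸿婧, 柴锐, 赵鹏程, 张美美, 杨思燕, 秦品乐. Super-resolution Reconstruction Network for Text Images of Real Scenes [J]. 计算机技术与发展 (Computer Technology and Development), 2025, 35(11): 61-68.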

Funding

Shanxi Province Science and Technology Major Special Program (202101010101018)

Shanxi Province Basic Research Program (20210302123033)

计算机技术与发展 (Computer Technology and Development), ISSN 1673-629X