Journal of Fujian Normal University (Natural Science Edition), 2025, Vol. 41, Issue 1: 21-29 (9 pages). DOI: 10.12046/j.issn.1000-5277.2023110029
Scene Text Image Super-Resolution Network Based on Multi-head Attention
Abstract
Scene text image super-resolution (STISR) aims to enhance the resolution and readability of low-resolution text images, and serves as a foundational step for downstream text recognition tasks. Existing STISR methods based on deep convolutional neural networks often fail to consider the global information of text images, which leads to unstable restoration results, especially for visually similar low-resolution text images. To address this problem, a novel STISR network is proposed that comprises a text prior (TP) module and a hybrid backbone network. The TP module generates text prior information by extracting semantic features from text images. The hybrid backbone network, which consists of convolutional modules and multi-head attention fusion modules, fuses the text prior information with multi-scale image features. Experimental results on the TextZoom dataset show that the proposed network, MASRN, restores higher-quality text images and effectively improves recognition accuracy in downstream text recognition tasks.
Key words
scene text / image super-resolution / text recognition / text prior / convolutional neural networks / multi-head attention
Classification
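Computer and Automation
Cite this article
Zhu Jianan (朱佳楠), Xing Shuli (邢树礼). Scene Text Image Super-Resolution Network Based on Multi-head Attention [J]. Journal of Fujian Normal University (Natural Science Edition), 2025, 41(1): 21-29.
Funding
National Key Research and Development Program of China (2019YFD0900905)
National Natural Science Foundation of China (61773415)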
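
Illustrative sketch
The abstract says that multi-head attention fusion modules combine text prior information with image features, but it does not say how. The sketch below shows one plausible reading, in which image tokens act as queries and text prior tokens act as keys and values (cross-attention), followed by a residual connection. It is not the authors' MASRN implementation. Every name, dimension, and weight in it (`multi_head_cross_attention`, `random_matrix`, `d_model`, `d_prior`, the random projection matrices) is a hypothetical choice made for illustration, and it uses only the Python standard library.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or extracted features."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def multi_head_cross_attention(image_tokens, prior_tokens, w_q, w_k, w_v, w_o, num_heads):
    """Assumed fusion: image tokens (queries) attend to text prior tokens (keys/values)."""
    d_model = len(w_q[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    q = matmul(image_tokens, w_q)  # (n_img, d_model)
    k = matmul(prior_tokens, w_k)  # (n_txt, d_model)
    v = matmul(prior_tokens, w_v)  # (n_txt, d_model)

    head_outputs = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        q_h = [row[lo:hi] for row in q]
        k_h = [row[lo:hi] for row in k]
        v_h = [row[lo:hi] for row in v]
        k_h_t = [list(col) for col in zip(*k_h)]  # transpose to (d_head, n_txt)
        scores = matmul(q_h, k_h_t)               # (n_img, n_txt)
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, v_h))  # (n_img, d_head)

    # Concatenate the heads per image token, then apply the output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(image_tokens))]
    attended = matmul(concat, w_o)

    # Residual connection: add the attended text prior back onto the image features.
    return [[x + y for x, y in zip(img, att)] for img, att in zip(image_tokens, attended)]


rng = random.Random(0)
d_model, d_prior, num_heads = 8, 6, 2
image_tokens = random_matrix(4, d_model, rng)  # 4 image feature tokens
prior_tokens = random_matrix(3, d_prior, rng)  # 3 text prior tokens
w_q = random_matrix(d_model, d_model, rng)
w_k = random_matrix(d_prior, d_model, rng)
w_v = random_matrix(d_prior, d_model, rng)
w_o = random_matrix(d_model, d_model, rng)

fused = multi_head_cross_attention(image_tokens, prior_tokens, w_q, w_k, w_v, w_o, num_heads)
print(len(fused), len(fused[0]))  # 4 8
```

The fused output keeps the shape of the image features (4 tokens, 8 channels), so under this assumption such a module could sit between convolutional stages without changing the feature map size.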