Complex Text Detection Based on Polygon Feature Pooling and Fusion

Zhang Xiangnan, Gao Xinbo, Tian Chunna

Journal of Xidian University (Natural Science Edition), 2024, Vol. 51, Issue 3: 113-123 (11 pages).
DOI: 10.19665/j.issn1001-2400.20230801

Complex text region detection based on polygon feature pooling and the transformer

Zhang Xiangnan (1), Gao Xinbo (2), Tian Chunna (1)

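Author information

  • 1. School of Electronic Engineering, Xidian University, Xi'an 710071, Shaanxi, China
  • 2. Chongqing Key Laboratory of Image Cognition, School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China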

Abstract

Text detection plays an important role in image understanding, and deep-learning-based algorithms, which include single-stage and two-stage methods, are the prevailing approach. Two-stage text detection methods usually achieve higher accuracy than single-stage methods. A two-stage method typically applies feature pooling to each region of interest (RoI), which provides fixed-dimension local region features for subsequent detection and recognition tasks. However, for complex text regions such as curved text, existing pooling methods based on rectangular RoIs are no longer applicable. Replacing region features with point features to work around this loses spatial information. To address this issue, we propose a complex text region detection method based on polygon feature pooling and the Transformer. First, we extend the RoI pooling shape from a rectangle to a polygon. Fixed-dimension features are pooled directly from the polygon RoI without any shape fitting, which avoids the errors introduced by the fitting process. Furthermore, the pooled polygon region features are treated as context-sensitive sequences and fed into the Transformer, which fuses the context of the visual features to reduce training difficulty and improve detection accuracy. Experiments on complex text region datasets, including ICDAR2015, MLT, Total-Text, and CTW1500, show that the proposed two-stage detection algorithm extracts RoI features effectively and achieves better detection results than state-of-the-art methods.
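
The abstract does not spell out the pooling procedure, so the following is only a rough illustration of the general idea and should not be read as the authors' algorithm. It assumes a single-channel feature map stored as a list of rows, and it pools a polygon RoI into a fixed `out_h` x `out_w` grid by averaging only the cells whose centres fall inside the polygon. The function names, the ray-casting inside test, and the bounding-box binning are all assumptions made for this sketch.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_pool(feature_map, polygon, out_h=2, out_w=4):
    """Average the cells inside `polygon` into an out_h x out_w grid.

    Hypothetical helper: bins are laid over the polygon's bounding box,
    and cells outside the polygon are ignored (no shape fitting).
    """
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    x0, y0 = min(xs), min(ys)
    bin_w = (max(xs) - x0) / out_w
    bin_h = (max(ys) - y0) / out_h
    if bin_w == 0 or bin_h == 0:
        raise ValueError("polygon has zero width or height")
    sums = [[0.0] * out_w for _ in range(out_h)]
    counts = [[0] * out_w for _ in range(out_h)]
    for row, values in enumerate(feature_map):
        for col, value in enumerate(values):
            cx, cy = col + 0.5, row + 0.5  # centre of the feature cell
            if not point_in_polygon(cx, cy, polygon):
                continue
            j = min(int((cx - x0) / bin_w), out_w - 1)
            i = min(int((cy - y0) / bin_h), out_h - 1)
            sums[i][j] += value
            counts[i][j] += 1
    # Bins that contain no polygon cells are filled with 0.0.
    return [[s / c if c else 0.0 for s, c in zip(srow, crow)]
            for srow, crow in zip(sums, counts)]


# Toy 4 x 8 feature map whose value equals the column index.
feature_map = [[float(col) for col in range(8)] for _ in range(4)]
triangle = [(0, 0), (8, 0), (0, 4)]
pooled = polygon_pool(feature_map, triangle)
# Flatten the grid into a fixed-length sequence of pooled features.
sequence = [value for row in pooled for value in row]
```

Whatever the polygon's shape, the output has the same `out_h * out_w` length, which is the property that lets pooled RoI features be treated as a sequence. In the setting the abstract describes, such a sequence would then be passed to a Transformer for context fusion; that step is not shown here.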

Key words

text region detection/two-stage methods/polygon/feature pooling/Transformer

Classification

Information Technology and Security Science

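Cite this article

Zhang Xiangnan, Gao Xinbo, Tian Chunna. Complex text region detection based on polygon feature pooling and the transformer [J]. Journal of Xidian University (Natural Science Edition), 2024, 51(3): 113-123.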

Funding

National Natural Science Foundation of China (62173265, 62036007)

Journal of Xidian University (Natural Science Edition)

Indexing: OA, Peking University Core Journals, CSTPCD

ISSN 1001-2400
