Review of Differentiable Binarization Techniques for Text Detection in Natural Scenes

OA; 北大核心 (Peking University Core Journals); CSTPCD

The rich text found in natural scenes is important for understanding the real world, but the diversity and complexity of scene text make detection difficult. Deep learning has brought breakthroughs to natural scene text detection, and the differentiable binarization network DBNet in particular has advanced research on real-time text detection. Many researchers have since built innovative and practical work on differentiable binarization and achieved fruitful results. This paper analyzes and summarizes recent research on text detection algorithms based on differentiable binarization. First, it briefly introduces the background, working principle, advantages, and disadvantages of the DBNet model, and classifies algorithms based on differentiable binarization into five categories according to their technical differences: feature extraction, feature fusion, post-processing, overall architecture, and training strategy. The improvements in each category are illustrated with detailed diagrams, the mechanisms of each type of method are elaborated, and all methods are analyzed and summarized. Second, it introduces the commonly used public datasets and performance evaluation metrics for text detection, summarizes the simulation experiment results of the different methods, and lists several application scenarios of practical significance. Finally, it considers future directions for natural scene text detection and outlines the challenges and open problems that remain.

Lian Zhe (连哲); Yin Yanjun (殷雁君); Zhi Min (智敏); Xu Qiaozhi (徐巧枝)

College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, China

Computer Science and Automation

text detection; deep learning; computer vision; differentiable binarization

Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2024, No. 9

pp. 2239-2260 (22 pages)

This work was supported by the Graduate Student Scientific Research Innovation Project of Inner Mongolia Normal University (TY20240031) and the Natural Science Foundation of Inner Mongolia Autonomous Region (2021LHMS06009, 2023MS06009, NJZZ21004).

10.3778/j.issn.1673-9418.2311105
