
Survey on Natural Scene Text Recognition Methods of Deep Learning

Open access (OA); indexed in the Peking University Core Journals list and CSTPCD


Natural scene text recognition holds significant value in both academic research and practical applications, and it has become one of the research hotspots in computer vision. However, the recognition process faces challenges such as diverse text styles and complex backgrounds, which lead to unsatisfactory efficiency and accuracy. Traditional text recognition methods based on hand-crafted features have limited representation capability and cannot effectively handle complex natural scene text recognition tasks. In recent years, deep learning methods have brought significant progress to natural scene text recognition, and this paper systematically reviews the related research. First, natural scene text recognition methods are divided into segmentation-based and segmentation-free methods, according to whether individual characters must be segmented. The segmentation-free methods are further subdivided by their technical implementation characteristics, and the working principles of the most representative methods in each category are described. Next, commonly used datasets and evaluation metrics are introduced, the performance of the various methods is compared on these datasets, and the advantages and limitations of each category are discussed from multiple perspectives. Finally, the shortcomings and difficulties of current deep-learning-based natural scene text recognition research are pointed out, and future development trends are discussed.

ZENG Fanzhi (曾凡智); FENG Wenjie (冯文婕); ZHOU Yan (周燕)

Department of Computer Science, Foshan University, Foshan, Guangdong 528000, China

Computer and Automation

text recognition; deep learning; natural scene

Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2024 (005)

pp. 1160-1181 (22 pages)

This work was supported by the National Natural Science Foundation of China (61972091), the Natural Science Foundation of Guangdong Province (2022A1515010101, 2021A1515012639), the Key Research Project of Universities of Guangdong Province (2019KZDXM007, 2020ZDZX3049), the Science and Technology Innovation Project of Foshan (2020001003285), and the Educational Science Planning Project of Guangdong Province (2021GXJK445).

10.3778/j.issn.1673-9418.2306024
