Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue 9: 2239-2260, 22. DOI: 10.3778/j.issn.1673-9418.2311105
Review of Differentiable Binarization Techniques for Text Detection in Natural Scenes
Abstract
The rich text found in natural scenes is important for understanding the real world, but its diversity and complexity make detection difficult. Deep learning has brought breakthroughs in natural scene text detection, and the differentiable binarization network DBNet has advanced research on real-time text detection. Many researchers have since built innovative and practical work on differentiable binarization and achieved fruitful results. This paper analyzes in depth the text detection algorithms based on differentiable binarization proposed in recent years. Firstly, the background, working principle, advantages, and disadvantages of the DBNet model are briefly introduced, and algorithms based on differentiable binarization are classified by their technical differences into five categories: feature extraction, feature fusion, post-processing, overall architecture, and training strategy. The improvement methods in each category are illustrated with detailed diagrams, the mechanism of each type of method is elaborated, and all methods are analyzed and summarized. Secondly, the commonly used public datasets and performance evaluation metrics for text detection are introduced, the experimental results of the different methods are summarized, and several application scenarios of practical significance are listed. Finally, future directions for text detection in natural scenes are considered, and the remaining challenges and open problems are summarized.
Key words
text detection / deep learning / computer vision / differentiable binarization
Classification
Information Technology and Security Science
Cite this article
LIAN Zhe, YIN Yanjun, ZHI Min, XU Qiaozhi. Review of Differentiable Binarization Techniques for Text Detection in Natural Scenes[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2239-2260, 22.
Funding
This work was supported by the Graduate Student Scientific Research Innovation Project of Inner Mongolia Normal University (TY20240031) and the Natural Science Foundation of Inner Mongolia Autonomous Region (2021LHMS06009, 2023MS06009, NJZZ21004).