计算机与现代化 (Computer and Modernization), Issue (5): 16-21, 6. DOI: 10.3969/j.issn.1006-2475.2024.05.004
多尺度特征融合的版面分析方法
Layout Analysis Method of Multi-scale Feature Fusion
Abstract
To address the misclassification between list and text elements, the difficulty of recognizing small-scale text in tables, and the poor preservation of spatial features in current document layout analysis, this paper proposes a bottom-up, multi-feature fusion layout analysis method based on the SegNet network. The MSCAN-SE attention module is introduced into SegNet to address the low recognition rate of small-scale elements in tables. The strip features in MSCAN-SE improve the model's ability to extract multi-scale features, so that the network retains feature information at more scales. To address the excessive similarity between the features of list elements and text elements, the dilated convolution and channel attention branch in MSCAN-SE expand the network's receptive field during feature extraction. The proposed method is compared experimentally with classical semantic segmentation networks. On the layout analysis test set, it achieves a pixel accuracy of 97.9% and a mean intersection over union (mIoU) of 91.7%. Compared with the U-Net, FCN, DeepLabV3+, and SegNet semantic segmentation models, the mIoU is higher by 7.6%, 2.4%, 2.6%, and 1.5%, respectively.
Keywords
document layout analysis/multi-scale attention/semantic segmentation/channel attention
Classification
Information Technology and Security Science
Cite this article
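乔佳 (Qiao Jia), 徐琨 (Xu Kun), 胡佩蓉 (Hu Peirong). 多尺度特征融合的版面分析方法 (Layout Analysis Method of Multi-scale Feature Fusion) [J]. 计算机与现代化 (Computer and Modernization), 2024, (5): 16-21, 6.
Funding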
National Natural Science Foundation of China (52172302)
National Key Research and Development Program of China (2019YFB1600103)
Key Research and Development Program of Shaanxi Province (2018ZDXM-GY-044)
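
Note on the reported metrics
The abstract reports pixel accuracy and mean intersection over union (mIoU). As a minimal sketch of how these two metrics are commonly computed from a confusion matrix, the following pure-Python example may help. The function names, the three-class toy data, and the choice to skip classes that are absent from both labels and predictions are assumptions made for illustration; the paper's own evaluation code is not shown in this record and may differ.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix: List[List[int]]) -> float:
    """Fraction of pixels whose predicted class equals the ground truth.

    Assumes the matrix contains at least one pixel.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    return correct / total


def mean_iou(matrix: List[List[int]]) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN).

    Assumption: classes that appear in neither labels nor predictions
    are skipped rather than counted as IoU 0.
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # true c, predicted other
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, true other
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example: 8 flattened pixels, 3 hypothetical classes
# (0 = text, 1 = list, 2 = table). Not data from the paper.
labels = [0, 0, 0, 1, 1, 2, 2, 2]
predictions = [0, 0, 1, 1, 1, 2, 2, 0]

cm = confusion_matrix(labels, predictions, num_classes=3)
print("pixel accuracy:", pixel_accuracy(cm))
print("mean IoU:", mean_iou(cm))
```

In this toy case a single text pixel predicted as list lowers the IoU of both classes, which is why mIoU tends to be more sensitive than pixel accuracy to confusion between similar classes such as list and text.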