Journal of South-Central Minzu University (Natural Science Edition), 2026, Vol. 45, Issue (4): 548-558, 11. DOI: 10.20056/j.cnki.ZNMDZK.20260708

Application research of Vision Transformer with enhanced FPN in document layout analysis tasks

Zhang Fa (张法)¹, Li Yanhong (李艳红)¹, Wu Longyu (吴龙雨)¹, Long Han (龙焓)¹

Author information

1. College of Computer Science, South-Central Minzu University, Wuhan 430074, Hubei, China

Abstract

Compared with traditional methods based on Convolutional Neural Networks (CNN), document layout analysis models based on Vision Transformer can provide robust semantic and visual representations for downstream tasks through multi-modal pre-training. However, current multi-scale feature generation modules and cross-resolution feature fusion processes tend to lose category attributes and boundary details, which leads to category confusion and blurred boundaries. To address this bottleneck, two techniques are proposed, Local Feature Enhancement Generation (LFEG) and Global-to-Local Feature Enhancement Fusion (GLEF), which together form an enhanced Feature Pyramid Network (FPN) structure. Specifically, the LFEG module optimizes the four resolution modification modules that generate the multi-scale features, while the GLEF module optimizes the traditional top-down fusion approach. Experimental results demonstrate that the proposed enhanced FPN structure effectively improves the category consistency and boundary clarity of multi-scale feature maps, providing key technical support for improving the accuracy of Vision Transformer-based document layout analysis.
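
For orientation, the baseline that the abstract describes as being optimized can be sketched as follows: four resolution modification branches turn one single-scale ViT feature map into a pyramid, and a traditional top-down pass fuses the levels. This is a minimal NumPy sketch of that generic baseline only, not of LFEG or GLEF. The scale factors (4x, 2x, 1x, 0.5x), the function names, and the omission of lateral and output convolutions are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def avg_pool2(x):
    """2x2 average pooling of a (C, H, W) map; H and W must be even."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def multi_scale_from_vit(feat):
    """Four resolution branches from one map (assumed factors: 4x, 2x, 1x, 0.5x)."""
    return [upsample_nearest(feat, 4), upsample_nearest(feat, 2), feat, avg_pool2(feat)]

def top_down_fuse(levels):
    """Traditional top-down fusion: each level adds the upsampled coarser output."""
    fused = [None] * len(levels)
    fused[-1] = levels[-1]  # coarsest level is passed through unchanged
    for i in range(len(levels) - 2, -1, -1):
        fused[i] = levels[i] + upsample_nearest(fused[i + 1], 2)
    return fused

# Hypothetical input: an 8-channel 14x14 ViT feature map.
feat = np.random.default_rng(0).standard_normal((8, 14, 14))
pyramid = top_down_fuse(multi_scale_from_vit(feat))
shapes = [p.shape for p in pyramid]
```

Because each finer level receives only an interpolated copy of the coarser one, fine boundary detail and class-specific cues can be diluted in this plain scheme, which is the weakness the abstract attributes to existing generation and fusion steps.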

Key words

Vision Transformer / Feature Pyramid Network (FPN) / feature fusion / document layout analysis

Classification

Information Technology and Security Science

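Cite this article

Zhang Fa, Li Yanhong, Wu Longyu, Long Han. Application research of Vision Transformer with enhanced FPN in document layout analysis tasks [J]. Journal of South-Central Minzu University (Natural Science Edition), 2026, 45(4): 548-558, 11.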

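Funding

Natural Science Foundation of Hubei Province (2017CFB135)

Fundamental Research Funds for the Central Universities (CZY23019)

Research Project on Practical Course Teaching for Network Innovation and Applied Talents (first batch of 2019)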

Journal of South-Central Minzu University (Natural Science Edition)

ISSN: 1672-4321
