Research on Transformer edge detection algorithm based on pyramid structure

Open access (OA); indexed in the Peking University Core Journals list (北大核心) and CSTPCD

To address the difficulty of extracting multi-scale features and the low utilization of those features in complex image edge detection, a pyramid-structured Transformer edge detection model is proposed. First, the model adopts the pyramid vision transformer (PVT), a Transformer feature-extraction backbone that is good at modeling global long-range dependencies, in place of a traditional convolutional neural network (CNN), which addresses the low utilization of multi-scale features. Second, to account fully for cross-layer context feature interaction, a module dedicated to modeling and transferring context knowledge is designed to explore more discriminative information about salient edges. Finally, an attention-based multi-scale feature enhancement module (MSFEM) is designed; by fully mining the multi-level and multi-scale feature information of the objects under detection, it predicts edges and improves the edge detection accuracy of the model. Moreover, the feature summation and concatenation steps of the model occupy neither GPU memory nor host memory, which speeds up model inference. Extensive experiments were carried out on two public datasets, BSDS500 and BIPED. The ODS (optimal dataset scale) value for edge detection reaches 0.796 on BSDS500 and 0.846 on BIPED. The experimental results show that the proposed algorithm outperforms the compared models.
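
The ODS score quoted in the abstract is the best F-measure obtained when a single threshold is fixed for the whole dataset. The following is a minimal sketch of that idea, not the paper's evaluation code. It assumes the per-image counts of true positives, false positives, and false negatives at each candidate threshold are already available; the tolerant matching of predicted edge pixels to ground-truth edge pixels used by standard benchmarks is not shown. The function name `ods_f_measure` and the input layout are hypothetical.

```python
def ods_f_measure(counts):
    """Return (best_f, best_threshold_index) for a dataset.

    Assumed input layout (hypothetical): counts[i][t] is a tuple
    (tp, fp, fn) for image i at candidate threshold index t.
    """
    n_thresholds = len(counts[0])
    best_f, best_t = 0.0, 0
    for t in range(n_thresholds):
        # Aggregate counts over all images at the same threshold.
        tp = sum(image[t][0] for image in counts)
        fp = sum(image[t][1] for image in counts)
        fn = sum(image[t][2] for image in counts)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f = 2 * precision * recall / (precision + recall)
        else:
            f = 0.0
        # Keep the threshold that maximizes the dataset-level F-measure.
        if f > best_f:
            best_f, best_t = f, t
    return best_f, best_t


# Toy example: two images, two candidate thresholds (made-up counts).
toy_counts = [
    [(8, 2, 2), (5, 0, 5)],
    [(6, 4, 4), (9, 1, 1)],
]
best_f, best_t = ods_f_measure(toy_counts)
```

Because one threshold is shared by every image, ODS differs from a per-image optimum, where each image would be allowed its own best threshold.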

DUAN Xuyan (段续延); YU Fuxing (于复兴); SUO Yina (索依娜)

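College of Artificial Intelligence, North China University of Science and Technology, Tangshan, Hebei 063210; Hebei Key Laboratory of Industrial Intelligent Perception, Tangshan, Hebei 063210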

Electronic information engineering

edge detection; Transformer; multi-scale feature extraction; CNN; PVT; multi-scale feature enhancement

Modern Electronics Technique (《现代电子技术》), 2024, No. 19

Pages 131-138 (8 pages)

10.16652/j.issn.1004-373x.2024.19.020