Real-time urban street view semantic segmentation based on cross-layer aggregation network

侯志强 程敏婕 马素刚 屈敏杰 杨小宝

Optics and Precision Engineering, 2024, Vol. 32, Issue 8: 1212-1226 (15 pages). DOI: 10.37188/OPE.20243208.1212

Real-time urban street view semantic segmentation based on cross-layer aggregation network

侯志强¹, 程敏婕¹, 马素刚¹, 屈敏杰¹, 杨小宝¹

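Author information

  • 1. School of Computer Science, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China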

Abstract

With the rapid development of autonomous driving technology, precise and efficient scene understanding has become increasingly important. Urban street scene semantic segmentation aims to accurately identify and segment elements such as pedestrians, obstacles, roads, and signs, providing the road information that autonomous driving requires. However, current semantic segmentation algorithms still face challenges in urban street scenes, mainly insufficient discrimination between pixels of different categories, inaccurate understanding of complex scene structures, and inaccurate segmentation of small-scale objects or large-scale structures. To address these issues, this paper proposes a real-time urban street scene semantic segmentation algorithm based on a cross-layer aggregation network. First, a pyramid pooling module combined with cross-layer aggregation is designed at the end of the encoder to efficiently extract multi-scale context information. Second, a cross-layer aggregation module is designed between the encoder and decoder; it enhances feature representation by introducing a channel attention mechanism and gradually aggregates the features of the encoder stages to achieve full feature reuse. Finally, a multi-scale fusion module is designed in the decoder stage, which aggregates global and local information in the channel dimension to promote the fusion of deep and shallow features. The proposed algorithm is validated on two common urban street scene datasets. On an RTX 3090 graphics card (with speed measured under TensorRT), the algorithm achieves 73.0% mIoU on the Cityscapes test set at 294 FPS, and 75.8% mIoU on higher-resolution images at 164 FPS; on the CamVid dataset, it achieves 74.8% mIoU at 239 FPS. Experimental results show that the proposed algorithm effectively balances accuracy and real-time performance, significantly improving semantic segmentation performance compared with other algorithms and advancing real-time urban street scene semantic segmentation.
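
The accuracy figures above are reported as mIoU (mean intersection over union). As a reading aid, the metric can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou` are hypothetical, and the ignore label 255 follows the common Cityscapes convention rather than anything stated in the abstract.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # assumed "void" label, skipped during evaluation
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c pixels predicted as something else
        fp = sum(matrix[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened label maps with 3 classes; the last pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(confusion_matrix(pred, target, num_classes=3))
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so `score` is 13/18, roughly 0.722. Benchmark evaluations accumulate one confusion matrix over the whole test set before averaging, rather than averaging per-image scores.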

Keywords

semantic segmentation; convolutional neural network; urban street view; encoder-decoder structure; pyramid pooling module

Classification

Computer science and automation

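Cite this article

侯志强, 程敏婕, 马素刚, 屈敏杰, 杨小宝. Real-time urban street view semantic segmentation based on cross-layer aggregation network[J]. Optics and Precision Engineering, 2024, 32(8): 1212-1226.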

Funding

National Natural Science Foundation of China (No. 62072370)

Natural Science Foundation of Shaanxi Province (No. 2023-JC-YB-598)

Optics and Precision Engineering

Open access; indexed in the Peking University Core Journals list and CSTPCD

ISSN 1004-924X
