Journal of Chongqing Technology and Business University (Natural Science Edition), 2025, Vol. 42, Issue 3: 84-93, 10. DOI: 10.16055/j.issn.1672-058X.2025.0003.011

An Efficient Semantic Segmentation Algorithm for Indoor RGBD Scenes Based on SwiftNet

WANG Bo¹, XU Gang¹, SU Shilin¹

Author information

1. School of Electrical Engineering, Anhui Polytechnic University, Wuhu, Anhui 241000, China

Abstract

Objective: Existing RGBD semantic segmentation algorithms fail to fully exploit the shape information provided by depth images and incur high computational costs, particularly in indoor scenes with complex lighting, diverse materials, and varied spatial structures. This paper proposes an efficient semantic segmentation method for indoor RGBD scenes based on SwiftNet.

Methods: First, depth images are incorporated into SwiftNet, a lightweight multi-scale semantic segmentation algorithm for RGB road scenes. Because depth images are unaffected by color and give the distance of each pixel from the camera, this reduces the impact of lighting, color variation, and distance on the segmentation results. Next, geometric shape features are extracted specifically from the depth images. Depth features are decomposed into a positional component and a shape component, and two learnable weights are introduced, each acting on one component independently. Convolution operations are then applied to reweight and combine the two components, capturing the intrinsic geometric shape information in the depth data without adding computation or memory at inference time. Finally, to capture richer contextual information more quickly, the deep aggregation pyramid pooling module is modified to extract context information in parallel; the modified module is referred to as the Fast Aggregation Pyramid Pooling Module (FAPPM).

Results: In evaluation experiments on the NYUv2 and SUNRGBD indoor datasets, the proposed approach improves mean intersection over union (mIoU) by 2.21% and 3.2%, respectively, over the well-performing ESANet model. It also achieves a processing speed of 33.36 frames per second (FPS).

Conclusion: The experiments confirm the algorithm's efficiency and accuracy on complex indoor semantic segmentation tasks, providing solid support for subsequent intelligent robot tasks in indoor applications.
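The decomposition described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-channel depth patch flattened to a list, takes the positional component to be the patch mean and the shape component to be the residual, and uses a scalar weight `w_base` and a dense matrix `w_shape` as the two learnable weights. All function names and example values are hypothetical. The sketch shows why such reweighting can add no inference cost: because every step is linear, both weights can be absorbed into one ordinary convolution kernel.

```python
def mean(values):
    return sum(values) / len(values)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def decompose(patch):
    """Split a flattened patch into a positional part (mean) and a shape part (residual)."""
    base = mean(patch)
    shape = [v - base for v in patch]
    return base, shape


def shape_aware_response(patch, kernel, w_base, w_shape):
    """Training-time form: reweight both components, recombine, then apply the kernel."""
    base, shape = decompose(patch)
    reweighted_shape = [dot(row, shape) for row in w_shape]
    recombined = [w_base * base + s for s in reweighted_shape]
    return dot(kernel, recombined)


def fold_kernel(kernel, w_base, w_shape):
    """Inference-time form: absorb both weights into a single ordinary kernel."""
    n = len(kernel)
    # Positional path: kernel . (w_base * mean(P) * 1) = (w_base * sum(kernel) / n) * sum(P)
    base_term = w_base * sum(kernel) / n
    # Shape path: kernel^T W (P - mean(P) * 1) = sum_j (u_j - mean(u)) * P_j, with u = W^T kernel
    u = [sum(w_shape[i][j] * kernel[i] for i in range(n)) for j in range(n)]
    u_mean = mean(u)
    return [base_term + (u_j - u_mean) for u_j in u]


# Hypothetical example values: a 3x3 depth patch (metres) and a 3x3 kernel, both flattened.
patch = [1.0, 1.2, 1.4, 1.1, 1.3, 1.5, 1.2, 1.4, 1.6]
kernel = [0.1, -0.2, 0.3, 0.0, 0.5, -0.1, 0.2, 0.4, -0.3]
w_base = 0.7
w_shape = [[1.5 if i == j else 0.01 * (i - j) for j in range(9)] for i in range(9)]

training_output = shape_aware_response(patch, kernel, w_base, w_shape)
inference_output = dot(fold_kernel(kernel, w_base, w_shape), patch)
```

In this sketch `training_output` and `inference_output` agree up to floating-point rounding, and moving the whole patch nearer to or farther from the camera changes only the positional component while the shape component stays the same.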

Key words

RGBD semantic segmentation/shape-aware convolution/indoor scene/feature fusion/deep learning

Classification

Information Technology and Security Science

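Cite this article

WANG Bo, XU Gang, SU Shilin. An Efficient Semantic Segmentation Algorithm for Indoor RGBD Scenes Based on SwiftNet[J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2025, 42(3): 84-93, 10.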

Funding

Regional Innovation and Development Joint Fund of the National Natural Science Foundation of China (U22A2079).

Journal of Chongqing Technology and Business University (Natural Science Edition)

ISSN: 1672-058X
