Computer Engineering and Applications, 2026, Vol. 62, Issue (1): 1-19, 19. DOI: 10.3778/j.issn.1002-8331.2504-0330
Review of Monocular Vision Object Detection Based on Deep Learning
Abstract
Monocular visual object detection, with its low hardware cost and high real-time performance, has gradually become a core technology in fields such as autonomous driving and intelligent surveillance. However, geometric ambiguity, occlusion robustness, and small object detection accuracy remain significant bottlenecks in current research. This paper focuses on algorithm-level advances and systematically quantifies the progress of monocular visual object detection from three perspectives: algorithm evolution, performance evaluation, and lightweight design. First, single-stage detection algorithms are divided into classic convolutional architectures and Transformer architectures; their structural innovations and performance bottlenecks are analyzed and summarized, revealing the trade-offs among accuracy, speed, and complexity. Second, the paper explores strategies for integrating lightweight techniques with object detection algorithms at three levels: network design, algorithm optimization, and model compression. It also consolidates the multi-dimensional evaluation metrics of the three main official datasets used for training and evaluation in object detection, and builds a cross-model comparison framework on the MS-COCO-2017 dataset to compare the performance of different single-stage detectors side by side. Finally, the paper looks ahead to frontier directions such as multimodal fusion and lightweight improvements, aiming to provide a systematic reference for the engineering application of, and theoretical breakthroughs in, monocular visual object detection algorithms.
Keywords
one-stage object detection / deep learning / lightweight model / classic convolution architecture / Transformer architecture
Classification
Information Technology and Security Science
Cite this article
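LIU Guichao, WANG Huaiguang, REN Guoquan, WU Dinghai. Review of Monocular Vision Object Detection Based on Deep Learning [J]. Computer Engineering and Applications, 2026, 62(1): 1-19, 19.
Funding
Science and Technology Innovation Research Program Project ([2022]40)
National Natural Science Foundation of China (51305454)
Self-initiated Project of the Shijiazhuang Campus, Army Engineering University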