Computer Engineering, 2025, Vol. 51, Issue (6): 375-384, 10. DOI: 10.19678/j.issn.1000-3428.0069168
Open-Set Traffic Object Detection Algorithm Based on Vision-Language Pre-training Model
Abstract
Traffic object detection is a crucial component of intelligent transportation systems. However, existing traffic object detection algorithms can only detect predefined objects and cannot handle open-set object scenarios. To address this, a novel open-set traffic object detection algorithm based on a Visual-Language Pre-trained (VLP) model is proposed. First, using Faster R-CNN as a foundation, the prediction network is modified to adapt to the localization challenges of open-set objects. The loss function is changed to the Intersection over Union (IoU) loss, which enhances localization accuracy. Second, a new VLP-based Label Matching Network (VLP-LMN) is constructed to perform label matching on the predicted bounding boxes. The VLP model serves as a knowledge repository that matches regional images with label text. In addition, prompt engineering and fine-tuning of network modules make better use of the VLP model's capabilities, significantly improving the accuracy of label matching. The algorithm achieves an average detection accuracy of 60.3% for new classes on the PASCAL VOC07+12 dataset, demonstrating its performance in open-set object detection. Additionally, the average detection accuracy for new classes on a traffic dataset reaches 58.9%, with only a 14.5% decrease compared with the base classes in zero-shot detection. This underscores the strong generalization capability of the algorithm in traffic object detection.
Key words
Visual-Language Pre-trained (VLP) model / Faster R-CNN / open-set object detection / traffic object detection
Classification
Information Technology and Security Science
Cite this article
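Huang Qiqiang (黄琦强), An Guocheng (安国成), Xiong Gang (熊刚). Open-Set Traffic Object Detection Algorithm Based on Vision-Language Pre-training Model [J]. Computer Engineering, 2025, 51(6): 375-384, 10.
Funding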
National Key Research and Development Program of China, 14th Five-Year Plan (2023YFC3006700)
National Natural Science Foundation of China (62071293)
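
Illustrative sketch
The abstract names two ingredients: an IoU loss for box localization, and matching of region images to label text with a VLP model plus prompt engineering. The following is a minimal Python sketch of both ideas, not the authors' implementation. It assumes the common `1 - IoU` form of the loss (the abstract does not give the exact formulation), boxes given as `(x1, y1, x2, y2)`, and cosine similarity between embeddings as the matching rule. The function names, the prompt template `"a photo of a {}"`, and the toy embedding vectors are all hypothetical; in practice the embeddings would come from a VLP model's image and text encoders.

```python
import math


def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred_box, gt_box):
    """Assumed loss form: 1 - IoU (0 for a perfect match, 1 for no overlap)."""
    return 1.0 - iou(pred_box, gt_box)


def build_prompts(labels, template="a photo of a {}"):
    """Hypothetical prompt engineering: wrap each label in a text template."""
    return {label: template.format(label) for label in labels}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def match_label(region_embedding, label_embeddings):
    """Return (label, score) for the text embedding closest to the region."""
    scored = (
        (label, cosine(region_embedding, emb))
        for label, emb in label_embeddings.items()
    )
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Overlap area 1, union area 7, so the loss is 1 - 1/7.
    print(iou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
    print(build_prompts(["car", "bus"]))
    # Toy 2-D vectors standing in for encoder outputs.
    toy_text = {"car": [1.0, 0.0], "bus": [0.0, 1.0]}
    print(match_label([0.9, 0.1], toy_text))
```

Because a label is just a text prompt in this scheme, a new class can be added at inference time by adding one more prompt and its text embedding, which is the property that makes this style of matching suitable for open-set detection.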