An Improved DETR-based Method for Recognizing Safety Behaviors at Electric Power Work Sites
To address the scarcity of image samples collected at electric power work sites and the imbalanced distribution of positive and negative samples, this paper proposes a safety behavior recognition method based on an improved detection transformer (DETR). On the one hand, the generalization performance of the DETR model is improved through multiple rounds of pre-training; on the other hand, an Adapter module is incorporated into the DETR model so that it can be fine-tuned with a small number of samples. Experimental results show that the proposed method achieves high detection accuracy in helmet wearing recognition, short-sleeve/shorts recognition, and safety belt wearing recognition scenarios. With 100 training samples, the average precision is 0.81, which is 0.02 and 0.03 higher than that of YOLOv5 and Faster R-CNN, two state-of-the-art object detection models, respectively; with 300 training samples, the average precision is 0.84, which is 0.01 and 0.03 higher than YOLOv5 and Faster R-CNN, respectively. The improved DETR method is trained end to end, and Adapter fine-tuning enables rapid transfer to other task scenarios.
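The abstract describes inserting an Adapter module into DETR and fine-tuning only a small number of parameters. As a minimal illustrative sketch (not the paper's implementation; dimensions and initialization are assumptions), an Adapter is commonly a bottleneck residual block appended to a frozen transformer layer: a down-projection, a nonlinearity, and a zero-initialized up-projection, so the module starts as an identity mapping and only its small weight matrices are updated during few-shot fine-tuning:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class Adapter:
    """Bottleneck adapter sketch: down-project, nonlinearity, up-project,
    then add a residual connection. During fine-tuning only these small
    matrices would be trained; the frozen DETR weights stay untouched.
    d_model=256 matches DETR's default hidden size; bottleneck=64 is an
    assumed hyperparameter, not a value from the paper."""
    def __init__(self, d_model=256, bottleneck=64, seed=0):
        rng = np.random.default_rng(seed)
        self.W_down = rng.normal(0.0, 0.02, (d_model, bottleneck))
        self.b_down = np.zeros(bottleneck)
        # Zero-initialized up-projection: the adapter is an identity
        # function at the start of fine-tuning, preserving the
        # pre-trained model's behavior.
        self.W_up = np.zeros((bottleneck, d_model))
        self.b_up = np.zeros(d_model)

    def __call__(self, x):
        # x: (num_tokens, d_model) features from a frozen transformer layer
        h = relu(x @ self.W_down + self.b_down)
        return x + h @ self.W_up + self.b_up

tokens = np.ones((10, 256))
adapter = Adapter()
out = adapter(tokens)
print(out.shape)                  # (10, 256) -- shape is preserved
print(np.allclose(out, tokens))   # True at init (zero up-projection)
```

Because the adapter preserves the layer's input/output shape, it can be dropped between existing encoder or decoder layers without altering the rest of the network, which is what makes swapping adapters per task scenario cheap.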
郭鹏天;李黎;王勇;郑碧煌;王晓辉;陈勇
Institute of Artificial Intelligence, China Electric Power Research Institute Co., Ltd., Haidian District, Beijing 100192; State Grid Shandong Electric Power Company, Jinan, Shandong 250001
Electronic Information Engineering
Keywords: DETR; Adapter; object detection; safety behavior recognition; small-sample learning
Electric Power Information and Communication Technology (《电力信息与通信技术》), 2024(008)
Pages 63-70 (8 pages)
Supported by the State Grid Corporation of China Headquarters Science and Technology Project "Research on High-Precision Ultra-Large-Scale Pre-trained Vision Models for Electric Power and Their Digital Applications" (5108-202218280A-2-395-XG).