Research on Fast 3D Hand Keypoint Detection Algorithm Based on Anchor

Original title: 基于锚点的快速三维手部关键点检测算法 (open access)

In human-robot collaboration tasks, hand keypoint detection provides target point coordinates for the robotic arm. A2J (Anchor-to-Joint) is a representative method that uses anchor points for keypoint detection. A2J takes a depth map as input and achieves good detection accuracy, but its ability to capture global features is limited. In this study, a Global-Local Feature Fusion (GLFF) module is designed to fuse the shallow and deep features of the backbone network. To improve detection speed, the backbone network of A2J is replaced with a modified ShuffleNetv2 in which the 3×3 depthwise separable convolutions are replaced with 5×5 depthwise separable convolutions, enlarging the receptive field and improving the backbone's ability to extract global features. An Efficient Channel Attention (ECA) module is introduced into the anchor weight estimation branch to increase the network's attention to important anchor points. Training and testing on the mainstream ICVL and NYU datasets show that, compared with A2J, the proposed method reduces the mean error by 0.09 mm and 0.15 mm, respectively. A detection rate of 151 frame·s⁻¹ is achieved on a GTX1080Ti graphics card, which meets the real-time requirement of human-robot collaboration tasks.
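
The abstract describes anchor-based detection with a separate anchor weight estimation branch. As a rough illustration of that general idea only, the sketch below assumes each anchor predicts an in-plane offset, a depth value and a response score, and that a joint is estimated as the softmax-weighted average of the anchor votes. The function names, the four-anchor grid and all numbers are hypothetical. It does not reproduce the network proposed in the paper (no backbone, GLFF or ECA).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_joint(anchors, offsets, depths, responses):
    """Combine per-anchor votes into one (x, y, z) joint estimate.

    anchors   -- (x, y) anchor positions on the image plane (hypothetical grid)
    offsets   -- (dx, dy) in-plane offset each anchor predicts toward the joint
    depths    -- depth value each anchor predicts for the joint
    responses -- unnormalised anchor scores; softmax turns them into weights
    """
    weights = softmax(responses)
    x = sum(w * (a[0] + o[0]) for w, a, o in zip(weights, anchors, offsets))
    y = sum(w * (a[1] + o[1]) for w, a, o in zip(weights, anchors, offsets))
    z = sum(w * d for w, d in zip(weights, depths))
    return x, y, z


# Toy example: four anchors that all vote for the same joint at (1, 2), depth 300.
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
target = (1.0, 2.0)
offsets = [(target[0] - ax, target[1] - ay) for ax, ay in anchors]
depths = [300.0] * 4
responses = [0.0, 0.0, 0.0, 0.0]  # equal scores give equal weights

joint = aggregate_joint(anchors, offsets, depths, responses)
print(joint)
```

In this formulation, raising the response of an informative anchor moves the estimate toward that anchor's vote. An attention module on the weight branch would act on exactly these response scores.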

QIN Xiaofei (秦晓飞); HE Wen (何文); BAN Dongxian (班东贤); GUO Hongyu (郭宏宇); YU Jing (于景)

School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China

Computer Science and Automation

human-robot collaboration; 3D hand keypoint detection; anchor point; depth map; global-local feature fusion; ShuffleNetv2; depthwise separable convolution; efficient channel attention

Electronic Science and Technology (《电子科技》), 2024, No. 4

Pages 77-86 (10 pages)

National Natural Science Foundation of China (92048205); China Scholarship (202008310014)

10.16180/j.cnki.issn1007-7820.2024.04.011
