Computer Engineering and Applications, 2025, Vol. 61, Issue 6: 199-209 (11 pages). DOI: 10.3778/j.issn.1002-8331.2310-0407
Human Pose Estimation with Multi-Scale and Multi-Level Feature Fusion
(Original title: 多尺度和多层级特征融合的人体姿态估计)
Abstract
Accuracy gains in human pose estimation usually depend on feature fusion. However, existing feature fusion strategies often ignore the interaction between scale features and level features, and fusing in a single mode can weaken feature expression. To make full use of the complementarity between different features, a new multi-scale and multi-level feature fusion network (MSLNet) is proposed. The high-resolution network (HRNet) serves as the backbone: cross-scale information exchange passes information between feature maps of different resolutions, yielding both fine-grained and coarse-grained pose features. An expectation maximization attention bidirectional feature pyramid network (EMA-BiFPN) is then introduced to aggregate multi-level features after multi-scale feature fusion, capturing the details and correlation information of human pose from local to global. A keypoint detection head built from residual structures performs the final fusion of the output features and improves the accuracy of human keypoint detection. Experimental results show that MSLNet achieves the best accuracy of 75.8% and 91.1% on the COCO and MPII datasets, respectively, verifying that MSLNet can use the complementarity between scale features and level features to improve the accuracy of human pose estimation.
Keywords
high-resolution network (HRNet) / human pose estimation / expectation maximization attention / bidirectional feature pyramid network / feature fusion
Classification
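Information technology and security science
Cite this article
Wang Yanni (王燕妮), Hu Min (胡敏), Han Shipeng (韩世鹏), Chen Yixuan (陈艺瑄), Lü Hao (吕昊). Human pose estimation with multi-scale and multi-level feature fusion [J]. Computer Engineering and Applications, 2025, 61(6): 199-209.
Funding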
National Natural Science Foundation of China (61803294)
Natural Science Basic Research Program of Shaanxi Province (2020JM499, 2020JQ684)