Review on Human Action Recognition Methods Based on Multimodal Data
Open access (OA); indexed in the Peking University Core Journals list (北大核心) and CSTPCD
Human action recognition (HAR) is widely applied in intelligent security, autonomous driving, and human-computer interaction. With advances in capture equipment and sensor technology, the data available for HAR is no longer limited to RGB data but also includes multimodal data such as depth, skeleton, and infrared data. This review describes in detail the feature extraction methods used in HAR on the RGB and skeleton data modalities, covering both handcrafted-feature-based and deep learning-based methods. For the RGB modality, it focuses on feature extraction algorithms based on two-stream convolutional neural networks (2s-CNN), 3D convolutional neural networks (3DCNN), and hybrid networks. For the skeleton modality, it first introduces currently popular single-person and multi-person pose estimation algorithms, then focuses on classification algorithms based on convolutional neural networks (CNN), recurrent neural networks (RNN), and graph convolutional neural networks (GCN). The common datasets for both modalities are then presented comprehensively. In addition, current challenges are explored in light of the respective data structure characteristics of RGB and skeleton data. Finally, future research directions for deep learning-based HAR methods are discussed.
Wang Cailing (王彩玲); Yan Jingjing (闫晶晶); Zhang Zhidong (张智栋)
School of Computer Science, Xi'an Shiyou University, Xi'an 710065, China
Computer Science and Automation
video understanding; human action recognition; deep learning; feature extraction; pose estimation algorithms
Computer Engineering and Applications (《计算机工程与应用》), 2024 (009)
1-18 / 18
National Natural Science Foundation of China (62276213).