Multi-Dimensional Dynamic Topology Learning Graph Convolution for Skeleton-Based Action Recognition
OA; Peking University Core Journal (北大核心); CSTPCD
Graph convolution is widely used in skeleton-based action recognition because of its strong ability to represent graph data. However, existing graph convolution methods use a shared graph topology for feature aggregation across all frames or channels, which greatly limits the representation ability of graph convolution networks. To address this problem, this paper proposes a multi-dimensional dynamic topology learning graph convolution that dynamically models topologies with temporal and channel specificity. It consists of three main components: pure joint topology learning graph convolution (J-GC), dynamic temporal-wise topology learning graph convolution (DTW-GC), and channel-wise topology learning graph convolution (CW-GC). In particular, DTW-GC uses a dynamic skeleton topology learning method (DSTL) to efficiently model dynamic skeleton topologies rich in global spatio-temporal topological features. Combining the multi-dimensional dynamic topology learning graph convolution with multi-scale temporal convolution (MS-TC) yields a graph convolution network with powerful modeling capability. In addition, to supplement the spatial information of skeleton data, relative joint data and relative bone data are introduced for multi-stream network fusion. The proposed method achieves 92.64% and 89.29% accuracy on the NTU-RGB+D and NTU-RGB+D 120 datasets, respectively, surpassing current state-of-the-art methods.
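The contrast the abstract draws between a shared topology and temporal-wise or channel-wise topologies can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function `aggregate`, the tensor layout `x[c][t][v]`, and the toy adjacency matrices are all assumptions made for illustration, and the learned, dynamic nature of the topologies in J-GC, DTW-GC, and CW-GC is not modeled here.

```python
def aggregate(x, topology):
    """Aggregate joint features with a per-(channel, frame) adjacency.

    x[c][t][v] is the feature of joint v at frame t in channel c.
    topology(c, t) is a hypothetical callback returning the V x V adjacency
    used for that channel and frame.
    """
    C, T, V = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * V for _ in range(T)] for _ in range(C)]
    for c in range(C):
        for t in range(T):
            A = topology(c, t)
            for w in range(V):
                # Joint w receives a weighted sum over all joints v.
                out[c][t][w] = sum(x[c][t][v] * A[v][w] for v in range(V))
    return out


# Toy input: C=2 channels, T=2 frames, V=3 joints (illustrative values only).
x = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
     [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]]

chain = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]     # 3-joint chain with self-loops
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # no mixing between joints

# Shared topology: one adjacency for every channel and frame.
out_shared = aggregate(x, lambda c, t: chain)

# Channel-wise topology: one adjacency per channel.
A_c = [chain, identity]
out_channel = aggregate(x, lambda c, t: A_c[c])

# Temporal-wise topology: one adjacency per frame.
A_t = [chain, identity]
out_temporal = aggregate(x, lambda c, t: A_t[t])
```

In this sketch the three variants differ only in which index selects the adjacency matrix, which is the sense in which a shared topology is the most restrictive case.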
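Luo Huilan (罗会兰); Cao Lijing (曹立京)
School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China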
Computer Science and Automation
action recognition; deep learning; graph convolution; dynamic skeleton topology; data fusion
Acta Electronica Sinica (《电子学报》), 2024 (3)
pp. 991-1001 (11 pages)
National Natural Science Foundation of China (No.61862031); Leading Talents Plan for the Technical Leaders of Major Disciplines in Jiangxi Province (No.20213BCJ22004); Jiangxi Province Degree and Postgraduate Education and Teaching Reform Research Key Project (No.JXYJG-2020-120)