Action recognition model based on the spatiotemporal sampling graph convolutional network and self-calibration mechanism

Cao Yi, Wu Weiguan, Zhang Xiaoyong, Xia Yu, Gao Qingyuan

工程科学学报 (Chinese Journal of Engineering), 2024, Vol. 46, Issue 3: 480-490 (11 pages). DOI: 10.13374/j.issn2095-9389.2022.12.25.002

Action recognition model based on the spatiotemporal sampling graph convolutional network and self-calibration mechanism

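Author information

  • 1. School of Mechanical Engineering, Jiangnan University, Wuxi 214122; Jiangsu Key Laboratory of Food Manufacturing Equipment, Jiangnan University, Wuxi 214122 (affiliation of all five authors)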

Abstract

A skeleton-based action recognition model, the spatiotemporal sampling graph convolutional network (ST-GCN) based on the self-calibration mechanism, is proposed to address two shortcomings of existing action recognition algorithms: they disregard the contextual dependence of spatiotemporal information, and they lack multilevel receptive fields for feature extraction. First, this paper introduces the working principles of ST-GCN, 3D-GCN, Transformer, and the self-attention mechanism and analyzes the inability of 3D-GCN and Transformer to effectively model the global and local spatiotemporal contexts, respectively. Second, a spatiotemporal sampling graph convolutional network is proposed to model the spatiotemporal context effectively. This network treats a series of consecutive frames as one spatiotemporal sample, thereby dividing the global action into multiple subactions. It establishes the local cross-temporal dependency by using the nonlocal network to compute the correlation between a single node and all nodes in the sampled frames, and it establishes the global cross-temporal dependency by combining the nonlocal network with temporal convolution to compute the correlation between a single sampled subaction and the global subactions. Subsequently, to enlarge the multilevel receptive field and capture more discriminative temporal features, a temporal self-calibrating convolutional network is proposed that convolves at two different scales of space-time and combines the resulting features: one is the space-time at the original scale, and the other is a latent space-time at a smaller scale obtained by downsampling. The latter adaptively establishes the dependence between long-range space-time and channels and models the interchannel dependence by differentiating the characteristics of each channel. The spatiotemporal sampling graph convolutional network and the temporal self-calibration network are then combined to construct the spatiotemporal sampling graph convolutional network based on the self-calibration mechanism, and the model is trained end to end using a multistream network. Finally, to verify the feature extraction performance of the model, experiments on skeleton-based action recognition are performed on the NTU-RGB+D and NTU-RGB+D120 datasets. The recognition accuracy on the NTU-RGB+D dataset reaches 95.2% under X-View and 88.8% under X-Sub, and the generalization ability of the model is verified on the NTU-RGB+D120 dataset. These results show that the model has excellent recognition accuracy and generalization ability and extracts spatiotemporal features effectively.
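The local cross-temporal dependency described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the non-overlapping window split, the scaled dot-product similarity, and the omission of learned projections and of the graph convolution itself are all assumptions made for illustration. The sketch only shows how consecutive frames might be grouped into subactions and how each joint in a group could attend to every joint in that group in a nonlocal fashion.

```python
import math


def split_into_windows(frames, window):
    """Group consecutive frames into non-overlapping windows (hypothetical
    'subactions'). Trailing frames that do not fill a window are dropped."""
    return [frames[i:i + window]
            for i in range(0, len(frames) - window + 1, window)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def nonlocal_window(window_frames):
    """Nonlocal aggregation inside one window.

    window_frames: list of frames; each frame is a list of joint feature
    vectors (lists of floats). Every joint attends to all joints of all
    frames in the window. Learned projections are omitted (assumption),
    so the similarity is a plain scaled dot product.
    """
    nodes = [joint for frame in window_frames for joint in frame]
    dim = len(nodes[0])
    outputs = []
    for query in nodes:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in nodes]
        weights = softmax(scores)
        aggregated = [sum(w * key[d] for w, key in zip(weights, nodes))
                      for d in range(dim)]
        # Residual connection, as in nonlocal blocks.
        outputs.append([q + a for q, a in zip(query, aggregated)])
    return outputs


# Toy input: 6 frames, 2 joints per frame, 2 features per joint.
frames = [[[float(t), 1.0], [0.5, float(t) / 10.0]] for t in range(6)]
windows = split_into_windows(frames, window=3)
local_features = [nonlocal_window(w) for w in windows]
```

In this toy setup each window yields one updated feature vector per joint per frame. A model along the lines of the abstract would additionally relate the windows to one another (the global cross-temporal dependency) using temporal convolution, which this sketch leaves out.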

Key words

action recognition/spatiotemporal sampling graph convolutional network/spatiotemporal context/self-calibration mechanism/multistream network

Classification

Information technology and security science

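Cite this article

Cao Yi, Wu Weiguan, Zhang Xiaoyong, Xia Yu, Gao Qingyuan. Action recognition model based on the spatiotemporal sampling graph convolutional network and self-calibration mechanism [J]. 工程科学学报 (Chinese Journal of Engineering), 2024, 46(3): 480-490.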

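Funding

Jiangsu Province "Six Talent Peaks" Program (ZBZZ-012)

Jiangsu Province Outstanding Science and Technology Innovation Team Fund (2019SK07)

Program of Introducing Talents of Discipline to Universities (B18027)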

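工程科学学报 (Chinese Journal of Engineering)

Open access; indexed in the Peking University Core Journals list (北大核心) and CSTPCD

ISSN 2095-9389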