Journal of Henan Polytechnic University (Natural Science), 2025, Vol. 44, Issue 5: 43-51, 9. DOI: 10.16186/j.cnki.1673-9787.2027070013
Few-shot action recognition in video method based on second-order spatiotemporal adaptation
Abstract
Objectives: Existing few-shot video action recognition methods generally struggle to process global spatiotemporal information adequately. They typically rely on large amounts of annotated data to train deep models, so when only a limited number of training samples is available they often fail to capture and use the spatiotemporal dynamics of video data effectively.
Methods: To address this issue, a second-order spatiotemporal adaptive network comprising a spatiotemporal adaptive module and a covariance aggregation module was proposed to improve the accuracy and robustness of few-shot learning in video action recognition. The spatiotemporal adaptive module dynamically aggregated local and global spatiotemporal information according to changes in video content, thereby optimizing global information extraction. The covariance aggregation module used second-order statistics to enhance the global spatiotemporal feature representation of videos, providing a more robust global description of video content.
Results: Extensive experiments were conducted on four mainstream video action recognition benchmark datasets. The proposed method achieved accuracies of 52.2% and 72.4% on the 1-shot and 5-shot tasks, respectively, on the Something-Something V2 dataset, significantly outperforming the baseline model. Strong performance was also observed on the Kinetics100, UCF101, and HMDB51 datasets, validating its effectiveness and practicality in few-shot video action recognition.
Conclusions: The proposed second-order spatiotemporal adaptive network effectively improved the accuracy and robustness of few-shot video action recognition and showed clear advantages in processing complex spatiotemporal information. This work provided an efficient solution to key challenges of spatiotemporal modeling when data are limited.
Keywords
few-shot learning / action recognition in video / spatiotemporal representation learning / temporal modeling / covariance aggregation
Classification
Information Technology and Security Science
Cite this article
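Zhang Bingbing, Li Haibo, Ma Yuanchen, Zhang Jianxin. Few-shot action recognition in video method based on second-order spatiotemporal adaptation [J]. Journal of Henan Polytechnic University (Natural Science), 2025, 44(5): 43-51, 9.
Funding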
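National Natural Science Foundation of China (61972062)
Jilin Province Science and Technology Development Plan Project (20230201111GX)
Liaoning Province Applied Basic Research Program (2023JH2/101300191, 2023JH2/101300193)
Open Project of the Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education (ADIC2023ZD003)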
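
Illustrative note on the abstract: the covariance aggregation module is described there only as using second-order statistics to build a global video representation. The sketch below shows one generic form of second-order (covariance) pooling over spatiotemporal tokens. It is not the authors' module; the function names, the token layout (frames x positions, channels), the ridge term `eps`, and the upper-triangle vectorization are all assumptions made for illustration.

```python
import numpy as np

def covariance_pool(features, eps=1e-5):
    """Second-order pooling (hypothetical helper, not from the paper).

    features: (N, D) array of N spatiotemporal tokens with D channels.
    Returns the (D, D) sample covariance of the channels plus a small ridge.
    """
    x = np.asarray(features, dtype=np.float64)
    n, d = x.shape
    centered = x - x.mean(axis=0, keepdims=True)   # remove the first-order (mean) term
    cov = centered.T @ centered / max(n - 1, 1)    # unbiased sample covariance
    return cov + eps * np.eye(d)                   # ridge (assumed) keeps the matrix positive definite

def video_descriptor(features):
    """Flatten the symmetric covariance into a unit-length vector (assumed design)."""
    cov = covariance_pool(features)
    upper = cov[np.triu_indices(cov.shape[0])]     # symmetric matrix: the upper triangle suffices
    return upper / np.linalg.norm(upper)

# Hypothetical sizes: 8 frames x 7x7 spatial positions, 16 channels per token.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8 * 49, 16))
descriptor = video_descriptor(tokens)
```

Unlike average pooling, which keeps only per-channel means, the covariance captures how channels co-vary across all frames and positions, which is the sense in which such a descriptor is "second-order".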