Journal of Taiyuan University of Technology, 2025, Vol. 56, Issue (3): 495-505, 11. DOI: 10.16355/j.tyut.1007-9432.20230692
Video Action Recognition Based on Two-stream Feature Enhancement Network
Abstract
[Purposes] Two-stream convolutional networks achieve high recognition accuracy primarily by fusing the spatial and temporal features of videos. Traditional two-stream convolutional networks extract temporal features using either optical flow or temporal shift modules. Optical flow has the disadvantage of a heavy computational workload, while temporal shift modules may damage the spatial and channel features of the original video. [Methods] To address these issues, this paper proposes a two-stream network called the Two-stream Feature Enhancement Network (TFEN). To solve the feature damage caused by temporal shift, a Spatial Enhancement-Temporal Shift Module (SE-TSM) and a Channel Enhancement-Temporal Shift Module (CE-TSM) are proposed, which enhance the features after each temporal shift to repair the damaged features. To address the cost of optical flow, a fusion-based frame difference network is proposed for extracting temporal features; it is less time-consuming and easier to operate than optical flow. To compensate for the weak motion information in frame differences, a Sports Improvement Module (SIM) is introduced to enhance motion features and improve performance. [Results] The proposed network achieves 96.1% and 75.7% accuracy on the public video datasets UCF101 and HMDB51, respectively, which is superior to current mainstream networks.
Keywords
action recognition/two-stream networks/temporal shift/optical flow/frame difference
Classification
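Information Technology and Security Science
Cite this article
ZHAO Chen, FENG Xiufang, DONG Yunyun, WEN Xin, CAO Ruochen. Video Action Recognition Based on Two-stream Feature Enhancement Network[J]. Journal of Taiyuan University of Technology, 2025, 56(3): 495-505, 11.
Funding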
Key Research and Development Program of Shanxi Province (202102020101007)
National Natural Science Foundation of China (6226196)
Fundamental Research Program of Shanxi Province (202203021212207)
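Illustrative sketch
The abstract names two ways of obtaining temporal information without optical flow: frame differences and temporal shift. The NumPy sketch below shows only the generic form of each operation, to make the terms concrete. It is not the authors' TFEN, SE-TSM, CE-TSM, or SIM; the function names, the tensor layouts, and the `fold_div` parameter are assumptions made for illustration.

```python
import numpy as np


def frame_difference(frames):
    """Return differences between consecutive frames.

    frames: array of shape (T, H, W, C); layout is an assumption.
    Output has shape (T - 1, H, W, C).
    """
    # Cast first so that uint8 pixel values do not wrap around on subtraction.
    frames = frames.astype(np.float32)
    return frames[1:] - frames[:-1]


def temporal_shift(features, fold_div=8):
    """Generic temporal shift over a clip of feature maps.

    features: array of shape (T, C, H, W); layout is an assumption.
    The first C // fold_div channels take their values from the next
    time step, the following C // fold_div channels from the previous
    time step, and the remaining channels are left unchanged.
    Positions with no source frame are filled with zeros.
    """
    fold = features.shape[1] // fold_div
    out = np.zeros_like(features)
    out[:-1, :fold] = features[1:, :fold]                   # pull from t + 1
    out[1:, fold:2 * fold] = features[:-1, fold:2 * fold]   # pull from t - 1
    out[:, 2 * fold:] = features[:, 2 * fold:]              # untouched channels
    return out


if __name__ == "__main__":
    clip = np.random.randint(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
    print(frame_difference(clip).shape)   # one fewer frame than the input

    feats = np.random.rand(8, 16, 4, 4).astype(np.float32)
    print(temporal_shift(feats).shape)    # same shape as the input
```

The zero-filled boundary frames and the overwritten channels in `temporal_shift` give one view of why shifting can degrade per-frame spatial and channel information, which is the problem the abstract says the enhancement modules are meant to address.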