Journal on Communications (通信学报), 2025, Vol. 46, Issue 6: 136-152, 17. DOI: 10.11959/j.issn.1000-436x.2025110
Video anomaly detection via cross-modal fusion and hyperbolic graph attention mechanism
Abstract
To address modality information imbalance, non-uniform audiovisual noise, and modality asynchrony in video anomaly detection, a multimodal video anomaly detection method called CM-HVAD was proposed. Firstly, a novel dynamic cross-modal fusion module was introduced to dynamically compress and reweight multimodal features through autonomous learning of cross-modal weights, thereby achieving balanced and enhanced fusion of visual and audio features. Secondly, to address modal asynchrony in multimodal data, a modal consistency alignment module was proposed, which aligned modal semantics along the temporal frame sequence to ensure both temporal and semantic consistency. Finally, a hyperbolic graph attention mechanism was incorporated to capture the hierarchical relationships between normal and abnormal representations through the pattern separation property of hyperbolic space, thereby improving detection accuracy. The results show that the proposed method achieves 86.47% AP on XD-Violence and 87.12% AUC on UCF-Crime, outperforming baseline methods.
Key words
video anomaly detection / cross-modal fusion / hyperbolic graph attention mechanism / multimodal
Classification
Information technology and security science
Cite this article
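姜迪 (Jiang Di), 赖惠成 (Lai Huicheng), 汪烈军 (Wang Liejun). Video anomaly detection via cross-modal fusion and hyperbolic graph attention mechanism [J]. Journal on Communications (通信学报), 2025, 46(6): 136-152, 17.
Funding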
The National Natural Science Foundation of China Joint Fund Project (No. U1903213)
The Key Research and Development Program of Xinjiang Uygur Autonomous Region (No. 2022B01008)