Journal of Hydraulic Engineering (水利学报), 2025, Vol. 56, Issue 9: 1143-1154 (12 pages). DOI: 10.13243/j.cnki.slxb.20240350
基于实时多模态数据的地下洞室施工机械活动识别深度学习模型
Deep learning model for identifying construction machinery activities in underground caverns based on real-time multimodal data
Abstract
Identifying construction machinery activities is an effective approach to analyzing production efficiency and ensuring operational safety. Current methods primarily focus on the characteristics of individual modalities such as kinematics, vision, and acoustics, without adequately considering the intrinsic correlations among multimodal data. This limitation reduces their effectiveness in dimly lit, confined, and noisy environments such as underground caverns. To address this, this study proposes a Transformer-based deep learning model that recognizes construction machinery activities in underground caverns from real-time multimodal data. Leveraging the attention mechanism's capability to capture long-term dependencies across different modalities, the proposed model integrates multimodal data to improve recognition performance. First, real-time video, audio, and kinematic data are collected during construction. Preliminary features of the three modalities are extracted using the S3D, VGGish, and Conformer models, respectively. Cross-modal attention and self-attention mechanisms are then applied to these preliminary features to generate multimodal fused features. Subsequently, a multi-head attention mechanism combines the preliminary and fused features, enabling robust activity classification based on the enriched feature set. Case studies demonstrate that the proposed model achieves an identification accuracy of 98.14% and an F1 score of 96.47%, improvements of 6.38% and 9.13%, respectively, over the best-performing single-modality models. This study provides a novel approach for recognizing construction machinery activities in underground cavern environments.
Key words
underground caverns / construction machinery activity recognition / multimodal data / attention mechanism / feature fusion
Classification
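Architecture and Water Conservancy
Cite this article
Tong Dawei (佟大威), Feng Kaiyue (冯凯悦), Yu Jia (余佳), Wang Xiaoling (王晓玲). Deep learning model for identifying construction machinery activities in underground caverns based on real-time multimodal data [J]. Journal of Hydraulic Engineering (水利学报), 2025, 56(9): 1143-1154 (12 pages).
Funding
National Natural Science Foundation of China (U24B20111, 52279137, 52379132)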
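
The abstract names cross-modal attention as the step that relates features from one modality to those of another. The following is a minimal sketch of generic scaled dot-product cross-attention, not the authors' implementation: it omits learned projection matrices, multiple heads, and the S3D, VGGish, and Conformer feature extractors, and the toy feature values and all function and variable names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention in which each query vector (from one
    modality) attends over the key/value vectors of another modality."""
    scale = math.sqrt(len(keys[0]))
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        fused.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return fused


# Hypothetical toy features: 2 video time steps, 3 audio time steps, dim 2.
video_feats = [[1.0, 0.0], [0.0, 1.0]]
audio_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Video features act as queries; audio features act as keys and values.
video_attending_audio = cross_modal_attention(video_feats, audio_feats, audio_feats)
```

Each output row is a convex combination of the audio feature vectors, weighted toward the audio steps most similar to the corresponding video step. A full model would presumably apply learned linear projections before this step and repeat it for each pair of modalities.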