Journal of South-Central Minzu University (Natural Science Edition), 2026, Vol. 45, Issue 2: 180-190, 11. DOI: 10.20056/j.cnki.ZNMDZK.20250829
Multimodal smart classroom topic segmentation
Abstract
In smart classroom environments, topic segmentation of instructional videos is important for improving the efficiency of teaching analysis and content organization. However, existing topic segmentation methods perform poorly in smart classroom scenarios because of high error rates in Automatic Speech Recognition (ASR) transcription and incoherent classroom content. To address these challenges, a multimodal smart classroom topic segmentation model is proposed. The model first uses ASR to obtain the speech transcript, and applies a novel dynamic OCR recognition and similarity-detection de-duplication algorithm to extract text from video frames while reducing redundant frame processing. The ASR text and OCR text are then fed into text encoders, and a multimodal attention mechanism fuses the features of the two modalities. Finally, topic boundaries are predicted by BiLSTM and MLP networks. Experimental results show that the proposed method outperforms baseline methods in accuracy and scalability in the smart classroom domain. In addition, the proposed dynamic OCR recognition and similarity-detection de-duplication algorithm reduces the model's processing time, meeting the requirements of real-world applications.
Keywords
smart classroom/multimodal/topic segmentation/de-duplication
Classification
Information Technology and Security Science
Cite this article
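HAO Yuze, ZHOU Bin, HU Bo. Multimodal smart classroom topic segmentation[J]. Journal of South-Central Minzu University (Natural Science Edition), 2026, 45(2): 180-190, 11.
Funding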
Hubei Province Technology Innovation Special Fund project (2019ADC071)
Fundamental Research Funds for the Central Universities (CZY23006)
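
Illustrative sketch
The abstract describes a similarity-detection de-duplication step that avoids running OCR on redundant video frames, but it does not give the algorithm's details. The Python sketch below shows one plausible way such a step could work, and should not be read as the authors' method. Everything in it is an assumption made for illustration: the frame representation (flattened grayscale pixels), the function names `frame_difference` and `dedup_ocr`, the two thresholds, and the use of `difflib.SequenceMatcher` for text similarity. The `ocr` argument is a hypothetical callable standing in for whatever OCR engine is used.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Sequence, Tuple

# Assumption: a frame is a flattened sequence of grayscale pixels in 0..255.
Frame = Sequence[int]


def frame_difference(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two frames, scaled to [0, 1]."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / (255.0 * len(a))


def dedup_ocr(
    frames: Sequence[Frame],
    ocr: Callable[[Frame], str],      # hypothetical OCR function
    frame_threshold: float = 0.05,    # illustrative value, not from the paper
    text_threshold: float = 0.9,      # illustrative value, not from the paper
) -> List[Tuple[int, str]]:
    """Return (frame_index, text) pairs for frames that carry new text."""
    kept: List[Tuple[int, str]] = []
    last_frame: Optional[Frame] = None
    for index, frame in enumerate(frames):
        # Stage 1: skip OCR when the frame barely differs from the last frame
        # that was actually sent to OCR.
        if last_frame is not None and frame_difference(frame, last_frame) < frame_threshold:
            continue
        last_frame = frame
        text = ocr(frame).strip()
        if not text:
            continue
        # Stage 2: drop text that is nearly identical to the last kept text.
        if kept and SequenceMatcher(None, kept[-1][1], text).ratio() >= text_threshold:
            continue
        kept.append((index, text))
    return kept
```

In this sketch the cheap pixel comparison runs on every frame, while the comparatively expensive OCR call runs only when the picture has changed noticeably; the second, text-level check catches frames that look different (for example, a cursor moved) but show essentially the same slide text.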