CAAI Transactions on Intelligent Systems (智能系统学报), 2026, Vol. 21, Issue 1: 109-119 (11 pages). DOI: 10.11992/tis.202505007
Visual-language key clue discovery-based multimodal fake news detection model
Abstract
Multimodal fake news detection aims to enhance the reliability of authenticity assessment by integrating diverse modalities such as text, images, videos, and audio. However, existing models often overlook discriminative local details and struggle to capture the critical inconsistencies between textual and visual content. To address these challenges, this study proposes a novel multimodal fake news detection model, termed the visual-language key clue discovery-based multimodal fake news detection model (VKC-MFND), which is designed to discover key visual-linguistic cues. The model comprises three main components: a multi-scale feature extraction module, a key feature information extraction module, and a multi-scale feature alignment module. Specifically, the multi-scale feature extraction module captures both global features (sentence-level or description-level) and local features (word-level or object box-level) from text and images, thereby enriching the diversity of information representation. The key feature information extraction module uses attention-based interactions among fine-grained features to uncover discriminative clues and aligns them with global semantic representations, facilitating the fusion of critical cross-modal information. Meanwhile, the multi-scale feature alignment module optimizes the model using both classification and alignment losses, enhancing semantic consistency in the shared feature space. Extensive experiments conducted on three benchmark multimodal fake news datasets (Weibo, Weibo-19, and Pheme) demonstrate that the proposed model significantly outperforms state-of-the-art approaches. Further ablation studies confirm the effectiveness and necessity of each component in the model.
Keywords
multimodal fake news detection / multi-scale feature interaction / key clue discovery / fine-grained representation / cross-modal attention / global feature alignment / memory-enhanced mechanism / semantic inconsistency detection
Classification
Information Technology and Security Science
Cite this article
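孟想, 王博岳, 高祎菡, 吴广超, 刘易昆, 吕松澄, 尹宝才. Visual-language key clue discovery-based multimodal fake news detection model [J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2026, 21(1): 109-119 (11 pages).
Funding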
National Natural Science Foundation of China (92370102).
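
The abstract states that the multi-scale feature alignment module optimizes the model with both a classification loss and an alignment loss, but it does not give their exact forms. The following is a minimal sketch of how such a joint objective could look, not the authors' implementation. It assumes a binary cross-entropy classification term, a cosine-distance alignment term between the global text and image features, and a hypothetical weighting parameter `align_weight`; all function names are illustrative, and the feature vectors are assumed to be non-zero. Whether the paper applies the alignment term to every sample is also not stated; the sketch applies it unconditionally.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def binary_cross_entropy(p_fake, label):
    """Classification term (assumed form): label is 1 for fake, 0 for real."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p_fake, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def alignment_loss(text_global, image_global):
    """Alignment term (assumed form): cosine distance between global features."""
    return 1.0 - cosine_similarity(text_global, image_global)


def joint_loss(p_fake, label, text_global, image_global, align_weight=0.5):
    """Weighted sum of the two terms; align_weight is a hypothetical hyperparameter."""
    cls = binary_cross_entropy(p_fake, label)
    align = alignment_loss(text_global, image_global)
    return cls + align_weight * align
```

For example, `joint_loss(0.5, 1, [1.0, 0.0], [1.0, 0.0])` evaluates to ln 2 (about 0.693): the two feature vectors are identical, so the alignment term is zero and only the classification term contributes.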