引入单模态监督对比学习的多视图讽刺检测

张政 刘金硕 邓娟 王丽娜

Computer Engineering and Applications (计算机工程与应用), 2025, Vol. 61, Issue 19: 118-126 (9 pages). DOI: 10.3778/j.issn.1002-8331.2407-0007

Multi-View Sarcasm Detection with Uni-Modal Supervised Contrastive Learning

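Author information

  • 1. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China (affiliation of all four authors)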

Abstract

The rapid growth of image and text data on social media has drawn increasing interest in multimodal sarcasm detection. However, existing detection methods based on feature fusion have three shortcomings: first, most lack the underlying modality alignment capability that multimodal detection requires; second, the modality fusion process overlooks the dynamic relationships between modalities; and third, they fail to fully exploit modality complementarity. To address these issues, a detection model based on uni-modal supervised contrastive learning, multimodal fusion, and multi-view aggregation prediction is proposed. First, the CLIP (contrastive language-image pre-training) model is used as the encoder to improve the alignment of image and text encodings. Second, uni-modal supervised contrastive learning is incorporated so that uni-modal predictions guide the dynamic relationships between modalities. Next, a global-local cross-modal fusion method is designed, in which the semantic-level representations of each modality serve as global multimodal context that interacts with local uni-modal features. This interaction is carried out through multiple cross-modal fusion layers to strengthen the fusion effect, and it reduces the time and space costs of previous local-local cross-modal fusion methods. Finally, a multi-view aggregation prediction method is employed to fully leverage the complementarity of the image, text, and image-text views. The model effectively captures the cross-modal semantic inconsistencies in multimodal sarcasm data and outperforms the best existing method, DMSD-CL, on the public dataset MSD.
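
The abstract names uni-modal supervised contrastive learning but does not give the loss itself. As a rough illustration only, the sketch below implements the widely used generic supervised contrastive loss for the features of a single modality: samples with the same label are pulled together and all others are pushed apart. The function name, the temperature value, and the toy vectors are assumptions made for this example, and the paper's actual formulation may differ.

```python
import math

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Mean supervised contrastive loss over anchors with at least one positive.

    features: list of equal-length float vectors (one per sample, one modality).
    labels:   list of class labels (e.g. 1 = sarcastic, 0 = not sarcastic).
    The name and the default temperature are illustrative assumptions.
    """
    # L2-normalise so that dot products are cosine similarities.
    normed = []
    for vec in features:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # this anchor has no same-label partner in the batch
        # Temperature-scaled similarity to every other sample in the batch.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n) if j != i
        }
        # Log of the softmax denominator, computed in a numerically stable way.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        # Average negative log-probability assigned to the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: labels 0/1, two-dimensional "text" features.
labels = [0, 0, 1, 1]
clustered = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # classes separated
mixed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]      # classes interleaved
loss_clustered = supervised_contrastive_loss(clustered, labels)
loss_mixed = supervised_contrastive_loss(mixed, labels)
```

In this toy batch the loss is lower when same-label features coincide than when they are interleaved, which is the behaviour a uni-modal contrastive objective is meant to encourage. In a model like the one described, the same loss would presumably be applied separately to the text features and to the image features.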

Key words

sarcasm detection / multimodal / contrastive learning / cross-modal fusion

Classification

Information Technology and Security Science

Cite this article

张政, 刘金硕, 邓娟, 王丽娜. 引入单模态监督对比学习的多视图讽刺检测[J]. 计算机工程与应用, 2025, 61(19): 118-126.

Funding

National Natural Science Foundation of China (U193607)

National Key Research and Development Program of China (2020YFA0607902)

Computer Engineering and Applications (计算机工程与应用)

Open access (OA); Peking University Core Journal (北大核心)

ISSN 1002-8331
