网络与信息安全学报 (Chinese Journal of Network and Information Security), 2024, Vol. 10, Issue 3: 107-116. DOI: 10.11959/j.issn.2096-109x.2024043
Contrastive meta-learning framework for few-shot cross-lingual text classification
Abstract
Many security risk control issues, such as public opinion analysis in international scenarios, can be framed as text classification problems, which are challenging because multiple languages are involved. Previous studies have demonstrated that the performance of few-shot text classification tasks can be enhanced through cross-lingual semantic knowledge transfer. However, cross-lingual text classification faces several challenges. First, it is difficult to obtain language-agnostic representations that transfer well across languages. Moreover, differences in grammatical structure and syntactic rules across languages lead to variations in text representation, making it hard to extract general semantic information. Additionally, the scarcity of labeled data severely constrains most existing methods: in many real-world scenarios only a small amount of labeled data is available, which sharply degrades their performance. Effective methods are therefore needed to accurately transfer knowledge in few-shot settings and improve the generalization ability of classification models. To tackle these challenges, a novel framework was proposed that integrates contrastive learning and meta-learning. Within the framework, contrastive learning was used to extract general, language-agnostic semantic information, while the rapid generalization capability of meta-learning was leveraged to improve knowledge transfer in few-shot settings. Furthermore, a task-based data augmentation method was proposed to further improve the performance of the framework in few-shot cross-lingual classification. Extensive experiments on two widely used multilingual text classification datasets show that the proposed method outperforms several strong baselines, indicating that it can be effectively applied in the field of risk control and security.
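The abstract describes the framework only at a high level. As a concrete illustration, the sketch below shows one hypothetical way to combine episodic meta-learning with a supervised contrastive loss for few-shot cross-lingual classification. It is not the authors' released implementation: it assumes prototypical-network-style episodes over pre-extracted multilingual sentence embeddings, and the encoder, loss weighting, and hyperparameters are illustrative assumptions only; the task-based data augmentation described in the paper is omitted.

```python
# Hypothetical sketch: episodic (prototypical) meta-learning plus a supervised
# contrastive term that pulls together same-class examples across languages.
# All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Encourage same-class (possibly cross-lingual) examples to have similar embeddings."""
    features = F.normalize(features, dim=-1)
    sim = features @ features.T / temperature                 # pairwise cosine similarities
    mask = labels.unsqueeze(0) == labels.unsqueeze(1)          # same-class (positive) pairs
    mask.fill_diagonal_(False)                                 # exclude self-pairs
    logits = sim - torch.eye(len(labels)) * 1e9                # drop self-similarity from denominator
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_counts = mask.sum(1).clamp(min=1)                      # avoid division by zero
    return -(log_prob * mask).sum(1).div(pos_counts).mean()


def prototypical_episode_loss(encoder, support_x, support_y, query_x, query_y, n_classes):
    """One meta-learning episode: build class prototypes from the support set,
    classify the query set by distance to prototypes, and add a contrastive term."""
    z_support = encoder(support_x)
    z_query = encoder(query_x)
    prototypes = torch.stack(
        [z_support[support_y == c].mean(0) for c in range(n_classes)]
    )
    logits = -torch.cdist(z_query, prototypes)                 # closer prototype => higher score
    ce = F.cross_entropy(logits, query_y)
    z_all = torch.cat([z_support, z_query])                    # contrastive term over the whole episode
    y_all = torch.cat([support_y, query_y])
    return ce + 0.5 * supervised_contrastive_loss(z_all, y_all)


# Toy usage: a linear "encoder" over pre-extracted 768-d multilingual sentence vectors
# (5-way, 2-shot support set with 15 query examples).
encoder = torch.nn.Linear(768, 128)
support_x, support_y = torch.randn(10, 768), torch.arange(5).repeat(2)
query_x, query_y = torch.randn(15, 768), torch.randint(0, 5, (15,))
loss = prototypical_episode_loss(encoder, support_x, support_y, query_x, query_y, n_classes=5)
loss.backward()
```

In practice the random vectors above would be replaced by embeddings from a multilingual encoder, and episodes would mix languages so that the contrastive term explicitly aligns representations across them.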
Keywords: cross-lingual text classification / meta-learning / contrastive learning / few-shot
Category: Information Technology and Security Science
Citation: 郭建铭, 赵彧然, 刘功申. Contrastive meta-learning framework for few-shot cross-lingual text classification[J]. 网络与信息安全学报, 2024, 10(3): 107-116.
Funding: The National Natural Science Foundation of China (U21B2020), Shanghai Science and Technology Plan (22511104400)