Computer Engineering and Applications, 2024, Vol. 60, Issue (17): 129-138. DOI: 10.3778/j.issn.1002-8331.2306-0190
Fine-Tuning via Mask Language Model Enhanced Representations Based Contrastive Learning and Application
Abstract
Self-attention networks play an important role in Transformer-based language models, where the fully connected structure can capture non-continuous dependencies in a sequence in parallel. However, a fully connected self-attention network easily overfits to spurious association information, such as spurious associations between words, or between words and the predicted target. This overfitting problem limits the ability of language models to generalize to data outside the domain or the training distribution. To improve the robustness and generalization ability of Transformer language models against spurious associations, this paper proposes a fine-tuning framework via mask language model enhanced representations based on contrastive learning. Specifically, a text sequence and a randomly masked copy of it are fed into a twin (Siamese) network, and the model parameters are learned by combining a contrastive learning objective with the downstream task objective. Each branch of the twin network consists of a pre-trained language model and a task classifier. The fine-tuning framework is therefore more consistent with the masked language model pre-training regime and can preserve the generalization ability of pre-trained knowledge in downstream tasks. On the MNLI, FEVER, and QQP datasets and their challenge datasets, the proposed model is compared with the latest baseline models, including the large language models ChatGPT, GPT-4, and LLaMA. Experimental results show that the proposed model maintains in-distribution performance while improving out-of-distribution performance. Experimental results on the ATIS and Snips datasets demonstrate that the model is also effective on common natural language processing tasks.
Keywords
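The training scheme described in the abstract (a twin network fed the original sequence and a randomly masked copy, optimized with a contrastive objective alongside the downstream task loss) can be sketched as follows. This is a minimal illustration, not the paper's implementation: a mean-pooled embedding table stands in for the pre-trained encoder, and the names `encode`, `random_mask`, and `contrastive_loss` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, DIM, MASK_ID = 100, 16, 0
emb = rng.normal(size=(VOCAB, DIM))  # stand-in for a pre-trained encoder

def encode(token_ids):
    """Mean-pooled, L2-normalized embedding as a stand-in sentence representation."""
    v = emb[token_ids].mean(axis=0)
    return v / np.linalg.norm(v)

def random_mask(token_ids, p=0.15):
    """Replace each token with MASK_ID with probability p (MLM-style masking)."""
    ids = token_ids.copy()
    ids[rng.random(len(ids)) < p] = MASK_ID
    return ids

def contrastive_loss(batch, temp=0.05):
    """InfoNCE-style loss: each sequence's masked view is its positive pair,
    and the masked views of the other sequences in the batch are negatives."""
    z1 = np.stack([encode(s) for s in batch])                # original views
    z2 = np.stack([encode(random_mask(s)) for s in batch])   # masked views
    sims = z1 @ z2.T / temp                                  # pairwise cosine similarities
    logits = sims - sims.max(axis=1, keepdims=True)          # numerically stable softmax
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(np.diag(probs)).mean()

batch = [rng.integers(1, VOCAB, size=12) for _ in range(4)]
loss = contrastive_loss(batch)
print(loss)
```

In the full framework, this contrastive term would be added to the downstream task's cross-entropy loss, and gradients would flow through the shared pre-trained encoder of both branches.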
Transformer / masked language model / contrastive learning / fine-tuning / spurious association / generalization ability

Classification
Information Technology and Security Science

Citation
张德驰, 万卫兵. Fine-Tuning via Mask Language Model Enhanced Representations Based Contrastive Learning and Application[J]. Computer Engineering and Applications, 2024, 60(17): 129-138.

Funding
Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0109300).