
Fine-Tuning via Mask Language Model Enhanced Representations Based Contrastive Learning and Application

张德驰 1, 万卫兵 1

Computer Engineering and Applications (计算机工程与应用), 2024, Vol. 60, Issue (17): 129-138 (10 pages). DOI: 10.3778/j.issn.1002-8331.2306-0190


Author Information

  • 1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 200000, China

Abstract

Self-attention networks play an important role in Transformer-based language models, where the fully connected structure can capture non-contiguous dependencies in a sequence in parallel. However, the fully connected self-attention network easily overfits to spurious association information, such as spurious associations between words, and between words and the prediction target. This overfitting limits the ability of language models to generalize to out-of-domain or out-of-distribution data. To improve the robustness and generalization ability of Transformer language models against spurious associations, this paper proposes a fine-tuning framework based on contrastive learning over masked-language-model enhanced representations. Specifically, a text sequence and its randomly masked counterpart are fed into a twin (Siamese) network, and the model parameters are learned by combining a contrastive learning objective with the downstream task objective. Each branch of the twin network consists of a pre-trained language model and a task classifier. The fine-tuning framework is therefore more consistent with the masked language model pre-training paradigm and can better preserve the generalization ability of pre-trained knowledge in downstream tasks. The proposed model is compared with the latest baseline models, including the large language models ChatGPT, GPT-4, and LLaMA, on the MNLI, FEVER, and QQP datasets and their challenge sets. Experimental results show that the proposed model maintains in-distribution performance while improving out-of-distribution performance. Results on the ATIS and Snips datasets show that the model is also effective on common natural language processing tasks.
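The twin-network fine-tuning scheme described in the abstract can be illustrated with a short PyTorch/transformers sketch. This is not the authors' code: the backbone name (bert-base-uncased), the masking rate, the temperature, and the loss weight alpha are illustrative assumptions, and the exact form of the paper's contrastive objective may differ.

```python
# Illustrative sketch: a weight-shared "twin" encoder receives a sentence and a
# randomly masked copy of it; training combines the downstream classification
# loss with an InfoNCE-style contrastive loss that pulls the two views of the
# same sentence together while pushing apart other sentences in the batch.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer


def random_mask(input_ids, tokenizer, prob=0.15):
    """Mask a random subset of non-special tokens in one tokenized sequence (rate is an assumption)."""
    ids = input_ids.clone()
    special = torch.tensor(
        tokenizer.get_special_tokens_mask(ids.tolist(), already_has_special_tokens=True),
        dtype=torch.bool,
    )
    choose = (torch.rand(ids.shape) < prob) & ~special
    ids[choose] = tokenizer.mask_token_id
    return ids


class TwinMLMContrastiveFineTuner(torch.nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_labels=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)   # shared by both branches
        self.classifier = torch.nn.Linear(self.encoder.config.hidden_size, num_labels)

    def encode(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return out.last_hidden_state[:, 0]                      # [CLS] representation

    def forward(self, ids, attn, masked_ids, labels, temperature=0.05, alpha=0.1):
        z_orig = self.encode(ids, attn)          # view 1: original sequence
        z_mask = self.encode(masked_ids, attn)   # view 2: randomly masked sequence
        # Downstream task objective, computed here on the original view.
        task_loss = F.cross_entropy(self.classifier(z_orig), labels)
        # Contrastive objective: each sample's masked view is its positive,
        # the other samples in the batch serve as negatives.
        sim = F.cosine_similarity(z_orig.unsqueeze(1), z_mask.unsqueeze(0), dim=-1) / temperature
        targets = torch.arange(z_orig.size(0), device=z_orig.device)
        contrastive_loss = F.cross_entropy(sim, targets)
        return task_loss + alpha * contrastive_loss
```

In this sketch a batch is built by tokenizing each sentence once, applying random_mask to produce the second view, and feeding both views through the shared encoder; whether the task loss is taken on the original view, the masked view, or both is a design choice that the paper itself specifies.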

Keywords

Transformer / masked language model / contrastive learning / fine-tuning / spurious association / generalization ability

Classification

Information Technology and Security Science

Cite This Article

张德驰, 万卫兵. Fine-Tuning via Mask Language Model Enhanced Representations Based Contrastive Learning and Application[J]. Computer Engineering and Applications, 2024, 60(17): 129-138.

Funding

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0109300)

Computer Engineering and Applications (计算机工程与应用)

OA | 北大核心 (Peking University Core Journal) | CSTPCD

ISSN 1002-8331
