Research on Calculation of Semantic Similarity of Chinese Short Text Based on Roberta

Zhang Xiaoyan (张小艳), Li Wei (李薇)

Computer Applications and Software, 2024, Vol. 41, Issue 8: 275-281, 366, 8. DOI: 10.3969/j.issn.1000-386x.2024.08.040

RESEARCH ON CALCULATION OF SEMANTIC SIMILARITY OF CHINESE SHORT TEXT BASED ON ROBERTA

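Zhang Xiaoyan (1), Li Wei (1)

Author information

  • 1. College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an, Shaanxi 710600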

Abstract

To address the insufficient feature-extraction ability of traditional text semantic similarity models based on the Siamese network, this paper proposes SRoberta-SelfAtt, a model that fuses the Siamese network with the RoBERTa pre-training model. Within the Siamese architecture, the RoBERTa (a Robustly Optimized BERT Pretraining Approach) pre-training model encodes each original text pair into character-level vectors, and a self-attention mechanism captures the associations between different words in the text. A pooling strategy then produces the sentence vector of each text, and the resulting representations are interacted and merged. The loss is computed at the fully connected layer to evaluate the semantic similarity of the text pair. The model was tested on three data sets under two types of tasks. The results show that the proposed model improves on the other models compared, providing an effective basis for further research on optimizing the accuracy of text semantic similarity calculation.
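
The pipeline described in the abstract (shared encoder, self-attention, pooling, interaction and merging, fully connected layer) can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: `embed_chars` is a toy stand-in for the RoBERTa encoder, the self-attention uses identity projections, mean pooling is one possible pooling strategy, the merge features (concatenation, absolute difference, element-wise product) are one common choice, and the weights are untrained. Inputs are assumed to be non-empty strings.

```python
import math
import random

DIM = 8  # toy embedding width (assumption); a real encoder emits far wider vectors


def embed_chars(text, dim=DIM):
    """Hypothetical stand-in for the RoBERTa encoder: one vector per character."""
    vectors = []
    for ch in text:
        rng = random.Random(ord(ch))  # same character always maps to the same vector
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return vectors


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    scale = math.sqrt(len(vectors[0]))
    attended = []
    for query in vectors:
        weights = softmax([sum(q * k for q, k in zip(query, key)) / scale
                           for key in vectors])
        attended.append([sum(w * value[i] for w, value in zip(weights, vectors))
                         for i in range(len(query))])
    return attended


def mean_pool(vectors):
    """Average the token vectors into a single sentence vector."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def encode(text):
    """Shared branch of the Siamese network: both texts pass through it."""
    return mean_pool(self_attention(embed_chars(text)))


def merge(u, v):
    """Interaction features: [u, v, |u - v|, u * v] (an assumed merge scheme)."""
    return (u + v
            + [abs(a - b) for a, b in zip(u, v)]
            + [a * b for a, b in zip(u, v)])


def similarity_score(text_a, text_b, weights, bias=0.0):
    """One fully connected unit followed by a sigmoid, giving a score in (0, 1)."""
    features = merge(encode(text_a), encode(text_b))
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Untrained illustrative weights; a real model would learn them from labelled pairs.
_rng = random.Random(0)
weights = [_rng.uniform(-0.1, 0.1) for _ in range(4 * DIM)]
```

Calling `similarity_score("今天天气很好", "今天天气不错", weights)` returns a value between 0 and 1. Because the weights are untrained, the number carries no meaning; the sketch only shows how the two texts share one encoding branch before their sentence vectors are merged and scored.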

Key words

Siamese network/Roberta/Self-attention/Chinese short text/Semantic similarity calculation

Classification

Information Technology and Security Science

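Cite this article

Zhang Xiaoyan, Li Wei. Research on calculation of semantic similarity of Chinese short text based on Roberta[J]. Computer Applications and Software, 2024, 41(8): 275-281, 366, 8.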

Funding

Young Scientists Fund of the National Natural Science Foundation of China (61702408).

Computer Applications and Software

OA / Peking University Core Journal / CSTPCD

ISSN 1000-386X
