
General Text Matching Method Based on Topic Model

Computer Applications and Software (计算机应用与软件), 2024, Vol. 41, Issue 5: 310-318, 349, 10. DOI: 10.3969/j.issn.1000-386x.2024.05.045

GENERAL TEXT MATCHING BASED ON TOPIC MODEL

Huang Zhenye (黄振业) 1, Mo Ganqing (莫淦清) 1, Yu Keman (余可曼) 2

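Author information

  • 1. School of Information Technology, Zhejiang Financial College (浙江金融职业学院), Hangzhou, Zhejiang 310018
  • 2. Hangzhou Pingzhi Information Technology Co., Ltd. (杭州平治信息技术股份有限公司), Hangzhou, Zhejiang 310030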

Abstract

Similarity measurement between a long text and a short text has a growing number of application scenarios, and judging the consistency of such text pairs can be abstracted as a text similarity comparison problem. The challenge is that short texts are sparse: it is difficult to determine which domain a short text belongs to, and it is also difficult to introduce word embeddings to solve specific text matching problems in general scenarios. To address this problem, this paper proposes a lightweight approach based on a topic model with text clustering, which can match general long-short text pairs without using additional background knowledge. Experimental results on two typical test datasets show that the proposed method is highly efficient at detecting text similarity.

Key words

Natural language processing/Text matching/Topic model/Gibbs sampling
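
The abstract and key words name a topic model fitted by Gibbs sampling as the basis for long-short text matching, but this record does not describe the model itself. As a rough illustration of the general idea only, the sketch below uses standard LDA with collapsed Gibbs sampling as a stand-in, infers a topic mixture for a long text and a short text, and compares the two mixtures by cosine similarity. The function names (`fit_lda`, `infer_theta`, `cosine`), the hyperparameters, and the toy corpus are all assumptions made for this example; the paper's text clustering step is not reproduced here.

```python
import math
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit plain LDA by collapsed Gibbs sampling (illustrative stand-in model)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]    # tokens of doc d assigned to topic k
    topic_word = [dict() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                  # total tokens in topic k
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(num_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] = topic_word[k].get(w, 0) + 1
            topic_total[k] += 1
        assign.append(row)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional distribution over topics for this token (unnormalized).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t].get(w, 0) + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] = topic_word[k].get(w, 0) + 1
                topic_total[k] += 1
    return vocab, topic_word, topic_total


def infer_theta(doc, model, num_topics, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Estimate the topic mixture of a text while keeping topic-word counts fixed."""
    vocab, topic_word, topic_total = model
    vocab_size = len(vocab)
    known = set(vocab)
    rng = random.Random(seed)
    tokens = [w for w in doc if w in known]  # out-of-vocabulary words are skipped
    counts = [0] * num_topics
    assign = []
    for _ in tokens:
        k = rng.randrange(num_topics)
        assign.append(k)
        counts[k] += 1
    for _ in range(iters):
        for i, w in enumerate(tokens):
            counts[assign[i]] -= 1
            weights = [
                (counts[t] + alpha)
                * (topic_word[t].get(w, 0) + beta)
                / (topic_total[t] + vocab_size * beta)
                for t in range(num_topics)
            ]
            assign[i] = rng.choices(range(num_topics), weights=weights)[0]
            counts[assign[i]] += 1
    total = len(tokens) + num_topics * alpha
    return [(c + alpha) / total for c in counts]


def cosine(p, q):
    """Cosine similarity between two topic mixtures."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))


# Toy corpus, invented for illustration only.
corpus = [
    "bank loan interest rate credit bank loan deposit".split(),
    "loan credit interest bank deposit rate mortgage".split(),
    "football match goal team player league goal".split(),
    "team player coach match league football season".split(),
]
model = fit_lda(corpus, num_topics=2)
long_theta = infer_theta(corpus[0], model, num_topics=2)
short_theta = infer_theta("loan rate".split(), model, num_topics=2)
score = cosine(long_theta, short_theta)
```

Comparing texts in topic space rather than word space is one common way to soften the sparsity of short texts, since a two-word query and a long document rarely share many surface tokens. Whether the paper scores pairs this way is not stated in the abstract.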

Classification

Information technology and security science

Cite this article

Huang Zhenye, Mo Ganqing, Yu Keman. General text matching method based on topic model [J]. Computer Applications and Software, 2024, 41(5): 310-318, 349, 10.

Computer Applications and Software (计算机应用与软件)

OA, PKU Core (北大核心), CSTPCD

ISSN 1000-386X
