Topic Matching Algorithm Based on Multi-feature Fusion of Key Entities and Text Abstracts

Ji Ke, Zhang Xiu, Ma Kun, Sun Runyuan, Chen Zhenxiang, Wu Jun

Journal of Zhengzhou University (Engineering Science), 2024, Vol. 45, Issue 2: 51-59, 9. DOI: 10.13705/j.issn.1671-6833.2024.02.008

Topic Matching Algorithm Based on Multi-feature Fusion of Key Entities and Text Abstracts

Ji Ke¹, Zhang Xiu¹, Ma Kun¹, Sun Runyuan¹, Chen Zhenxiang¹, Wu Jun²

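Author information

  • 1. School of Information Science and Engineering, University of Jinan, Jinan 250022, Shandong, China || Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, Shandong, China
  • 2. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China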

Abstract

With the rapid spread of the Internet, the volume of online news has grown dramatically, and finding reports that closely match a specific topic has become an urgent problem. To address this issue, a topic matching algorithm based on the fusion of key entities and text abstracts was proposed in this study. First, the W2NER model was used for named entity recognition, and key entities were extracted using features such as word frequency, TF-IDF, lexical cohesion, word-word similarity, and word-sentence similarity. Second, the Pegasus model was used for text summarization, and a BiLSTM combined the key entity features with the text summary features to obtain deep semantic features of the news texts. Next, a cross-attention mechanism performed feature interaction between the news articles being matched to strengthen the interaction between them. Finally, the deep semantic features of the news texts and the text interaction features were fused to determine whether the texts match in topic. Comparative experiments were conducted on real data from Sohu, and the results showed that the proposed algorithm achieved accuracy and precision similar to those of other algorithms, while recall and F1 score were improved.
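As a rough illustration of one ingredient of the key entity extraction step, the sketch below ranks candidate entities by TF-IDF alone. It is not the authors' implementation: the function names, the smoothed IDF formula, the toy corpus, and the use of a single feature are all assumptions made for the example. The abstract describes combining TF-IDF with word frequency, lexical cohesion, and similarity features, and does not state how they are weighted.

```python
import math
from collections import Counter


def tf_idf_scores(doc_tokens, corpus):
    """Score each distinct token of doc_tokens by TF-IDF against corpus.

    doc_tokens: list of tokens of the target document.
    corpus: list of token lists (may include the target document).
    """
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, count in counts.items():
        tf = count / total
        df = sum(1 for doc in corpus if term in doc)
        # Smoothed IDF (an assumed variant) keeps scores finite and positive.
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = tf * idf
    return scores


def rank_key_entities(candidates, doc_tokens, corpus, top_k=3):
    """Rank candidate entities (e.g. NER output) by TF-IDF, highest first."""
    scores = tf_idf_scores(doc_tokens, corpus)
    ranked = sorted(candidates, key=lambda e: scores.get(e, 0.0), reverse=True)
    return ranked[:top_k]


# Hypothetical toy corpus of tokenized news snippets.
corpus = [
    ["sohu", "reports", "flood", "in", "henan", "flood", "relief"],
    ["sohu", "reports", "football", "match", "results"],
    ["sohu", "covers", "stock", "market", "news"],
]
candidates = ["sohu", "flood", "henan"]  # stand-in for NER output
top = rank_key_entities(candidates, corpus[0], corpus, top_k=2)
print(top)
```

In this toy example "sohu" appears in every document and is therefore ranked below "flood" and "henan", which is the behaviour one would want from a feature meant to separate topic-specific entities from ubiquitous ones.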

Key words

topic matching/key entity/text summary/text matching/information retrieval

Classification

Information Technology and Security Science

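Cite this article

Ji Ke, Zhang Xiu, Ma Kun, Sun Runyuan, Chen Zhenxiang, Wu Jun. Topic Matching Algorithm Based on Multi-feature Fusion of Key Entities and Text Abstracts[J]. Journal of Zhengzhou University (Engineering Science), 2024, 45(2): 51-59, 9.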

Funding

National Natural Science Foundation of China (61702216, 61772231)

Major Scientific and Technological Innovation Project of Shandong Province (2021CXGC010103)

Journal of Zhengzhou University (Engineering Science)

OA / Peking University Core Journal / CSTPCD

ISSN 1671-6833
