
A Contrastive Clustering Method of News Events Incorporating Event Knowledge

Open access; indexed in the Peking University Core Journals list (北大核心) and CSTPCD


News event clustering aims to mine event clusters on different topics from massive collections of news text. Most existing event clustering methods rely on semantic text representations and ignore the guiding role of event knowledge. Because representation learning and clustering are performed in alternating iterations, these methods are prone to error accumulation and can only handle offline tasks, which limits their use on real-time news data. To address these problems, this paper proposes a contrastive clustering method for news events that incorporates event knowledge. On top of the text representation, the method incorporates key event information to enrich the event representation. Cluster labels are treated as representations, and contrastive learning is performed at both the instance level and the cluster level. Representations and cluster assignments are learned jointly in an end-to-end manner, which enables clustering of data streams. Experimental results show that the proposed method improves on the baseline models by 3%.
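The idea of contrasting at both the instance level and the cluster level can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoder, the augmentations, the loss, or the temperatures, so everything below is an assumption. The sketch applies a generic NT-Xent contrastive loss to the rows of a feature matrix (instances) and to the columns of a soft cluster-assignment matrix (clusters). The names `nt_xent`, `dual_contrastive_loss`, `t_inst` and `t_clu` are hypothetical, and regularizers sometimes used to avoid collapsed cluster assignments are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def nt_xent(view_a, view_b, temperature=0.5):
    """Mean NT-Xent loss; view_a[i] and view_b[i] form a positive pair.

    All other vectors in both views act as negatives for a given anchor.
    """
    vectors = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i, anchor in enumerate(vectors):
        positive = (i + n) % (2 * n)  # index of the same item in the other view
        logits = [cosine(anchor, other) / temperature
                  for j, other in enumerate(vectors) if j != i]
        pos_logit = cosine(anchor, vectors[positive]) / temperature
        # Numerically stable log-sum-exp over all non-self similarities.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


def transpose(rows):
    """Turn an N x K assignment matrix into K cluster vectors of length N."""
    return [list(col) for col in zip(*rows)]


def dual_contrastive_loss(feat_a, feat_b, assign_a, assign_b,
                          t_inst=0.5, t_clu=1.0):
    """Sum of instance-level and cluster-level losses (assumed equal weights)."""
    instance_loss = nt_xent(feat_a, feat_b, t_inst)
    # Each column of the soft assignment matrix is treated as a cluster's
    # representation, so clusters are contrasted the same way instances are.
    cluster_loss = nt_xent(transpose(assign_a), transpose(assign_b), t_clu)
    return instance_loss + cluster_loss
```

In this sketch, `feat_a` and `feat_b` would hold representations of two views of the same batch of news items, and `assign_a` and `assign_b` the corresponding soft cluster assignments. Because the loss depends only on the current batch, it can in principle be minimized on streaming mini-batches, which matches the end-to-end setting described in the abstract.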

Liang Chen (梁晨); Yu Zhengtao (余正涛); Gao Shengxiang (高盛祥); Zhu Enchang (朱恩昌)

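Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, Yunnan, China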

Computer Science and Automation

event clustering; event representation; contrastive learning; deep clustering

Journal of Shanxi University (Natural Science Edition), 2024 (004)

pp. 727-736 (10 pages)

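National Natural Science Foundation of China (61972186; 61732005; U21B2027); Yunnan High-Tech Industry Development Project (201606); Yunnan Major Science and Technology Special Program (202103AA080015; 202002AD080001-5); Yunnan Fundamental Research Program (202001AS070014); Yunnan Reserve Talents Program for Academic and Technical Leaders (202105AC160018)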

10.13451/j.sxu.ns.2023065
