Language model-enhanced edge factor mining in citation network

语言模型增强的引文网络连边因子挖掘

王慜懋 1, 赵洪科 1, 吴李康 1, 焦之贤 2, 黄振亚 3

大数据 (Big Data Research), 2025, Vol. 11, Issue 2: 91-106, 16. DOI: 10.11959/j.issn.2096-0271.2025025

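Author information

  • 1. College of Management and Economics, Tianjin University, Tianjin 300072; Laboratory of Complex Management Systems, Tianjin University, Tianjin 300072
  • 2. Guanghua School of Management, Peking University, Beijing 100871
  • 3. State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, Anhui 230026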

Abstract

Graph neural networks (GNNs) are adept at aggregating information from neighboring nodes in graph-structured data to learn node representations, and they show great potential for citation network data mining. However, most existing GNNs do not explore in depth the factors that drive edge information, which limits the understanding and interpretation of complex relationships between nodes. For instance, citation relationships between papers are often driven by a variety of research topics. Although some work has enriched node and edge feature representations by integrating large language models (LLMs), which have strong text understanding capabilities, these approaches have still not effectively solved the problem of uncovering the underlying drivers of edge information. To address this, a framework called language model-enhanced edge factor mining was proposed. It aims to make edge relationship modeling in various GNNs more discriminative through a plug-in approach, and its application value was explored in citation network link prediction. Coarse-grained factor mining extracted explicit, category-related edge factors from citation network graphs containing documents by designing structured information prompts for LLMs. Fine-grained factor mining used the K-means clustering algorithm to capture more detailed, semantic topic-level edge factors from the graph's textual data. To verify the effectiveness of the proposed framework, experiments were conducted on three public datasets. The results demonstrate that the framework offers a significant advantage in improving the performance of various GNN models.
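
As a rough illustration of the fine-grained step, the sketch below clusters edge representations with K-means and treats each cluster index as a topic-level edge factor. It is not the authors' implementation: the toy 2-D "text embeddings", the paper IDs, the choice to represent an edge by the mean of its endpoint embeddings, and the helper name `kmeans` are all assumptions made for this example.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means over lists of floats; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Hypothetical toy data: 2-D "text embeddings" for five papers.
node_emb = {
    "p1": [0.9, 0.1], "p2": [1.0, 0.0], "p3": [0.8, 0.2],
    "p4": [0.1, 0.9], "p5": [0.0, 1.0],
}
edges = [("p1", "p2"), ("p2", "p3"), ("p4", "p5"), ("p1", "p3")]

# Assumption: an edge is represented by the mean of its endpoint embeddings.
edge_emb = [
    [(a + b) / 2 for a, b in zip(node_emb[u], node_emb[v])]
    for u, v in edges
]

# Each cluster index serves as a fine-grained factor label for the edge.
edge_factor, _ = kmeans(edge_emb, k=2)
for (u, v), f in zip(edges, edge_factor):
    print(f"{u} -> {v}: factor {f}")
```

In this toy graph the three citations among p1, p2, and p3 end up sharing one factor while the p4 to p5 citation gets the other. In a real pipeline the embeddings would come from a language model, and the factor labels would be supplied to a GNN as additional edge information.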

Keywords

large language model / graph neural network / edge factor mining / link prediction / citation network

Classification

Computer science and automation

Cite this article

王慜懋, 赵洪科, 吴李康, 焦之贤, 黄振亚. Language model-enhanced edge factor mining in citation network [J]. 大数据 (Big Data Research), 2025, 11(2): 91-106, 16.

Funding

The National Natural Science Foundation of China (No. 72101176)

大数据 (Big Data Research)

ISSN 2096-0271
