网络安全与数据治理 (Cyber Security and Data Governance), 2025, Vol. 44, Issue (5): 21-28, 8. DOI: 10.19358/j.issn.2097-1788.2025.05.004
面向新闻的长文本事件抽取方法
A method for event extraction from lengthy news texts
武剑涛 1, 李俊达 1, 李佰文 1, 淮晓永 1
Author information
- 1. National Computer System Engineering Research Institute of China (华北计算机系统工程研究所), Beijing 100083
Abstract
Event extraction, which aims to identify event information in unstructured text and represent it in structured form, underpins knowledge graph construction and public opinion analysis. Lengthy news texts pose three challenges: multiple events coexist in one document, narrative structures are complex, and existing models limit input length. To address these challenges, this paper proposes a hierarchical event extraction framework designed for news narratives. The framework makes three contributions: (1) a semantic boundary segmentation algorithm that optimizes paragraph segmentation to minimize the fragmentation of event elements across paragraphs; (2) machine reading comprehension (MRC) for localized extraction of event elements; and (3) a cross-chunk event fusion algorithm that semantically integrates event components distributed across chunks. Experimental evaluations show that the framework adapts effectively to the structural characteristics of news texts, consistently extracts critical information in multi-event scenarios, and offers a practical technical solution for public opinion monitoring and knowledge graph construction.
Keywords
event extraction / machine reading comprehension / semantic chunking
Classification
Computer and automation
Cite this article
武剑涛, 李俊达, 李佰文, 淮晓永. 面向新闻的长文本事件抽取方法[J]. 网络安全与数据治理, 2025, 44(5): 21-28, 8.
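The three stages named in the abstract (segmentation, per-chunk extraction, cross-chunk fusion) can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it is hypothetical. It rests on three assumptions: paragraph breaks stand in for the paper's semantic boundaries, a toy regular-expression function (`toy_answer`) stands in for a trained MRC model, and fusion simply merges arguments of events that share an event type.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Event:
    event_type: str
    arguments: dict = field(default_factory=dict)  # role -> set of answer strings


def split_into_chunks(text, max_chars=70):
    """Greedily pack whole paragraphs into chunks of at most max_chars.

    Assumption: paragraph breaks approximate semantic boundaries. A single
    paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n") if p.strip()):
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


# Hypothetical role questions, in the style of MRC-based extraction.
ROLE_QUESTIONS = {
    "time": "When did the event happen?",
    "place": "Where did the event happen?",
}


def toy_answer(question, context):
    """Placeholder for an MRC model: returns a span from context or None."""
    pattern = r"\bon (\w+day)\b" if question.startswith("When") else r"\bin ([A-Z]\w+)"
    match = re.search(pattern, context)
    return match.group(1) if match else None


def extract_events(chunk, triggers, answer_fn=toy_answer):
    """Ask one question per role for every event type whose trigger occurs."""
    events = []
    for event_type, trigger in triggers.items():
        if trigger not in chunk:
            continue
        event = Event(event_type)
        for role, question in ROLE_QUESTIONS.items():
            answer = answer_fn(question, chunk)
            if answer:
                event.arguments[role] = {answer}
        events.append(event)
    return events


def fuse_events(per_chunk_events):
    """Merge events of the same type across chunks by uniting their arguments."""
    merged = {}
    for events in per_chunk_events:
        for ev in events:
            target = merged.setdefault(ev.event_type, Event(ev.event_type))
            for role, values in ev.arguments.items():
                target.arguments.setdefault(role, set()).update(values)
    return list(merged.values())


def extract_from_long_text(text, triggers, max_chars=70):
    chunks = split_into_chunks(text, max_chars)
    return fuse_events([extract_events(c, triggers) for c in chunks])
```

For example, calling `extract_from_long_text` on a three-paragraph text where the time appears in the first paragraph and the place in the last yields one fused event carrying both arguments, even though no single chunk contains both. A real system would need a stronger fusion criterion than a shared event type, since two distinct events of the same type can occur in one article.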