北京大学学报(自然科学版) (Acta Scientiarum Naturalium Universitatis Pekinensis), 2025, Vol. 61, Issue 5: 860-868 (9 pages). DOI: 10.13209/j.0479-8023.2024.125
Entity Relation Extraction Based on Span Representation for Tibetan Medicine Literature
Abstract
Because Tibetan medical terminology is highly specialized, text resources are scarce, and the language is complex to process, traditional entity relation extraction methods are difficult to apply directly to Tibetan medicine. This paper proposes a span-representation-based method for extracting entity relations from Tibetan medical literature. Text is encoded with span representations and the TibetanAI_ALBERT_v2.0 pre-trained language model, and candidate entity spans are enumerated to address the insufficient recognition of nested entities. In addition, KL divergence is introduced to constrain the inconsistency of the model between the training and inference stages. Experimental results on TibetanAI_TMDisRE_v1.0, an entity relation extraction dataset for Tibetan medicine, show that the proposed method achieves a significant performance improvement, with accuracy, recall, and F1 reaching 84.85%, 77.35%, and 80.81%, respectively.
Key words
Tibetan / Tibetan medicine / entity relation extraction / joint extraction
Cite this article
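周青, 拥措, 拉毛东只, 尼玛扎西. Entity Relation Extraction Based on Span Representation for Tibetan Medicine Literature [J]. 北京大学学报(自然科学版), 2025, 61(5): 860-868.
Funding
Supported by the National Natural Science Foundation of China (62566060), the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0116100), a project of the Science and Technology Department of the Xizang Autonomous Region (XZ202401JD0010), and the Lhasa Key Science and Technology Program (LAKJ202526).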
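Illustrative sketch
The abstract names two mechanisms: enumerating candidate spans so that nested entities can be recognized, and using KL divergence to constrain the inconsistency between training and inference. The Python sketch below only illustrates these two ideas and is not the authors' implementation. The function names, the `max_width` limit, and the symmetric (bidirectional) form of the KL term computed over two predicted distributions are all assumptions, since the abstract does not specify them.

```python
import math


def enumerate_spans(tokens, max_width):
    """Return every (start, end) span, end inclusive, of at most max_width tokens.

    Overlapping and nested spans are all kept, so an entity contained inside
    a longer entity still appears as its own candidate.
    """
    spans = []
    for start in range(len(tokens)):
        for end in range(start, min(start + max_width, len(tokens))):
            spans.append((start, end))
    return spans


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p).

    Assumption: p and q are the label distributions predicted for the same
    span by two stochastic forward passes (for example, with dropout active).
    Adding this term to the training loss penalizes disagreement between them.
    """
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


# Hypothetical three-token input; the token strings are placeholders.
tokens = ["t1", "t2", "t3"]
candidate_spans = enumerate_spans(tokens, max_width=2)
consistency_penalty = symmetric_kl([0.9, 0.1], [0.5, 0.5])
```

With `max_width=2`, the three-token example yields five candidate spans, including the span `(0, 1)` and the spans `(0, 0)` and `(1, 1)` nested inside it. The penalty is zero when the two distributions agree and grows as they diverge.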