Journal of China Agricultural University, 2025, Vol. 30, Issue 5: 214-229, 16. DOI: 10.11841/j.issn.1007-4333.2025.05.18
面向农药登记信息的实体关系联合抽取研究
Research on joint extraction of entities and relationships for pesticide registration information
Abstract
In response to the characteristics of pesticide registration texts (high information density, complex logical structure, large spans between entities, and heterogeneous entity lengths), and to address the triplet overlap, exposure bias, and redundant computation encountered in traditional joint extraction methods, this study proposes a multi-feature fusion single-stage entity and relation joint extraction model (MF-SERel). First, at the encoding layer, the model's ability to represent complex corpora is enhanced by integrating semantic and syntactic features and enriching the meaning of character symbols. Second, at the multi-dimensional labeling framework layer, an HT-BES multi-dimensional labeling strategy is proposed to resolve the triplet overlap issue. Through parallel scoring functions and fine-grained classification components, entity-relation joint extraction is transformed into a multi-label labeling task along the relation dimension. This process involves no interdependent steps, which enables single-stage parallel labeling, avoids exposure bias, and reduces computational redundancy. Finally, at the decoding layer, labels are predicted based on fine-grained classification, and the entity-relation triplets are decoded. The proposed model is compared with baseline models such as GraphRel, CasRel, and TPLinker, and evaluated on the pesticide registration dataset (PRD) and the public dataset of unstructured information extraction (DuIE). The results show that the MF-SERel model performs well on both PRD and DuIE. On PRD, MF-SERel improves inference speed by 20% and F1 score by 2.3%, demonstrating its strong knowledge-mining capability on pesticide registration texts. On DuIE, MF-SERel improves inference speed by 54% and F1 score by 1.7%, which demonstrates its strong generalization ability. In conclusion, the MF-SERel model proposed in this study provides a new method for the structured extraction of knowledge in the pesticide field.
Keywords
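pesticides / entity-relation extraction / labeling strategy / dependency parsing / graph neural network
Classification
Information technology and security science
Cite this article
车银超, 董晨阳, 熊淑萍, 王顺, 马新明, 席磊. Research on joint extraction of entities and relationships for pesticide registration information [J]. Journal of China Agricultural University, 2025, 30(5): 214-229, 16.
Funding
Henan Province Key R&D Special Project (241111111500)
National Natural Science Foundation of China (32372239)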