Transactions of the Chinese Society of Agricultural Engineering (农业工程学报), 2024, Vol. 40, Issue 3: 201-210, 10. DOI: 10.11975/j.issn.1002-6819.202306124

Multi-feature fusion named entity recognition method for grape knowledge graph construction

Nie Xiaolin¹, Zhang Lilin¹, Niu Dangdang², Wu Huarui³, Zhu Huaji³, Zhang Hongming²

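Author information

  • 1. College of Information Engineering, Northwest A&F University, Yangling 712100
  • 2. College of Information Engineering, Northwest A&F University, Yangling 712100; Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling 712100
  • 3. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097; National Engineering Research Center for Information Technology in Agriculture, Beijing 100097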

Abstract

A domain knowledge graph stores structured, fine-grained data and models the real world as triples. It can organize dispersed knowledge effectively and has been widely used in healthcare, finance, and the Internet. Grapes are one of the most important economic fruit crops in agriculture. However, the grape domain contains a large amount of unstructured knowledge, which limits its use in downstream data-driven tasks, and knowledge graphs remain rare in the agricultural domain. Constructing a knowledge graph for the grape domain is therefore necessary, particularly for knowledge storage and sharing. When domain knowledge graphs are constructed, key information is often implicit in complex context. The character-vector semantic representations of existing named entity recognition (NER) models are relatively homogeneous, which leads to a low recognition rate for domain-specific entities and ultimately reduces the efficiency and quality of knowledge graph construction. In this study, a named entity recognition model was proposed that fuses Bidirectional Encoder Representations from Transformers (BERT) with a residual structure (RS). First, the raw text was mapped into character vectors using BERT: the input sentences were embedded using token, segment, and position embeddings, and a multi-head attention mechanism then calculated the correlation between the current character and the other characters in the sentence. This calculation adjusted their weights, giving the character vectors provided by BERT global characteristics. A bidirectional long short-term memory network (BiLSTM) then extracted deep local features from the BERT character vectors in both the forward and backward directions. Two simple but effective residual structures were designed to optimize the global features provided by BERT and the deep local features provided by BiLSTM. The mapping residual structure mapped the feature vectors provided by BERT to a reduced dimension, in order to preserve as much of BERT's original information as possible, while the convolution residual structure convolved the feature vectors twice to obtain more information. The feature vectors were then decoded by a conditional random field (CRF) model. Compared with other NER models, the proposed BBNER-MRS model performed better overall, with F1 values of 89.89%, 95.02%, 83.21%, 96.15%, and 72.51% on the Grape, People Daily, BOSON, RESUME, and Weibo datasets, respectively. A two-stage, deep learning-based method for domain knowledge graph construction was also proposed: in the first stage, a domain ontology was constructed; in the second stage, a deep learning model extracted knowledge under the constraints of the ontology and constructed triples. BBNER-MRS performed best when constructing triples from unstructured text, with an F1 value of 86.44%. Finally, BBNER-MRS was used to successfully construct a grape knowledge graph. This research can provide technical and data support for the standardization and sharing of domain data.
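
The final decoding step mentioned in the abstract can be illustrated with a minimal sketch of Viterbi decoding, the standard way to read the best tag sequence out of a linear-chain CRF. The abstract does not describe the authors' decoder in detail, so everything below is an assumption made for illustration: the function name `viterbi_decode`, the hypothetical tag set (`O`, `B-VAR`, `I-VAR`, standing for a grape-variety entity), and the toy emission and transition scores, which stand in for the scores a trained encoder and CRF layer would produce. Start and end transitions are omitted for brevity.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][tag] is the per-character score from the encoder;
    transitions[prev][cur] is the score for moving from tag prev to tag cur.
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    backptr = []  # backptr[t-1][tag]: best previous tag for `tag` at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        backptr.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and toy scores (not taken from the paper)
tags = ["O", "B-VAR", "I-VAR"]
transitions = {p: {c: 0.0 for c in tags} for p in tags}
transitions["O"]["I-VAR"] = -100.0  # an entity cannot continue straight after O
emissions = [
    {"O": 1.0, "B-VAR": 0.8, "I-VAR": 0.0},
    {"O": 0.0, "B-VAR": 0.1, "I-VAR": 1.0},
]
path = viterbi_decode(emissions, transitions, tags)
print(path)  # ['B-VAR', 'I-VAR']
```

In this toy case, picking the best tag for each character independently would give `O` followed by `I-VAR`, which is not a valid BIO sequence. The transition penalty makes the decoder prefer `B-VAR` followed by `I-VAR`, which is the kind of label consistency a CRF layer is typically added to enforce.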

Keywords

informatization; deep learning; knowledge graph; named entity recognition; BERT; residual structure

Classification

Information technology and security science

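Cite this article

Nie Xiaolin, Zhang Lilin, Niu Dangdang, Wu Huarui, Zhu Huaji, Zhang Hongming. Multi-feature fusion named entity recognition method for grape knowledge graph construction [J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(3): 201-210, 10.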

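Funding

National Key Research and Development Program (2020YFD1100601)

Key Research and Development Program of Shaanxi Province (2023-YBNY-217, 2023-ZDLNY-69)

Shaanxi Province Qin Chuangyuan "Scientist + Engineer" Team Building Project (2022KXJ-67)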

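Transactions of the Chinese Society of Agricultural Engineering (农业工程学报)

Open access (OA); Peking University Core Journal (北大核心); CSTPCD

ISSN 1002-6819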
