Knowledge Graph Construction and Visualization Analysis of Shen Nong Ben Cao Jing Based on Named Entity Recognition
Objective To construct a knowledge graph of Shen Nong Ben Cao Jing, analyze its materia medica knowledge, mine implicit knowledge, and present the results visually, so as to provide a methodological reference for research on ancient texts. Methods The types of knowledge entities and the relationships between entities in Shen Nong Ben Cao Jing were sorted out and defined. A training corpus was produced with the BIO sequence labeling method, with the text annotated in a self-developed CNLP text annotation system. A BERT model was used to recognize named entities, and the relationships between entities were determined from rules and semantic associations. After knowledge fusion, the data were imported with the Cypher language into the graph database Neo4j-community 4.4.9 for storage and visualization, yielding the knowledge graph. Results The knowledge graph of Shen Nong Ben Cao Jing contains 5,273 nodes and 11,064 relationships; its schema layer comprises 14 entity classes and 16 relationship types. Through Cypher queries, knowledge can be visualized from four aspects: classification of Chinese medicinals, medicinal property theory, the seven relations of compatibility (qi qing), and the application of Chinese medicinals. Conclusion The knowledge graph constructed in this study intuitively reflects the knowledge recorded in Shen Nong Ben Cao Jing and its implicit relationships, and is suitable for knowledge mining and intuitive multidimensional display of ancient TCM texts.
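The pipeline described in the Methods can be illustrated with a minimal sketch. It assumes character-level BIO tags have already been predicted (for example by a BERT token classifier), collapses them into entity spans, and builds a parameterized Cypher `MERGE` statement for one relation. The labels `HERB` and `FLAVOR`, the relation type `HAS_FLAVOR`, the sample phrase and its tags, and both function names are hypothetical and chosen for illustration only; the paper's actual 14 entity classes and 16 relationship types are not listed in this abstract.

```python
from typing import Dict, List, Tuple


def bio_to_entities(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse character-level BIO tags into (entity_text, entity_type) pairs."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type = ""
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new entity begins; flush any entity still open.
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [ch], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            # Continuation of the open entity.
            current.append(ch)
        else:
            # "O" tag or an inconsistent "I-" tag closes the open entity.
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [], ""
    if current:
        entities.append(("".join(current), current_type))
    return entities


def merge_relation(head: str, head_label: str, rel_type: str,
                   tail: str, tail_label: str) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized Cypher MERGE statement for one (head)-[rel]->(tail) triple.

    Labels and relationship types cannot be passed as Cypher parameters,
    so they are interpolated (backtick-quoted); node names go in as parameters.
    """
    query = (
        f"MERGE (h:`{head_label}` {{name: $head}}) "
        f"MERGE (t:`{tail_label}` {{name: $tail}}) "
        f"MERGE (h)-[:`{rel_type}`]->(t)"
    )
    return query, {"head": head, "tail": tail}


# Hypothetical example: "人参味甘" (ginseng, sweet in flavor) with assumed tags.
chars = list("人参味甘")
tags = ["B-HERB", "I-HERB", "O", "B-FLAVOR"]
entities = bio_to_entities(chars, tags)
(herb, herb_label), (flavor, flavor_label) = entities
query, params = merge_relation(herb, herb_label, "HAS_FLAVOR", flavor, flavor_label)
print(entities)
print(query, params)
```

Using `MERGE` rather than `CREATE` keeps repeated mentions of the same herb or property from producing duplicate nodes, which is one simple way to approximate the knowledge fusion step. The returned query and parameter dictionary could then be passed to a Neo4j driver session.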
Tong Lin (佟琳); Zhang Huamin (张华敏); Tong Xu (佟旭); Lei Lei (雷蕾); Wang Cheng (王程); Zeng Ziling (曾子玲); Yang Hongjun (杨洪军)
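Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700; Institute of Basic Theory for Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700; Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700; China Academy of Chinese Medical Sciences, Beijing 100700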
Traditional Chinese medicine
knowledge graph; Shen Nong Ben Cao Jing; materia medica knowledge; named entity recognition
Chinese Journal of Information on Traditional Chinese Medicine, 2024 (008)
37-43 / 7
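National Natural Science Foundation of China (82105055); Scientific and Technological Innovation Project of China Academy of Chinese Medical Sciences (CI2021A03708); Fundamental Research Funds for the Central Public Welfare Research Institutes (ZZ13-YQ-131)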