
Disease Name Extraction Based on Multi-Label CRF

王鹏远 姬东鸿

Application Research of Computers (计算机应用研究), 2017, Vol. 34, Issue 1: 118-122, 5. DOI: 10.3969/j.issn.1001-3695.2017.01.025


Multi-label CRF based method for disease extraction

王鹏远 1, 姬东鸿 1

Author information

  • 1. School of Computer Science, Wuhan University, Wuhan 430072


Abstract

Named entity recognition in medical text is of great significance for building and mining large clinical databases that support clinical decision-making, and accurately identifying disease names is one of its fundamental tasks. Medical texts contain a large number of compound disease names, which are difficult to extract. To address this problem, this paper proposes a CRF algorithm based on multiple labels. First, it annotates the data with several layers of labels, with each layer marking a different disease. It then merges the layers into a single final label used to train the model. Finally, it separates the per-layer labels from the model's predictions and identifies the diseases from each layer. This method can recognize compound disease names that the traditional CRF algorithm cannot identify. The experimental results verify the effectiveness of the proposed algorithm.
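The label handling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the BIO tag set, the `|` separator, the example phrase, and the helper names `merge_layers`, `split_layers`, and `extract_mentions` are all assumptions made for illustration, and the CRF training and prediction step is replaced by a stand-in.

```python
from typing import List

SEP = "|"  # hypothetical separator between per-layer tags


def merge_layers(layers: List[List[str]]) -> List[str]:
    """Combine the per-layer tags of each token into one composite tag."""
    return [SEP.join(tags) for tags in zip(*layers)]


def split_layers(merged: List[str]) -> List[List[str]]:
    """Recover the per-layer tag sequences from composite tags."""
    return [list(layer) for layer in zip(*(tag.split(SEP) for tag in merged))]


def extract_mentions(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect mentions from one layer: B opens a mention, I extends it.

    Assumption: O tokens between B and a later I are skipped, so a
    mention may be discontinuous within a layer.
    """
    mentions, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                mentions.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
    if current:
        mentions.append(" ".join(current))
    return mentions


# Illustrative compound name: two diseases share the head word "gastritis".
tokens = ["acute", "and", "chronic", "gastritis"]
layer1 = ["B", "O", "O", "I"]  # layer 1 marks "acute ... gastritis"
layer2 = ["O", "O", "B", "I"]  # layer 2 marks "chronic gastritis"

merged = merge_layers([layer1, layer2])  # training targets for a CRF
predicted = merged  # stand-in for the CRF output; no model is trained here

diseases = []
for layer in split_layers(predicted):
    diseases.extend(extract_mentions(tokens, layer))
print(diseases)
```

Because each composite tag is a single symbol, an ordinary linear-chain CRF could be trained on `merged` without modification; the cost is a larger tag set, since every observed combination of layer tags becomes its own label.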


Key words

named entity recognition/conditional random fields/multi-label/medical text/composite entity

Classification

Information Technology and Security Science

Cite this article

王鹏远, 姬东鸿. Multi-label CRF based method for disease extraction [J]. Application Research of Computers, 2017, 34(1): 118-122, 5.

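Funding

Key Program of the National Natural Science Foundation of China (61133012); Major Bidding Project of the National Philosophy and Social Science Program (11&ZD189); National Natural Science Foundation of China ()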

Application Research of Computers

Indexed in: OA, PKU Core, CSCD, CSTPCD

ISSN 1001-3695
