Journal of Sun Yat-sen University (Medical Sciences), 2018, Vol. 39, Issue (3): 455-462, 8.
Medical Named Entity Recognition and Its Application in Chinese Admission Records of Stroke Patients Based on CRF and RUTA Rules
Abstract
[Objective] To study the construction and optimization of a natural language processing model for unstructured medical records, and to use the model to extract structured data from the medical records of stroke patients on the Jiangxi Medical Big Data Platform. [Methods] Based on the actual needs of clinical research, a stroke-specific entity annotation scheme and a named entity annotation corpus were constructed from 500 admission records of stroke patients, randomly selected from the Jiangxi provincial medical big data platform for the period 2011 to 2016. The corpus was used to build a named entity extraction model based on CRF and RUTA rules, and recognition accuracy was improved by adjusting the RUTA rules and parameters. [Results] The extraction model achieved an accuracy of 0.960, a recall of 0.916, and an F-score of 0.939. The model was then used to extract 264 580 entities and 1 161 077 entity relations from the admission records of 10 295 stroke patients on the medical big data platform. [Conclusions] The constructed extraction model has high recognition accuracy. It can accurately obtain research-relevant data on patients' past history, life history, and clinical manifestations from a large volume of unstructured medical records, and can effectively improve the efficiency and level of clinical research on cerebrovascular diseases.
Key words
Chinese electronic medical record/named entity recognition/conditional random field (CRF)/stroke
Classification
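Information technology and security science
Cite this article
Xu Yuan (许源), Ge Yanqiu (葛艳秋), Wang Qiang (王强), Xiong Gang (熊刚), Yi Yingping (易应萍). Medical Named Entity Recognition and Its Application in Chinese Admission Records of Stroke Patients Based on CRF and RUTA Rules [J]. Journal of Sun Yat-sen University (Medical Sciences), 2018, 39(3): 455-462, 8.
Funding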
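Science and Technology Innovation Platform, Jiangxi Provincial Department of Science and Technology (20171BCD40024)
General Project, Jiangxi Provincial Department of Science and Technology (20171BBH80025)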