Computer Engineering and Applications (计算机工程与应用), Issue (10): 251-257, 7. DOI: 10.3778/j.issn.1002-8331.1512-0054
Research on disease-symptom and symptom-therapeutic substance relation extraction
Abstract
With the rapid growth of biomedical literature, the knowledge it contains about diseases, symptoms and therapeutic substances has contributed positively to drug discovery and disease therapy. This paper presents a method that constructs two models for extracting the relations between diseases and therapeutic substances: a disease-symptom model and a symptom-therapeutic substance model. The disease-symptom model judges whether a disease can present with or cause a physiological phenomenon. The symptom-therapeutic substance model determines whether a substance changes a person's physiological processes. The method applies a semi-supervised learning algorithm, Tri-training, to exploit unlabeled data together with a small number of labeled examples and thereby improve classification performance. Experimental results show that exploiting unlabeled data with the Tri-training algorithm improves the results. Within the Tri-training process, the method uses ensemble learning to integrate the three classifiers, which can improve learning performance.
Keywords
information extraction / unlabeled data / semi-supervised learning / Tri-training / ensemble learning
Classification
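Information Technology and Security Science
Cite this article
Feng Qinlin (冯钦林), Yang Zhihao (杨志豪), Lin Hongfei (林鸿飞). Research on disease-symptom and symptom-therapeutic substance relation extraction [J]. Computer Engineering and Applications (计算机工程与应用), 2017, (10): 251-257, 7.
Funding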
National Natural Science Foundation of China (No. 61070098, No. 61272373, No. 61340020)
Program for New Century Excellent Talents in University (No. NCET-13-0084)
Fundamental Research Funds for the Central Universities (No. DUT13JB09, No. DUT14YQ213)