Enhanced Identifying Gene Names from Biomedical Literature with Conditional Random Fields

Wei-Zhong Qian, Chong Fu, Hong-Rong Cheng, Qiao Liu, Zhi-Guang Qin

中国电子科技, 2009, Vol. 7, Issue 3: 227-231, 5.

Author information

  • 1. University of Electronic Science and Technology of China

Abstract

Identifying gene names is an attractive research area in computational biology. However, accurate extraction of gene names is challenging because there are no established conventions for describing them. We devise a systematic architecture and apply a model based on conditional random fields (CRFs) to extract gene names from Medline. To improve performance, biomedical ontology features are added to the model, and post-processing, comprising boundary adjustment and a word filter, is introduced to resolve the name-overlap problem and remove false-positive single words. A pure string-matching method, baseline CRFs, and CRFs with our methods are each applied to the extraction of human gene names and HIV gene names in 1100 Medline abstracts, and their performance is compared. The results show that CRFs are robust to unseen gene names. Furthermore, CRFs with our methods outperform the other methods, with a precision of 0.818 and a recall of 0.812.
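The abstract does not give implementation details, so the following is only a minimal sketch of the kind of pipeline it describes: per-token features that include an ontology lookup, a word filter that drops single-word false positives, and exact-match precision and recall. The lexicon `ONTOLOGY_TERMS`, the word list `COMMON_WORDS`, the feature names, and all function names are hypothetical and not taken from the paper. The CRF training step itself is omitted, and boundary adjustment is not shown.

```python
# Hypothetical stand-in for a biomedical ontology lexicon (not from the paper).
ONTOLOGY_TERMS = {"tat", "rev", "nef", "brca1"}
# Hypothetical common words to reject when predicted as single-word gene names.
COMMON_WORDS = {"was", "for", "can", "not"}


def token_features(tokens, i):
    """Build a feature dict for token i, the input shape many CRF toolkits accept."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_upper": tok.isupper(),
        "has_digit": any(ch.isdigit() for ch in tok),
        "has_hyphen": "-" in tok,
        "in_ontology": tok.lower() in ONTOLOGY_TERMS,  # assumed ontology feature
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def filter_single_words(spans, tokens):
    """Drop single-token spans that are common words (likely false positives)."""
    kept = []
    for start, end in spans:  # end index is exclusive
        if end - start == 1 and tokens[start].lower() in COMMON_WORDS:
            continue
        kept.append((start, end))
    return kept


def precision_recall(predicted, gold):
    """Exact-match precision and recall over sets of (start, end) spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy usage with made-up predictions; a real system would take spans from a CRF.
tokens = "The Tat protein was shown to bind BRCA1".split()
features = [token_features(tokens, i) for i in range(len(tokens))]
predicted = [(1, 2), (3, 4), (7, 8)]  # "Tat", "was", "BRCA1"
gold = [(1, 2), (7, 8)]
filtered = filter_single_words(predicted, tokens)
print(precision_recall(predicted, gold))
print(precision_recall(filtered, gold))
```

In this toy case the filter removes the spurious single-word prediction "was", which raises precision without changing recall. That is the effect the abstract attributes to the word filter.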

Key words

Conditional random fields / gene name extraction / information extraction / named entity recognition

Cite this article

Wei-Zhong Qian, Chong Fu, Hong-Rong Cheng, Qiao Liu, Zhi-Guang Qin. Enhanced Identifying Gene Names from Biomedical Literature with Conditional Random Fields[J]. 中国电子科技, 2009, 7(3): 227-231, 5.

Funding

This work was supported by China Scholarship Council under Grant No. 2007104897 and UESTC Youth Foundation under Grant No. JX05007.

中国电子科技

1674-862X
