
Protein-protein Interaction Identification Based on Keywords (基于关键词的蛋白质交互关系识别)

Mao Yuwei (毛宇薇), Niu Yun (牛耘)

Computer Technology and Development, 2019, Vol. 29, Issue (3): 18-22, 5. DOI: 10.3969/j.issn.1673-629X.2019.03.004

Protein-protein Interaction Identification Based on Keywords

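Mao Yuwei¹, Niu Yun¹

Author information

  • 1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 211106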

Abstract

Protein-protein interaction (PPI) is an important research area in biomedicine. The PPI information obtained through biomedical experiments is stored mainly as text in the published literature. With the rapid growth of that literature, manual identification of PPIs can no longer meet the needs of practical applications. In this paper, we adopt a weakly supervised PPI recognition framework. Starting from a small number of interacting protein pairs as a seed set, the framework identifies PPIs by iteratively expanding the seed set. Compared with existing methods, it needs only a small amount of labeled data to achieve good recognition results, which saves considerable manpower and resources. On this basis, we use word embeddings to expand the existing set of keywords that express PPI and to score the reliability of each keyword. Using the expanded keyword set, we improve the clustering step of the basic framework: the lexical patterns fed to clustering are sorted in descending order of the scores of the keywords they contain. Experiments show that the basic PPI recognition framework achieves good results with only a small amount of labeled data, and that the improved keyword expansion algorithm improves the results further. The highest F-score after the first iteration is 67.20%, 1.54% higher than before the improvement, and the F-score after three iterations is 69.05%.
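The keyword expansion and pattern ordering described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `expand_keywords` and `sort_patterns`, the similarity threshold of 0.8, the toy two-dimensional vectors, scoring a candidate by its maximum cosine similarity to any seed keyword, and scoring a pattern by its best keyword. The abstract does not specify how the reliability scores are actually computed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_keywords(seeds, embeddings, threshold=0.8):
    """Return {keyword: score}. Seeds score 1.0; any other word is added
    when its best cosine similarity to a seed reaches the threshold
    (hypothetical scoring rule)."""
    known_seeds = [s for s in seeds if s in embeddings]
    scores = {s: 1.0 for s in known_seeds}
    for word, vec in embeddings.items():
        if word in scores:
            continue
        best = max((cosine(vec, embeddings[s]) for s in known_seeds), default=0.0)
        if best >= threshold:
            scores[word] = best
    return scores


def sort_patterns(patterns, scores):
    """Order lexical patterns by the highest keyword score they contain,
    in descending order, before they are passed to clustering."""
    def pattern_score(pattern):
        return max((scores.get(tok, 0.0) for tok in pattern.split()), default=0.0)
    return sorted(patterns, key=pattern_score, reverse=True)


# Toy embeddings (assumed values, not real word vectors).
embeddings = {
    "interacts": [1.0, 0.0],
    "binds": [0.9, 0.1],
    "expressed": [0.0, 1.0],
}
scores = expand_keywords(["interacts"], embeddings)
ranked = sort_patterns(
    [
        "PROT1 expressed with PROT2",
        "PROT1 binds PROT2",
        "PROT1 interacts with PROT2",
    ],
    scores,
)
```

In this toy setting, "binds" is admitted as a new keyword because its vector lies close to the seed "interacts", while "expressed" is not, so the pattern containing the seed keyword is ranked first and the pattern with no keyword is ranked last.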

Key words

protein-protein interaction/weak supervision/distributional hypothesis/word embedding/keywords

Classification

Information Technology and Security Science

Cite this article

Mao Yuwei, Niu Yun. Protein-protein Interaction Identification Based on Keywords[J]. Computer Technology and Development, 2019, 29(3): 18-22, 5.

Funding

National Natural Science Foundation of China (61202132)

Computer Technology and Development

OA, CSTPCD

ISSN 1673-629X
