
Activity Prediction of Hepatitis C Virus Non-Structural Protein 5B Polymerase Inhibitors Based on Machine Learning Methods

Lü Wei, Xue Ying

Acta Physico-Chimica Sinica, 2011, Vol. 27, Issue 6: 1407-1416, 10.

Prediction of Hepatitis C Virus Non-Structural Proteins 5B Polymerase Inhibitors Using Machine Learning Methods

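Lü Wei 1, Xue Ying 2

Author information

  • 1. State Key Laboratory of Crop Biology, College of Life Sciences, Shandong Agricultural University, Tai'an 271018, Shandong
  • 2. Key Laboratory of Green Chemistry and Technology (Ministry of Education), College of Chemistry, Sichuan University, Chengdu 610064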

Abstract

Non-structural protein 5B (NS5B), the RNA-dependent RNA polymerase of the hepatitis C virus (HCV), plays an important role in protein maturation and gene replication. Inhibiting NS5B polymerase prevents RNA replication, which makes it a significant target for HCV treatment. Screening and predicting molecules with NS5B inhibitory activity by computational methods is therefore becoming increasingly important. This work explores three machine learning (ML) methods, support vector machine (SVM), k-nearest neighbor (k-NN), and C4.5 decision tree (C4.5 DT), for the prediction of NS5B inhibitors (NS5BIs). The prediction system was tested using 1248 compounds (552 NS5BIs and 696 non-NS5BIs), which are significantly more diverse in chemical structure than those used in other studies. A feature selection method was used to improve the prediction accuracy and to select the molecular descriptors responsible for distinguishing NS5BIs from non-NS5BIs. Across the three methods, the prediction accuracies were 81.4%-91.7% for the NS5BIs, 78.2%-87.2% for the non-NS5BIs, and 84.1%-85.0% overall. SVM gave the best accuracy for the NS5BIs (91.7%), C4.5 DT gave the best accuracy for the non-NS5BIs (87.2%), and k-NN gave the best overall accuracy (85.0%). These results suggest that machine learning methods can help predict the NS5B inhibitory potential of unknown sets of compounds and identify the molecular descriptors associated with NS5BIs.
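
The abstract reports three kinds of accuracy: one for the inhibitors, one for the non-inhibitors, and one overall. As a minimal sketch of how such figures are typically derived from a binary classifier's predictions, the Python below counts correct predictions per class. The function name `class_accuracies`, the label encoding (1 = NS5BI, 0 = non-NS5BI), and the toy labels are assumptions made for illustration; they are not taken from the paper and do not reproduce its data or results.

```python
from typing import Dict, Sequence


def class_accuracies(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Per-class and overall accuracy for binary labels.

    Assumed encoding (hypothetical): 1 = inhibitor, 0 = non-inhibitor.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # inhibitors predicted as inhibitors
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # inhibitors missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-inhibitors predicted correctly
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-inhibitors flagged as inhibitors

    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")

    return {
        "inhibitor_accuracy": tp / positives,          # also called sensitivity
        "non_inhibitor_accuracy": tn / negatives,      # also called specificity
        "overall_accuracy": (tp + tn) / (positives + negatives),
    }


# Toy labels for illustration only; these are not the paper's compounds.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
result = class_accuracies(y_true, y_pred)
print(result)
```

Because the overall accuracy weights each class by its size, it always lies between the two per-class accuracies. This is why a method can lead on one class, as SVM and C4.5 DT do in the abstract, without leading overall.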

Key words

Machine learning method/Molecular descriptor/Recursive feature elimination/Support vector machine/Hepatitis C virus

Classification

Chemistry and Chemical Engineering

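Cite this article

Lü Wei, Xue Ying. Prediction of Hepatitis C Virus Non-Structural Proteins 5B Polymerase Inhibitors Using Machine Learning Methods [J]. Acta Physico-Chimica Sinica, 2011, 27(6): 1407-1416, 10.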

Funding

The project was supported by the National Key Basic Research Program of China (2009CB118500) and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, Ministry of Education, China (20071108-18-15).

Acta Physico-Chimica Sinica

Indexing: OA, PKU Core, CSCD, CSTPCD, SCI

ISSN 1000-6818
