
Screening anti-fibrosis Chinese medicinal compounds based on machine learning

Wang Xiting (王曦廷)¹, Li Yu (李彧)¹, Zhang Lan (张澜)¹, Liu Meng (刘梦)¹, Li Cheng (李城)¹, Yang Qiushi (杨秋实)¹, Hang Xiaoyi (杭晓屹)¹, Liu Yi (刘祎)¹

Journal of Beijing University of Traditional Chinese Medicine, 2019, Vol. 42, Issue 1: 30-36, 7. DOI: 10.3969/j.issn.1006-2157.2019.01.006

Author information

  • 1. School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029


Abstract

Objective: To establish a new virtual screening prediction model for Chinese medicinal compounds with anti-fibrosis effects, and to verify the model's predictive performance.

Methods: Dimension reduction and feature optimization of molecular fingerprints were performed using the random forest (RF) and gradient boosting decision tree (GBDT) algorithms. A hybrid feature optimization-machine learning model was established, in which the optimized features were fed into logistic regression (LR) and artificial neural network (ANN) algorithms for training. Precision, recall, and F1 score were used to evaluate the performance of the various model combinations. The virtual screening prediction model for anti-fibrosis Chinese medicinal compounds was selected according to these evaluation results. Its predictions of anti-fibrosis activity for Chinese medicinal compounds were then compared with those of a molecular docking model to further verify its predictive efficiency.

Results: The RF model achieved a precision of 0.76, a recall of 0.75, and an F1 score of 0.74 (AUC=0.818). The GBDT model achieved a precision of 0.76, a recall of 0.74, and an F1 score of 0.72 (AUC=0.829). The ANN model achieved a precision of 0.75, a recall of 0.75, and an F1 score of 0.75 (AUC=0.802). The RF+LR model achieved a precision of 0.77, a recall of 0.76, and an F1 score of 0.75 (AUC=0.840). The RF+LR model achieved a precision of 0.74, a recall of 0.84, and an F1 score of 0.79 (AUC=0.850). The GBDT+LR model achieved a precision of 0.80, a recall of 0.80, and an F1 score of 0.79 (AUC=0.872). The GBDT+ANN model achieved a precision of 0.73, a recall of 0.91, and an F1 score of 0.81 (AUC=0.837). The molecular docking results for Chinese medicinal compounds including curcumin, glycyrrhizic acid, hydroxysafflor yellow A, emodin, and gypenoside were in accordance with the predictions of the virtual screening prediction model.

Conclusion: The model based on RF+LR is better than the models established with the other methods. Comparison with the molecular docking model shows that the virtual screening prediction model performs well in predicting the activity of Chinese medicinal compounds. The method supports high-throughput screening and can compensate for the limited compound screening efficiency of molecular docking. It provides a new approach to virtual screening prediction of Chinese medicinal compounds with anti-fibrosis effects.
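The evaluation metrics named in the abstract (precision, recall, F1 score, and AUC) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code. The labels and scores below are hypothetical toy data (1 = active, 0 = inactive), the 0.5 decision threshold is an assumption, and the metrics are computed for the active class only, since the abstract does not state how the reported values were averaged.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (active, inactive) pairs in which the active
    compound receives the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true activity labels and predicted activity scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(precision, recall, f1, auc)
```

Precision, recall, and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank actives against inactives, which is why a model can lead on one metric and trail on another.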


Key words

organ fibrosis/machine learning/molecular fingerprinting/Chinese medicinal compound screening

Classification

Medicine and health

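Cite this article

Wang Xiting, Li Yu, Zhang Lan, Liu Meng, Li Cheng, Yang Qiushi, Hang Xiaoyi, Liu Yi. Screening anti-fibrosis Chinese medicinal compounds based on machine learning [J]. Journal of Beijing University of Traditional Chinese Medicine, 2019, 42(1): 30-36, 7.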

Funding

National Natural Science Foundation of China, General Program (No. 81573716)

Journal of Beijing University of Traditional Chinese Medicine

OA; PKU Core (北大核心); CSCD; CSTPCD

ISSN 1006-2157
