Machine learning combined with bioinformatics identifies key genes in multiple sclerosis

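Huang Xinmeng (黄新蒙), Su Linghao (苏凌昊), Yang Yifan (杨一帆), Qiao Wenhui (乔文慧), He Jialin (何家霖), Zhao Peiyuan (赵培源), Liu Xihong (刘喜红)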

Chinese Journal of Bioinformatics (生物信息学), 2026, Vol. 24, Issue (1): 44-56, 13. DOI: 10.12113/202409001

Machine learning combined with bioinformatics identifies key genes in multiple sclerosis

Huang Xinmeng 1, Su Linghao 2, Yang Yifan 2, Qiao Wenhui 2, He Jialin 2, Zhao Peiyuan 3, Liu Xihong 3

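Author information

  • 1. The Second Clinical Medical College, Henan University of Chinese Medicine, Zhengzhou 450046
  • 2. The First Clinical Medical College (College of Integrated Traditional Chinese and Western Medicine), Henan University of Chinese Medicine, Zhengzhou 450046
  • 3. College of Traditional Chinese Medicine (Zhongjing College), Henan University of Chinese Medicine, Zhengzhou 450046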

Abstract

This study aimed to identify key genes in multiple sclerosis (MS) using bioinformatics and machine learning methods. The MS gene expression profiles GSE21942 and GSE32988 were obtained from the GEO database, with GSE32988 serving as the validation dataset. Sample clustering was assessed using principal component analysis (PCA), and differentially expressed genes (DEGs) were screened and subjected to GO and KEGG enrichment analysis. Gene modules closely related to MS were identified using weighted gene co-expression network analysis (WGCNA), and these modules were intersected with the DEGs to obtain candidate genes. Potential key genes were then screened from the candidate genes using two machine learning algorithms: least absolute shrinkage and selection operator (LASSO) regression and random forest (RF). The independent dataset GSE32988 was used to validate the differential expression of the potential key genes, and the key genes were determined by receiver operating characteristic (ROC) curve validation. An MS animal model was used to verify the expression levels of the key genes. The samples in GSE21942 showed good repeatability and correlation, and a total of 506 DEGs were obtained. Enrichment analysis showed that the DEGs were mainly enriched in biological functions such as B cell activation, glutamic acid (GLU) metabolism, and oxidative stress (OS), as well as in pathways including Epstein-Barr virus (EBV) infection and B cell receptor signaling. Machine learning screening of the 29 candidate genes yielded five potential key genes, and four key genes, GLUD1, VDAC1, DDX3X, and LAMP1, were obtained after validation with GSE32988. RT-qPCR showed that the expression levels of DDX3X, LAMP1, GLUD1, and VDAC1 were consistent with the results of the bioinformatics analysis of the mRNA microarrays. Consequently, DDX3X, LAMP1, GLUD1, and VDAC1 may become new targets for MS therapy.
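Two steps named in the abstract, intersecting the DEGs with the WGCNA module genes and scoring a gene by ROC analysis, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The gene names, expression values, and the helper functions `intersect_candidates` and `roc_auc` are hypothetical and chosen only for illustration. The AUC is computed as the fraction of case/control pairs in which the case value is higher, which is one standard way to obtain the area under the ROC curve.

```python
def intersect_candidates(degs, module_genes):
    """Return genes that are both DEGs and members of the trait-related module."""
    return sorted(set(degs) & set(module_genes))


def roc_auc(case_values, control_values):
    """Area under the ROC curve for one gene.

    Computed as the probability that a randomly chosen case value exceeds a
    randomly chosen control value, with ties counted as one half.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical toy inputs (placeholder names, not data from the study).
degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
module_genes = ["GENE_B", "GENE_A", "GENE_D", "GENE_F"]
candidates = intersect_candidates(degs, module_genes)

# Made-up expression values of one candidate gene in cases and controls.
ms_expr = [7.9, 8.4, 8.1, 7.2]
control_expr = [6.8, 7.0, 7.5, 6.9]
auc = roc_auc(ms_expr, control_expr)

print(candidates)
print(auc)
```

An AUC close to 1 indicates that the gene's expression separates cases from controls well, while a value close to 0 indicates equally good separation in the opposite direction (lower expression in cases). The LASSO and random forest screening step is omitted here because it would normally rely on a dedicated statistics or machine learning library.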

Key words

Multiple sclerosis/Bioinformatics/Key genes/Machine learning

Classification

Medicine and health

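Cite this article

Huang Xinmeng, Su Linghao, Yang Yifan, Qiao Wenhui, He Jialin, Zhao Peiyuan, Liu Xihong. Machine learning combined with bioinformatics identifies key genes in multiple sclerosis[J]. Chinese Journal of Bioinformatics, 2026, 24(1): 44-56, 13.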

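Funding

National Natural Science Foundation of China, Young Scientists Fund (No. 82104579)

China Postdoctoral Science Foundation, General Program (No. 2023M731024)

Natural Science Foundation of Henan Province (No. 202300410258)

Training Program for Young Backbone Teachers in Higher Education Institutions of Henan Province (No. 2023GGJS080)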

Chinese Journal of Bioinformatics (生物信息学)

ISSN 1672-5565
