Chinese Journal of Tissue Engineering Research, 2025, Vol. 29, Issue 35: 7679-7689, 11. DOI: 10.12307/2025.971
Machine learning combined with bioinformatics screening of key genes for pulmonary fibrosis associated with cellular autophagy and experimental validation
Abstract
BACKGROUND: Early diagnosis of pulmonary fibrosis is the foundation for timely antifibrotic drug therapy. Identifying biomarkers that can be used effectively for early diagnosis is therefore crucial for treating the disease.
OBJECTIVE: To analyze key autophagy-related genes involved in pulmonary fibrosis using bioinformatics and machine learning, and to investigate whether autophagy-related core genes can serve as reliable biomarkers for assessing the progression of pulmonary fibrosis.
METHODS: Two pulmonary fibrosis datasets, GSE24206 and GSE110147, were downloaded from the Gene Expression Omnibus (GEO) database (a public database developed and maintained by the U.S. National Center for Biotechnology Information to store and share bioinformatics data), and the gene expression matrices of the two datasets were normalized using the "limma" package in R. Autophagy-related genes were extracted from the GeneCards database (a database developed at the Weizmann Institute of Science that automatically integrates gene-centric data from about 200 web sources, including genomic, transcriptomic, proteomic, genetic, clinical, and functional information). Differential expression analysis was performed on the pulmonary fibrosis datasets, and the differentially expressed genes were intersected with the autophagy genes to identify autophagy genes that may play a role in pulmonary fibrosis. The intersecting genes were analyzed for functional enrichment using Gene Ontology and Kyoto Encyclopedia of Genes and Genomes, and for immune cell infiltration. Core genes of pulmonary fibrosis associated with autophagy were screened by protein-protein interaction analysis and machine learning, and the core genes were subjected to enrichment analysis. A diagnostic model was constructed from the identified core genes. Calibration curves were used to assess the predictive ability of the nomogram model. An external dataset, GSE21369, was used for receiver operating characteristic curve analysis to validate the expression profiles of the autophagy-related pulmonary fibrosis genes, and Chinese herbs associated with IL6 and COL1A2 were predicted using the Coremine database. Finally, human embryonic lung fibroblasts were cultured and treated with transforming growth factor-β1 to establish a cell model, and the relative expression of the genes in the model cells was verified by qRT-PCR.
RESULTS AND CONCLUSION: (1) A total of 51 differentially expressed genes for pulmonary fibrosis were obtained, along with 25 genes that intersected with autophagy genes. Gene Ontology analysis showed that the 25 intersecting genes were related to extracellular matrix organization, collagen metabolism, collagen fibrils, and growth factor binding, among others. Kyoto Encyclopedia of Genes and Genomes enrichment analysis indicated that they were mainly related to the phosphatidylinositol 3-kinase/protein kinase B signaling pathway and the extracellular matrix-receptor interaction signaling pathway. (2) Immune infiltration analysis revealed that the levels of activated memory CD4+ T cells, M0 macrophages, and resting dendritic cells were significantly elevated in the pulmonary fibrosis group (P<0.05), showing a strong correlation. (3) Two autophagy signature genes involved in the progression of pulmonary fibrosis were identified: COL1A2 and IL6. The nomogram model showed that the two core genes predicted the onset of pulmonary fibrosis with relatively high accuracy, and receiver operating characteristic curve analysis showed that the two signature genes had diagnostic value. COL1A2 and IL6 were related to the cell cycle pathway, the mitogen-activated protein kinase signaling pathway, the Janus kinase-signal transducer and activator of transcription signaling pathway, and cytokine-cytokine receptor interaction. A total of 20 Chinese herbs were predicted to be related to COL1A2 and IL6, and their main effects were clearing heat and detoxifying, and invigorating blood and moving qi. COL1A2 and IL6 were verified to be highly expressed in pulmonary fibrosis. In conclusion, COL1A2 and IL6 may be potential diagnostic biomarkers for pulmonary fibrosis, but their specificity for pulmonary fibrosis needs further investigation.
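Two of the steps named in METHODS, intersecting differentially expressed genes with autophagy genes and scoring a single gene by receiver operating characteristic analysis, can be illustrated with a minimal Python sketch. This is not the authors' pipeline (the study reports using R). The function names, the placeholder genes GENE_A, GENE_B, and GENE_C, and all expression values are invented for illustration only.

```python
def intersect_genes(deg_list, autophagy_list):
    """Return genes found in both lists, keeping the order of deg_list."""
    autophagy_set = set(autophagy_list)
    return [gene for gene in deg_list if gene in autophagy_set]


def roc_auc(case_scores, control_scores):
    """Area under the ROC curve for one marker.

    Computed as the probability that a randomly chosen case has a higher
    score than a randomly chosen control, with ties counted as 0.5.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy gene lists (placeholders, not the study's actual gene sets).
degs = ["COL1A2", "IL6", "GENE_A", "GENE_B"]
autophagy_genes = ["IL6", "GENE_C", "COL1A2"]
candidates = intersect_genes(degs, autophagy_genes)

# Made-up expression values for one gene in fibrosis and control samples.
fibrosis_expr = [8.1, 7.4, 9.0, 6.9]
control_expr = [5.2, 6.9, 4.8, 5.5]
auc = roc_auc(fibrosis_expr, control_expr)

print(candidates)  # ['COL1A2', 'IL6']
print(auc)         # 0.96875
```

An AUC near 1 means the marker separates the two groups well in the data at hand. As the abstract itself notes, specificity for pulmonary fibrosis would still need separate validation.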
Key words
pulmonary fibrosis/autophagy/machine learning/bioinformatics/immune infiltration/least absolute shrinkage and selection operator/gene enrichment analysis/engineered tissue construction
Classification
Medicine and Health
Cite this article
Gong Yuehong (巩月红), Wang Mengjun (王梦君), Ren Hang (任航), Zheng Hui (郑辉), Sun Jiajia (孙佳佳), Liu Junpeng (刘军鹏), Zhang Fei (张飞), Yang Jianhua (杨建华), Hu Junping (胡君萍). Machine learning combined with bioinformatics screening of key genes for pulmonary fibrosis associated with cellular autophagy and experimental validation [J]. Chinese Journal of Tissue Engineering Research, 2025, 29(35): 7679-7689, 11.
Funding
The "Tianshan Talents" High-level Medical and Health Personnel Training Plan of the Health Commission of Xinjiang Uygur Autonomous Region, No. TSYC202301B095 (to GYH)
Xinjiang Medical University Student Innovation Training Program, No. S202310760059 (to WMJ)
Natural Science Foundation of Xinjiang Uygur Autonomous Region, No. 2021D01D11 (to HJP)
Innovation Team Training Project of the First Affiliated Hospital of Xinjiang Medical University, No. [2023]52 (to YJH)