Journal of Jilin University (Medicine Edition), 2025, Vol. 51, Issue (2): 437-446 (10 pages). DOI: 10.13481/j.1671-587X.20250218
Bioinformatics analysis on differentially expressed genes in multiple primary lung cancers based on GEO database
Abstract
Objective: To screen for differentially expressed genes (DEGs) in multiple primary lung cancers (MPLCs) using bioinformatics methods, and to analyze their biological functions and their influence on the prognosis of lung adenocarcinoma.

Methods: Single-cell transcriptome sequencing data (GSE200972) were downloaded from the Gene Expression Omnibus (GEO) database. After preliminary data processing with R software, the Seurat R package was used for data processing, cell clustering, and annotation. The clusterProfiler R package was used for Gene Set Enrichment Analysis (GSEA). The STRING database and Cytoscape software were used to construct the protein-protein interaction (PPI) network and to screen for the key genes (Hub genes). Gene expression levels in the lung adenocarcinoma database were analyzed using the Gene Expression Profiling Interactive Analysis (GEPIA) database. Real-time fluorescence quantitative PCR (RT-qPCR) was used to detect gene expression in tumor tissue of A549 xenograft mice and in lung tissue of normal mice. Kaplan-Meier Plotter was used for prognosis analysis.

Results: Cell clustering analysis identified seven cell types: epithelial cells, endothelial cells, fibroblasts, T cells and natural killer (NK) cells, B cells, myeloid cells, and mast cells. A total of 14,605 DEGs were screened out between tumor epithelial cells and normal epithelial cells. The GSEA results revealed four activated pathways in tumor samples [myelocytomatosis oncogene (MYC) pathway, P53 pathway, oxidative phosphorylation pathway, and glycolysis pathway] and one inhibited pathway [tumor necrosis factor-α (TNF-α) and nuclear factor kappa B (NF-κB) pathway]. The Hub genes identified from the PPI network were CXC motif chemokine ligand 8 (CXCL8), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), CXC motif chemokine receptor 4 (CXCR4), Kirsten rat sarcoma viral proto-oncogene (KRAS), CXC motif chemokine ligand 1 (CXCL1), C-C motif chemokine ligand 2 (CCL2), mucin 1 (MUC1), and secreted phosphoprotein 1 (SPP1). The GEPIA database analysis and the animal experiments showed that SPP1 mRNA expression was higher in non-small cell lung cancer tissue than in normal lung tissue (P<0.01). The Kaplan-Meier survival analysis indicated that patients with high SPP1 expression had shorter overall survival (OS) than those with low expression (P<0.01).

Conclusion: In MPLCs, oncogene-related pathways are activated, as is a tumor suppressor pathway that antagonizes tumor progression. Moreover, elevated expression of SPP1 in non-small cell lung cancer may indicate a relatively poor prognosis.

Key words
Multiple primary lung cancer/Single-cell transcriptome sequencing/Bioinformatics/Gene set enrichment analysis/Survival analysis

Classification
Medicine and Health

Cite this article
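Liu Bo (刘博), Sun Chao (孙超), Wang Xu (王旭), Ma Kewei (马克威). Bioinformatics analysis on differentially expressed genes in multiple primary lung cancers based on GEO database [J]. Journal of Jilin University (Medicine Edition), 2025, 51(2): 437-446 (10 pages).

Funding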
National Natural Science Foundation of China (81501962)
Jilin University Scientific Research Project (Z61007523124090432)
Jilin University Horizontal Scientific Research Project (2018175)
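
Illustrative note on the survival analysis: the prognosis step named in the Methods (Kaplan-Meier analysis of overall survival in high versus low expression groups) rests on the product-limit estimator. The following is a minimal Python sketch of that estimator only. The function name `kaplan_meier` and the toy follow-up data are illustrative assumptions; they are not the study's data, and this is not the Kaplan-Meier Plotter implementation.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator).

    times  : follow-up durations
    events : 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # each subject leaves the risk set at its own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]  # drop deaths and censored subjects after time t
    return curve


# Hypothetical toy data (months, event flag); NOT taken from the paper.
high_times, high_events = [5, 8, 12, 12, 20], [1, 1, 1, 0, 1]
low_times, low_events = [10, 18, 25, 30, 30], [1, 0, 1, 0, 0]

high_curve = kaplan_meier(high_times, high_events)
low_curve = kaplan_meier(low_times, low_events)

for label, curve in (("high", high_curve), ("low", low_curve)):
    for t, s in curve:
        print(f"{label} expression: S({t}) = {s:.3f}")
```

Attaching a P value to the difference between two such curves would additionally require a test such as the log-rank test, which this sketch omits.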