Screening of key genes for prognosis of glioma based on public databases
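OBJECTIVE: To screen for genes associated with the clinical prognosis of glioma using public databases. METHODS: The microarray dataset GSE31095 was obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed genes were identified with the R package "limma" and subjected to Gene Ontology (GO) functional annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. A protein-protein interaction network was constructed with STRING and Cytoscape to select hub genes. The hub genes were validated in The Cancer Genome Atlas (TCGA) and Chinese Glioma Genome Atlas (CGGA) databases, and random forest analysis, Kaplan-Meier analysis, and Cox proportional hazards analysis were performed on them using the clinical data of the CGGA dataset mRNAseq_325 to clarify their diagnostic and prognostic value. RESULTS: A total of 214 differentially expressed genes were identified, of which 205 were up-regulated and 9 were down-regulated. GO analysis showed that glioma was highly associated with biosynthetic processes, translation, and the ribosome; KEGG enrichment showed that glioma was closely related to the immune system and antigen presentation. Ten hub genes were selected, consistent with the validation results in the TCGA and CGGA cohorts. Intersecting the results of the random forest algorithm, Cox regression analysis, Kaplan-Meier analysis, and ggrisk analysis yielded four genes: RPL7, RPL8, RPL12, and RPS7. The area under the receiver operating characteristic (ROC) curve of the constructed risk model was 0.691 at 1 year, 0.687 at 3 years, and 0.685 at 5 years. CONCLUSION: High expression of RPL7, RPL8, RPL12, and RPS7 is an unfavorable prognostic factor in glioma; these genes may serve as effective biomarkers for the clinical prognosis of glioma and offer new directions for drug development.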
OBJECTIVE: Because glioma is highly invasive and has high mortality, prognostic markers such as glioma-associated hub genes are needed to improve treatment of this cancer. METHODS: The GSE31095 dataset (population from the Netherlands and Sweden) was downloaded from the Gene Expression Omnibus (GEO) database, and differentially expressed genes of glioma were identified with the limma R package. Oxidative stress-related genes were obtained from the GeneCards database. Hub genes were investigated using protein-protein interaction (PPI) network analysis together with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses. The Cancer Genome Atlas (TCGA) database (population from the USA) and the Chinese Glioma Genome Atlas (CGGA) database (population from China) were used to verify the hub genes. Subsequently, random forest analysis, Kaplan-Meier (KM) analysis, and Cox proportional hazards analysis were conducted on the hub genes using the clinical data of the CGGA dataset mRNAseq_325, in order to elucidate the diagnostic and prognostic significance of the identified hub genes. RESULTS: A total of 214 differentially expressed genes were identified, of which 205 were up-regulated and 9 were down-regulated. GO functional enrichment analysis yielded three terms: biosynthetic processes, translation, and the ribosome. KEGG pathway enrichment analysis yielded two signaling pathways, mainly involved in the immune system and antigen presentation. Ten hub genes were selected, and they were consistent with the validation results in the TCGA and CGGA cohorts. Four key genes, RPL7, RPL8, RPS3A, and RPS7, were identified from the overlap of the random forest, KM, and ggrisk analyses. The area under the ROC curve of the prognostic risk model for glioma was 0.691 at 1 year, 0.687 at 3 years, and 0.685 at 5 years. CONCLUSION: The hub genes identified in glioma by bioinformatics methods may serve as a reference for clinical prognostic assessment and for the development of new therapeutic strategies.
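As a rough illustration of the Kaplan-Meier step named in the abstract, the product-limit estimator can be sketched in plain Python. This is not the authors' R workflow: the function name `kaplan_meier`, the high/low expression grouping, and the follow-up values are all hypothetical toy inputs chosen only to show the calculation.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times  -- follow-up duration for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each observed death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # deaths + censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []

    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohorts (years of follow-up), NOT data from the study.
high_expr = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
low_expr = kaplan_meier([4, 6, 9, 10, 12], [1, 0, 1, 0, 0])

for label, curve in (("high", high_expr), ("low", low_expr)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

In practice the two curves would be compared with a log-rank test, which this sketch leaves out.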
Zhang Yi; Gao Han; Zheng Zhanyue; Tan Qitao; Yang Minli; Sun Yan
School of Public Health, Guilin Medical University, Guilin 541199, Guangxi
Clinical medicine
glioma; bioinformatics; hub gene; prognosis
Carcinogenesis, Teratogenesis & Mutagenesis, 2024 (003)
pp. 195-201 (7 pages)
Guangxi Natural Science Foundation (2023GXNSFAA026035)