Key Gene Screening Based on Generative Models and Gene Expression Data

Journal of Tianjin University of Science & Technology, 2025, Vol. 40, Issue 6: 1-8, 46, 9. DOI: 10.13364/j.issn.1672-6510.20240105


Yu Qian (余钱)¹, Li Yumeng (李雨蒙)², Luo Junwei (罗军伟)³, Dong Haofan (董浩帆)¹, Li Yu (李玉)⁴, Wu Xin (吴信)²

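Author information

  • 1. College of Biotechnology, Tianjin University of Science and Technology, Tianjin 300457; Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308
  • 2. Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308
  • 3. School of Software, Henan Polytechnic University, Jiaozuo 454003
  • 4. College of Biotechnology, Tianjin University of Science and Technology, Tianjin 300457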

Abstract

Gene expression data can elucidate the pathological mechanisms of diseases under specific conditions and at specific times. However, the "curse of dimensionality", characterised by small samples and high dimensions, constrains the performance of traditional machine learning classification methods. This results in low prediction accuracy, an inability to recognise small samples, and poor stability. This article introduces a novel method, CVAE-CWGNA-DAE, which integrates data augmentation and gene selection to address the issues that arise from the "curse of dimensionality". First, to address the small sample size of gene expression data, a data augmentation method is proposed that combines a conditional variational autoencoder with a gradient penalty-based conditional Wasserstein generative adversarial network. A comparison with existing methods demonstrates the superiority of this approach in terms of classification performance and stability. Second, to address the high dimensionality of gene expression data and to verify the effectiveness of the generated data, this article employs a gene selection method based on a denoising autoencoder and SVM-RFE. The results show that using the augmented dataset for gene selection improves the accuracy of the selected genes across five distinct classification tasks. These results demonstrate the effectiveness of the proposed method in addressing the "curse of dimensionality" and achieving significant improvements in gene selection.
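The abstract names SVM-RFE as one component of the gene selection step. As a rough illustration of that general idea only (not the authors' pipeline, which also involves a denoising autoencoder and augmented data), the sketch below repeatedly fits a linear SVM, ranks the remaining genes by squared weight, and drops the lowest-ranked gene. The toy data generator, the simple subgradient solver, and all function names and parameters are assumptions made for this example.

```python
import random


def make_toy_expression(n_samples=40, n_genes=20, n_informative=3, seed=0):
    """Hypothetical toy data: the first n_informative genes shift with the label."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1
        row = [
            rng.gauss(1.5 * label if j < n_informative else 0.0, 1.0)
            for j in range(n_genes)
        ]
        X.append(row)
        y.append(label)
    return X, y


def train_linear_svm(X, y, epochs=50, lr=0.01, C=1.0):
    """Fit linear SVM weights (hinge loss plus L2 penalty) by subgradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            violated = margin < 1  # sample lies inside the margin or is misclassified
            for j in range(n_features):
                grad = w[j] - (C * yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * C * yi
    return w


def svm_rfe(X, y, n_keep, step=1):
    """Recursive feature elimination: drop the genes with the smallest w^2."""
    remaining = list(range(len(X[0])))  # original gene indices still in play
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        order = sorted(range(len(remaining)), key=lambda pos: w[pos] ** 2)
        n_drop = min(step, len(remaining) - n_keep)
        drop = set(order[:n_drop])
        remaining = [g for pos, g in enumerate(remaining) if pos not in drop]
    return remaining


X, y = make_toy_expression()
selected = svm_rfe(X, y, n_keep=5)
print("selected gene indices:", selected)
```

In practice a library solver would replace the hand-written one; the point here is only the rank-and-eliminate loop, which is what makes the selection recursive rather than a single ranking pass.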

Key words

gene expression/curse of dimensionality/data augmentation/gene selection/autoencoder/generative adversarial network

Classification

Information Technology and Security Science

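Cite this article

Yu Qian, Li Yumeng, Luo Junwei, Dong Haofan, Li Yu, Wu Xin. Key gene screening based on generative models and gene expression data [J]. Journal of Tianjin University of Science & Technology, 2025, 40(6): 1-8, 46, 9.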

Funding

National Natural Science Foundation of China (62372156)

Journal of Tianjin University of Science & Technology

ISSN 1672-6510
