Journal of South China Agricultural University, 2025, Vol. 46, Issue (4): 549-557, 9. DOI: 10.7671/j.issn.1001-411X.202409024
Evaluation of predictive accuracy of gene expression in pigs using machine learning models
Abstract
[Objective] To compare the performance of various machine learning models in predicting gene expression in pigs from single nucleotide polymorphisms (SNPs), and to investigate how cis-heritability (cis-h²) and the number of cis-SNPs relate to the prediction accuracy of the different models. [Method] Using the protein-coding genes expressed in pig muscle tissue from the PigGTEx project, we trained 18 distinct machine learning models on cis-SNPs located within a ±1 Mb window of each gene's transcription start site, and then evaluated the prediction accuracy of each model. [Result] The prediction accuracy of the machine learning models was positively correlated with the cis-h² of genes. The elastic net regression model and the Lasso regression model had the highest overall prediction accuracy, with mean R² values of 0.0362 and 0.0358, respectively. Within a certain range, prediction accuracy was also positively correlated with the number of cis-SNPs around the genes. [Conclusion] The accuracy of machine learning models in predicting gene expression in pigs is largely influenced by both the cis-h² and the number of cis-SNPs of genes. Selecting a machine learning model suited to the cis-h² and the number of cis-SNPs of each gene can therefore improve the accuracy of predicted pig gene expression levels.
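The cis-SNP selection and accuracy evaluation described in the Method can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy SNP records, and the choice to compute R² as the squared Pearson correlation between observed and predicted expression are all assumptions made for illustration. Only the ±1 Mb window around the transcription start site comes from the abstract.

```python
WINDOW_BP = 1_000_000  # the +/-1 Mb window stated in the abstract


def select_cis_snps(snps, chrom, tss, window=WINDOW_BP):
    """Return IDs of SNPs on `chrom` lying within `window` bp of the TSS.

    `snps` is an iterable of (snp_id, chromosome, position) tuples;
    this record layout is a hypothetical one chosen for the example.
    """
    return [
        snp_id
        for snp_id, snp_chrom, pos in snps
        if snp_chrom == chrom and abs(pos - tss) <= window
    ]


def squared_pearson_r(observed, predicted):
    """R^2 as the squared Pearson correlation (an assumed convention)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov * cov / (var_o * var_p)


# Hypothetical toy data: (snp_id, chromosome, base-pair position)
snps = [
    ("snp1", "1", 500_000),    # exactly 1 Mb upstream of the TSS: kept
    ("snp2", "1", 2_600_000),  # 1.1 Mb downstream: dropped
    ("snp3", "2", 1_500_000),  # different chromosome: dropped
    ("snp4", "1", 2_500_000),  # exactly 1 Mb downstream: kept
]
cis = select_cis_snps(snps, chrom="1", tss=1_500_000)
```

In a real analysis the genotypes at the selected cis-SNPs would be the predictors for a model such as elastic net or Lasso, and the accuracy measure would be applied to predictions on held-out samples.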
Key words
Machine learning / Pig / Gene expression / Transcriptome-wide association study / cis-heritability / cis-SNP
Classification
Information technology and security science
Cite this article
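周天乐, 滕金言, 徐志婷, 张哲. Evaluation of predictive accuracy of gene expression in pigs using machine learning models [J]. Journal of South China Agricultural University, 2025, 46(4): 549-557, 9.
Funding
National Swine Industry Technology System (CARS-35)
National Key Research and Development Program of China (2022YFF1000900)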