Evolutionary relationships and genotyping of HPV based on machine learning and amino acid position correlation coefficient method

胡画霖 何黎黎 刘茂省

Journal of Central China Normal University (Natural Sciences), 2026, Vol. 60, Issue 2: 308-320 (13 pages). DOI: 10.19603/j.cnki.1000-1190.2026.02.013

胡画霖 (1), 何黎黎 (1), 刘茂省 (1)

Author information

  • 1. School of Science, Beijing University of Civil Engineering and Architecture, Beijing 102616

Abstract

In this study, a non-sequence-alignment method based on amino acid positional information, namely the amino acid correlation coefficient feature vector (ACCFV) method, was proposed for evolutionary analysis and genotyping of human papillomavirus (HPV). Traditional multiple sequence alignment (MSA) methods suffer from low computational efficiency and high memory consumption when processing large-scale datasets. In contrast, the ACCFV method overcomes these limitations by constructing statistical measures of positional correlations between amino acids and converting amino acid sequences into numerical feature vectors. Amino acid sequences of eight HPV proteins (E6, E7, E1, E2, E4, E5, L1, and L2) were selected as target data. After feature extraction using ACCFV, a phylogenetic tree was constructed based on Euclidean distances between feature vectors, and four machine learning models were employed for classification prediction. The results showed that when the delay step size L=1, the ACCFV method achieved high consistency with the traditional MSA tool Muscle in evolutionary analysis, while significantly improving computational efficiency. Moreover, the Random Forest model achieved 100% classification accuracy. Compared to BLAST-Protein, ACCFV maintained 100% accuracy while substantially reducing processing time and required no batch operations. This study not only validates the feasibility and effectiveness of the ACCFV method in HPV research but also provides a novel technical approach for molecular epidemiological studies of other viruses.
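
The abstract does not give the ACCFV formula, so the following is only a minimal sketch of the general idea of an alignment-free, lag-based feature vector followed by Euclidean distances. It assumes, purely for illustration, that each feature is the Pearson correlation between "amino acid a at position i" and "amino acid b at position i + L"; the function names lagged_pair_features, euclidean, and distance_matrix are hypothetical and are not taken from the paper.

```python
import math
from itertools import product

# The 20 standard amino acids; other symbols (e.g. X) are simply ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def lagged_pair_features(seq, lag=1):
    """Assumed stand-in for an ACCFV-style vector (not the paper's formula).

    For every ordered pair (a, b) of amino acids, compute the Pearson
    correlation between the indicator "residue i is a" and the indicator
    "residue i + lag is b". Returns a list of 20 * 20 = 400 floats.
    """
    n = len(seq) - lag
    if lag < 1 or n <= 0:
        raise ValueError("sequence must be longer than the lag, and lag >= 1")
    head, tail = seq[:n], seq[lag:]
    features = []
    for a, b in product(AMINO_ACIDS, repeat=2):
        p_a = head.count(a) / n
        p_b = tail.count(b) / n
        p_ab = sum(1 for x, y in zip(head, tail) if x == a and y == b) / n
        denom = math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
        # Correlation is undefined when an indicator is constant; use 0.0.
        features.append((p_ab - p_a * p_b) / denom if denom > 0 else 0.0)
    return features


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.dist(u, v)


def distance_matrix(seqs, lag=1):
    """Pairwise Euclidean distances between the feature vectors of seqs."""
    vectors = [lagged_pair_features(s, lag) for s in seqs]
    return [[euclidean(u, v) for v in vectors] for u in vectors]
```

Under these assumptions, a distance matrix of this kind could be passed to any distance-based tree-building method, and the 400-dimensional vectors could serve as classifier inputs. Because every sequence maps to a fixed-length vector regardless of its length, no alignment step is needed.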

Key words

HPV/amino acid sequence/machine learning/evolutionary analysis/subtype classification

Classification

Biological sciences

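Cite this article

胡画霖, 何黎黎, 刘茂省. Evolutionary relationships and genotyping of HPV based on machine learning and amino acid position correlation coefficient method [J]. Journal of Central China Normal University (Natural Sciences), 2026, 60(2): 308-320, 13.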

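Funding

National Natural Science Foundation of China (12571522)

Beijing University of Civil Engineering and Architecture High-Level Talent Introduction Funding Program (GDRC20220802)

2024 Beijing Digital Education Research Project, Youth Project (BDEC2024QN081)

Beijing Municipal Education Commission 2024 Scientific Research Program, General Project (KM202410016001)

2024 Beijing Association of Higher Education Project (MS2024130)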

Journal of Central China Normal University (Natural Sciences)

ISSN 1000-1190
