Fujian Computer, 2025, Vol. 41, Issue (7): 32-36, 5. DOI: 10.16707/j.cnki.fjpc.2025.07.007
Optimization of Parallel Clustering for Biological Sequence K-mers
Abstract
To improve the efficiency of biological sequence clustering, this paper proposes a parallel clustering optimization method based on k-mers. The method converts biological sequences into a binary encoding, constructs feature vectors from k-mer frequencies, and accelerates feature extraction with batch processing and parallelization; the MiniBatch K-Means algorithm is then used for clustering. Experimental results show that the proposed method significantly improves clustering speed while preserving clustering accuracy, providing an effective solution for large-scale biological sequence analysis.
Keywords
Biological sequence / Clustering / K-Means algorithm
Classification
Information Technology and Security Science
Cite this article
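Li Li (李莉), Zhao Jiaxu (赵佳旭), Xu Pengna (徐彭娜). Optimization of Parallel Clustering for Biological Sequence K-mers [J]. Fujian Computer, 2025, 41(7): 32-36, 5.
Funding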
This work was supported by a research project of Fuzhou Polytechnic (福州职业技术学院), No. FZYKJJJYB202304.
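Illustrative sketch
The pipeline summarized in the abstract (binary encoding, k-mer frequency vectors, batched and parallel feature extraction, MiniBatch K-Means) can be sketched roughly as below. This is a minimal standard-library illustration, not the authors' implementation. The 2-bit nucleotide code, the normalization of counts to frequencies, the handling of ambiguous bases, and all function and parameter names (`kmer_vector`, `extract_features`, `minibatch_kmeans`, `batch_size`, `executor`) are assumptions made for the example. The clustering step is a simplified mini-batch k-means with a per-center learning rate and stands in for whatever library implementation the paper used.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# Assumed 2-bit code per nucleotide; the paper's exact encoding is not given here.
BASE_CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def kmer_vector(seq, k):
    """Return the normalized k-mer frequency vector (length 4**k) of one sequence."""
    vec = [0.0] * (4 ** k)
    mask = (1 << (2 * k)) - 1      # keeps only the last k bases (2 bits each)
    index, valid = 0, 0            # rolling k-mer index, run of valid bases
    for base in seq.upper():
        code = BASE_CODE.get(base)
        if code is None:           # ambiguous base such as N: restart the window
            index, valid = 0, 0
            continue
        index = ((index << 2) | code) & mask
        valid += 1
        if valid >= k:
            vec[index] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def vectorize_batch(batch, k):
    """Vectorize one batch of sequences."""
    return [kmer_vector(seq, k) for seq in batch]


def extract_features(seqs, k, batch_size=256, executor=None):
    """Vectorize sequences batch by batch.

    Batches are independent, so an executor may process them concurrently;
    without one, they are processed serially. Input order is preserved.
    """
    run = executor.map if executor is not None else map
    results = run(partial(vectorize_batch, k=k), batches(seqs, batch_size))
    return [vec for batch in results for vec in batch]


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(x, centers):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda j: sqdist(x, centers[j]))


def minibatch_kmeans(X, n_clusters, batch_size=64, n_iter=100, seed=0):
    """Simplified mini-batch k-means; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(X, n_clusters)]
    counts = [0] * n_clusters
    for _ in range(n_iter):
        batch = rng.sample(X, min(batch_size, len(X)))
        assigned = [nearest(x, centers) for x in batch]  # assign, then update
        for x, j in zip(batch, assigned):
            counts[j] += 1
            eta = 1.0 / counts[j]                        # per-center learning rate
            centers[j] = [(1 - eta) * c + eta * v for c, v in zip(centers[j], x)]
    return centers, [nearest(x, centers) for x in X]


if __name__ == "__main__":
    seqs = ["ACACACACAC", "CACACACACA", "ACACACAC",
            "GTGTGTGTGT", "TGTGTGTGTG", "GTGTGTGT"]
    with ThreadPoolExecutor(max_workers=2) as pool:
        X = extract_features(seqs, k=2, batch_size=2, executor=pool)
    _, labels = minibatch_kmeans(X, n_clusters=2)
    print(labels)
```

The demo uses a thread pool only to show how batches are dispatched. In CPython, threads do not speed up this CPU-bound counting, so a `concurrent.futures.ProcessPoolExecutor` passed as `executor` is the more likely choice for real speedups. At realistic data sizes, a library implementation such as scikit-learn's `MiniBatchKMeans` would normally replace the hand-written clustering loop.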