Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2025, Vol. 19, Issue 11: 2873-2894 (22 pages). DOI: 10.3778/j.issn.1673-9418.2502024
Advances in Text Clustering Models Based on Deep Learning Approaches
Abstract
Text clustering is one of the core techniques in unsupervised learning. It aims to automatically partition large text datasets into clusters with high semantic similarity. In recent years, deep learning-based text clustering has flourished, and the research focus has shifted towards using advanced deep learning architectures to extract text features efficiently and thereby improve clustering accuracy. In particular, clustering strategies that rely on large pre-trained language models such as RoBERTa and GPT have demonstrated exceptional performance owing to their powerful pre-trained feature representations. Through examples and data, this paper reviews the development, current progress, and task characteristics of text clustering, and presents its latest trends and its impact on data mining. A new classification method for text clustering models, based on the features of their deep learning architectures, is proposed. This classification method divides models according to their core mechanisms and feature extraction paths in clustering tasks. It covers methods ranging from traditional clustering algorithms to advanced techniques, including K-means, spectral clustering, autoencoders, generative models, graph convolutional networks, and large language models, with detailed analysis of their specific implementations. Finally, the advantages and limitations of existing methods are analyzed, and potential future research directions are discussed.
Key words
feature representation/text clustering/deep learning/large language models
Classification
Computer and Automation
Cite this article
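史东艳, 马乐荣, 丁苍峰, 宁秦伟, 曹江江. Advances in Text Clustering Models Based on Deep Learning Approaches [J]. Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2025, 19(11): 2873-2894, 22.
Funding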
This work was supported by the 14th Five-Year Plan Mid-long Term Major Research Program of Yan'an University (2021ZCQ012), the Industry-University-Research Cooperation Cultivation Program of Yan'an University (CXY202107), and the Shaanxi Provincial Special Support Talent Program (YAU202305399).