Clustering Algorithm Based on Variational Autoencoder and Manifold Features

Open access (OA); indexed in CSTPCD

Deep neural networks have become a promising approach to clustering because of their strong nonlinear mapping ability and their flexibility across different scenarios. To obtain features that are easy to cluster, many deep clustering methods extract or transform features by mapping the original high-dimensional data into a lower-dimensional space, yet the cluster assignment is still assumed to take place in Euclidean space. To explore how feature extraction and the manifold space affect clustering performance, this paper proposes MFVC, a clustering algorithm based on a variational autoencoder and manifold features. The method uses a β-VAE (introduced in "Learning Basic Visual Concepts with a Constrained Variational Framework") built with residual connection layers as the feature extractor for image features, and adds the parameter-free attention mechanism SimAM (A Simple, Parameter-Free Attention Module for Convolutional Neural Networks) to improve the expressive ability of the convolutional network. The manifold method UMAP (Uniform Manifold Approximation and Projection for Dimension Reduction) is then applied to improve the separability of the features, and K-Means is used for the final clustering. Experimental results on six benchmark datasets show that the method performs well: MFVC reaches an accuracy of 0.981 on the MNIST (Modified National Institute of Standards and Technology database) dataset and 0.681 on the Fashion-MNIST dataset.
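The final stage described above (K-Means on the embedded features, scored by accuracy) can be illustrated with a small sketch. It is not the authors' implementation: the β-VAE encoder and the UMAP step are omitted, toy 2-D points stand in for the embedded features, and `kmeans` and `clustering_accuracy` are hypothetical helper names. The abstract does not say how accuracy is computed, so the sketch assumes the common convention for clustering of scoring under the best one-to-one mapping between cluster indices and class labels.

```python
from itertools import permutations


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-Means on a list of equal-length tuples.

    Assumption for reproducibility: the first k points are the initial
    centers (a real pipeline would normally use k-means++ or restarts).
    Returns one cluster index per point.
    """
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return labels


def clustering_accuracy(true_labels, cluster_ids):
    """Best accuracy over all one-to-one mappings from cluster ids to classes.

    Brute force over permutations, so only suitable for a handful of
    clusters; assumes there are no more clusters than classes.
    """
    clusters = sorted(set(cluster_ids))
    classes = sorted(set(true_labels))
    best_hits = 0
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[c] == t for c, t in zip(cluster_ids, true_labels))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


# Toy 2-D points standing in for embedded features (not data from the paper).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
true_labels = [1, 1, 1, 0, 0, 0]  # class names deliberately differ from cluster indices
cluster_ids = kmeans(points, k=2)
print(cluster_ids, clustering_accuracy(true_labels, cluster_ids))
```

Because cluster indices are arbitrary, comparing them to class labels directly would understate the result here; the mapping step is what makes an accuracy figure meaningful for an unsupervised method. For ten or more clusters the permutation search becomes impractical, and an assignment solver is the usual replacement.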

Chen Junfen (陈俊芬); Han Jinchi (韩金池); Xie Bojun (谢博鋆); Xie Zhenghao (谢政豪)

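Hebei Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding 071002, Hebei, China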

Computer Science and Automation

variational autoencoder; residual connection; UMAP; K-Means; manifold learning

Journal of Shanxi University (Natural Science Edition), 2024 (001)

pp. 69-80 (12 pages)

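Hebei Province Funding Project for Introduced Overseas Scholars (C20200302); Hebei Province Education and Teaching Reform Research and Practice Project (2020GJJG007)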

10.13451/j.sxu.ns.2023139
