Journal of Shaanxi University of Science & Technology, 2025, Vol. 43, Issue 5: 200-206, 7.
混合属性数据集的自适应谱聚类算法
Adaptive spectral clustering algorithm for mixed attribute dataset
Abstract
To address attribute skew in spectral clustering on mixed-attribute datasets and the need to manually select the scale parameter of the Gaussian kernel function, we propose an adaptive spectral clustering algorithm based on an improved categorical attribute similarity measure and shared natural neighbors (IEMN-SC). The algorithm improves the traditional categorical attribute similarity measure by calculating the information entropy of the numerical and categorical attributes to obtain a balancing difference factor. In the Gaussian kernel function, it uses shared natural neighbors to compute the neighborhood radius of each sample and solves for the scale parameter adaptively. Finally, it constructs a similarity matrix of the mixed-attribute samples through the kernel function for spectral clustering. Experimental results show that IEMN-SC outperforms four commonly used mixed-attribute clustering algorithms on the ACC, ARI, and NMI metrics and provides more stable clustering results. The algorithm effectively solves the attribute skew problem and can discover the true distribution of mixed-attribute datasets in a fully adaptive manner, significantly improving clustering efficiency.
Keywords (Chinese)
信息熵/谱聚类/混合属性数据集/自然近邻/自适应
Key words
information entropy/spectral clustering/mixed-attribute dataset/natural neighbors/adaptive
Classification
Information technology and security science
Cite this article
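Chen Yu (陈玉), Wang Yuxin (汪宇欣), Su Huan (苏欢). Adaptive spectral clustering algorithm for mixed attribute dataset [J]. Journal of Shaanxi University of Science & Technology, 2025, 43(5): 200-206, 7.
Funding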
General Program of the Shandong Provincial Natural Science Foundation (ZR202102220411)
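
Illustrative sketch

The abstract describes the pipeline only in outline (entropy-based balancing of categorical attributes, a per-sample scale parameter, a Gaussian-kernel similarity matrix), without formulas. The Python sketch below is therefore not the authors' IEMN-SC; it shows one generic way these pieces could fit together under explicit assumptions. The function names, the weight `gamma`, and the mixed distance form are hypothetical. The per-sample scale uses the distance to the k-th nearest neighbor (local scaling) as a stand-in for the shared-natural-neighbor radius, and the final eigen-decomposition step of spectral clustering is omitted.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mixed_distance(a, b, num_idx, cat_idx, gamma):
    """Squared Euclidean term on numeric attributes plus a gamma-weighted
    simple-matching mismatch count on categorical attributes.
    ASSUMPTION: this form is illustrative, not taken from the paper."""
    num = sum((a[i] - b[i]) ** 2 for i in num_idx)
    cat = sum(a[i] != b[i] for i in cat_idx)
    return math.sqrt(num + gamma * cat)


def adaptive_affinity(rows, num_idx, cat_idx, gamma, k=1):
    """Gaussian-kernel affinity matrix with a per-sample scale sigma_i.

    ASSUMPTION: sigma_i is the distance to the k-th nearest neighbour,
    standing in for a shared-natural-neighbour radius. Requires k < len(rows).
    """
    n = len(rows)
    d = [[mixed_distance(rows[i], rows[j], num_idx, cat_idx, gamma)
          for j in range(n)] for i in range(n)]
    # sorted(d[i])[0] is the self-distance 0.0; guard against a zero scale
    # when duplicate samples exist.
    sigma = [sorted(d[i])[k] or 1e-12 for i in range(n)]
    return [[0.0 if i == j else math.exp(-d[i][j] ** 2 / (sigma[i] * sigma[j]))
             for j in range(n)] for i in range(n)]


# Toy data: one numeric attribute and one categorical attribute per sample.
rows = [(0.1, "a"), (0.2, "a"), (0.9, "b"), (1.0, "b")]
h_cat = shannon_entropy([r[1] for r in rows])
# Using the categorical entropy directly as gamma is a placeholder choice.
W = adaptive_affinity(rows, num_idx=[0], cat_idx=[1], gamma=h_cat, k=1)
```

The resulting matrix `W` could be passed to any spectral clustering routine that accepts a precomputed affinity matrix. How the paper combines the numerical and categorical entropies into its balancing difference factor is not stated in the abstract, so `gamma` is left as a free parameter here.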