Text Clustering Evaluation Method Based on Parameter Estimation of Distances within Clusters (基于类内距离参数估计的文本聚类评价方法)

Niu Fenggao (牛奉高), Zhang Rongjie (张荣杰)

Journal of Shanxi University (Natural Science Edition), 2018, Vol. 41, Issue 2: 256-266, 11. DOI: 10.13451/j.cnki.shanxi.univ(nat.sci.).2018.02.002

Text Clustering Evaluation Method based on Parameter Estimation of Distances within Clusters

Author information

  • 1. School of Mathematical Sciences, Shanxi University, Taiyuan 030006, Shanxi

Abstract

The text clustering evaluation method based on parameter estimation of within-cluster distances starts from the observation that within-cluster distances approximately follow a normal distribution, and uses maximum likelihood estimation to estimate the parameters of that distribution. From the estimates, a reasonable range for the within-class distance is determined, and text vectors that fall outside this range are adjusted according to the magnitude of their within-class distances. The adjusted result must still be validated with clustering evaluation indexes. This paper verifies that the method is feasible when the number of clusters is smaller than, or equal to, the true number of classes. It also shows that using the method to adjust the clustering results produced by the K-means algorithm weakens the influence of the initial cluster-center selection on K-means and improves the accuracy of the clustering results.
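The adjustment step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance measure, how the range is built from the estimates, or how out-of-range vectors are reassigned. The sketch assumes Euclidean distance to the cluster centroid, an upper bound of `mu + width * sigma`, and reassignment to the nearest centroid; the names `mle_normal`, `adjust_labels` and `width` are hypothetical.

```python
import math

def mle_normal(distances):
    """MLE of (mean, std) for a normal sample; the variance divides by n, not n - 1."""
    n = len(distances)
    mu = sum(distances) / n
    sigma = math.sqrt(sum((d - mu) ** 2 for d in distances) / n)
    return mu, sigma

def euclidean(u, v):
    """Euclidean distance (an assumption; the paper may use another measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def centroids_of(vectors, labels):
    """Mean vector of each cluster, keyed by cluster label."""
    groups = {}
    for vec, lab in zip(vectors, labels):
        groups.setdefault(lab, []).append(vec)
    return {lab: [sum(col) / len(vecs) for col in zip(*vecs)]
            for lab, vecs in groups.items()}

def adjust_labels(vectors, labels, width=2.0):
    """Reassign vectors whose within-cluster distance exceeds mu + width * sigma.

    Assumed for illustration, not taken from the paper: the one-sided bound,
    the default width of 2, and reassignment to the nearest centroid.
    """
    cents = centroids_of(vectors, labels)
    dists = [euclidean(v, cents[l]) for v, l in zip(vectors, labels)]
    # Per-cluster upper bound from the MLE of the distance distribution.
    upper = {}
    for lab in cents:
        mu, sigma = mle_normal([d for d, l in zip(dists, labels) if l == lab])
        upper[lab] = mu + width * sigma
    new_labels = []
    for vec, lab, d in zip(vectors, labels, dists):
        if d > upper[lab]:
            # Out-of-range vector: move it to the cluster with the closest centroid.
            lab = min(cents, key=lambda c: euclidean(vec, cents[c]))
        new_labels.append(lab)
    return new_labels
```

Calling `adjust_labels(vectors, labels)` on the output of a K-means run returns a new label list. As the abstract notes, the adjusted result would still need to be checked with a clustering evaluation index before it is accepted.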

Key words

within-class distance / maximum likelihood estimation / clustering evaluation / K-means algorithm / clustering adjustment

Classification

Social sciences

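Cite this article

Niu Fenggao, Zhang Rongjie. Text Clustering Evaluation Method Based on Parameter Estimation of Distances within Clusters [J]. Journal of Shanxi University (Natural Science Edition), 2018, 41(2): 256-266, 11.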

Funding

National Natural Science Foundation of China (71503151)

Program for Innovative Talents of Higher Education Institutions of Shanxi Province (2016052006)

Journal of Shanxi University (Natural Science Edition)

OA / Peking University Core Journal / CSTPCD

ISSN: 0253-2395
