Agricultural Sciences in China, 2008, Vol. 7, Issue 2: 129-139 (11 pages).
Comparison of Supervised Clustering Methods for the Analysis of DNA Microarray Expression Data
Abstract
Several typical supervised clustering methods, namely Gaussian mixture model-based supervised clustering (GMM), k-nearest-neighbor (KNN), binary support vector machines (SVMs), and multiclass support vector machines (MC-SVMs), were used to classify computer-simulated data and two real microarray expression datasets. False positives (FP), false negatives (FN), true positives (TP), true negatives (TN), clustering accuracy, and Matthews' correlation coefficient (MCC) were compared among these methods. The results are as follows: (1) In classifying thousands of gene expression data, the two GMM methods have the highest clustering accuracy and the fewest overall FP+FN errors, under the assumption that the whole set of microarray data is a finite mixture of multivariate Gaussian distributions. Furthermore, when the number of training samples is very small, the GMM-II method is more accurate than the GMM-I method. (2) In general, the MC-SVMs are more robust and more practical: they are less sensitive to the curse of dimensionality, they are second only to the GMM methods in clustering accuracy on thousands of gene expression data, and they are more robust than the other techniques to a small number of high-dimensional gene expression samples. (3) Among the MC-SVMs, OVO and DAGSVM perform better on large sample sizes, whereas the five MC-SVM methods perform very similarly on moderate sample sizes. When sample sizes are small, OVR, WW and CS yield better results. It is therefore recommended that at least two candidate methods, chosen on the basis of the features of the real data and the experimental conditions, be run and compared to obtain a better clustering result.
Keywords
microarray, supervised clustering, k-nearest-neighbor (KNN), support vector machines (SVMs)
Classification
Agricultural science and technology
Cite this article
XIAO Jing, WANG Xue-feng, YANG Ze-feng, XU Chen-wu. Comparison of Supervised Clustering Methods for the Analysis of DNA Microarray Expression Data[J]. Agricultural Sciences in China, 2008, 7(2): 129-139 (11 pages).
Funding
This research was supported by the National Natural Science Foundation of China (30370758) and the Program for New Century Excellent Talents in Universities (NCET) of the Ministry of Education to Dr. Xu Chenwu (NCET-05-0502).
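
Note on the evaluation measures
The abstract compares methods by clustering accuracy and Matthews' correlation coefficient (MCC), both derived from the TP, TN, FP and FN counts. The sketch below shows the standard textbook definitions of the two measures. It is not the authors' code. The function names, the example counts, and the convention of returning 0.0 when the MCC denominator is zero are assumptions made for illustration.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return (tp + tn) / total


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard MCC for a two-class confusion table.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Assumption: return 0.0 when any marginal sum is zero, a common
    convention for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    counts = dict(tp=40, tn=45, fp=5, fn=10)
    print(f"accuracy = {accuracy(**counts):.3f}")
    print(f"MCC      = {matthews_corrcoef(**counts):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion table, so it remains informative when the two classes differ greatly in size.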