Computer Applications and Software, 2016, Vol. 33, Issue 8: 62-66, 5. DOI: 10.3969/j.issn.1000-386x.2016.08.014
CLASSIFICATION ALGORITHM RECOMMENDATION BASED ON DATASET SIMILARITY
Abstract
In recent years, as demand for big data analysis has grown rapidly, classification algorithms have come into increasingly wide use, and selecting an appropriate classification algorithm for users has become an urgent practical problem in data mining. Related studies have shown that the same classifier performs similarly on similar datasets. Based on this observation, we build a classification algorithm selection model based on dataset similarity. The model first extracts features from each dataset using a dataset discretisation method to create a sample base, and then recommends appropriate classification algorithms for a dataset according to the principle of nearest-neighbour similarity. Experiments on datasets from the UCI repository show that the recommended classifiers perform well in most cases.
Keywords
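Data mining / Classification algorithm / Dataset features / Performance evaluation / Algorithm recommendation
Classification
Information technology and security science
Cite this article
Li Hongqi, Xu Qingsong, Zhu Liping, Qi Xuechen. Classification algorithm recommendation based on dataset similarity [J]. Computer Applications and Software, 2016, 33(8): 62-66, 5.
Funding
National Major Science and Technology Project of the 12th Five-Year Plan (2011ZX05020-009); Beijing Key Laboratory Ladder Plan Project (Z121109009212008).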
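The recommendation procedure summarized in the abstract (extract dataset features, store them in a sample base, then recommend by nearest-neighbour similarity) might be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the discretisation-based features, so `extract_features` substitutes a few simple meta-features (log sizes, class count, normalized class entropy) as an assumption, and the function names, the Euclidean distance, and the sample-base layout are all hypothetical.

```python
import math
from collections import Counter


def extract_features(rows, labels):
    """Assumed stand-in meta-features; NOT the paper's discretisation method."""
    n = len(rows)                      # number of instances
    d = len(rows[0])                   # number of attributes
    counts = Counter(labels)
    k = len(counts)                    # number of classes
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    norm_entropy = entropy / math.log2(k) if k > 1 else 0.0
    return [math.log10(n), math.log10(d), float(k), norm_entropy]


def distance(a, b):
    """Euclidean distance between two feature vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend(sample_base, features, k=1):
    """Return the best-known classifiers of the k most similar datasets,
    nearest first, without duplicates."""
    ranked = sorted(sample_base, key=lambda e: distance(e["features"], features))
    recommended = []
    for entry in ranked[:k]:
        if entry["best_classifier"] not in recommended:
            recommended.append(entry["best_classifier"])
    return recommended


# Toy placeholder sample base; names and values are invented for illustration.
sample_base = [
    {"name": "toy_a", "features": [0.6, 0.3, 2.0, 1.0],
     "best_classifier": "naive_bayes"},
    {"name": "toy_b", "features": [3.0, 1.5, 10.0, 0.5],
     "best_classifier": "decision_tree"},
]

new_rows = [[0, 1], [1, 0], [1, 1], [0, 0]]
new_labels = ["a", "a", "b", "b"]
print(recommend(sample_base, extract_features(new_rows, new_labels)))
```

In practice the sample base would be built by running each candidate classifier on every stored dataset and recording the best performer, and the features would likely need scaling so that no single dimension dominates the distance.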