Analysis and Study of the C4.5 Algorithm Based on the Hadoop Platform

孙媛 (Sun Yuan), 黄刚 (Huang Gang)

计算机技术与发展 (Computer Technology and Development), 2014, Issue (11): 83-86, 90 (5 pages). DOI: 10.3969/j.issn.1673-629X.2014.11.021


Author information

  • 1. College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003

Abstract

Extracting valuable information from vast amounts of data more quickly, efficiently, and cheaply has become a new challenge for data mining technology. Based on a study of the characteristics of the Hadoop platform and the workflow of the C4.5 decision tree algorithm, this paper introduces cloud computing ideas into the field of decision tree algorithms: it parallelizes C4.5 on the Hadoop platform and uses the MapReduce model to address the problem of mining massive data. The new algorithm is verified on a golf data set, and the experimental results show that, for huge amounts of data, the Hadoop-based decision tree algorithm significantly improves the efficiency of data mining and scales well. To a certain extent, it also addresses the heavy computation and long tree-construction time that C4.5 faces when processing large amounts of data.
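The abstract does not spell out how the parallelization is organized, so the following is only a minimal sketch of one common way to fit C4.5's attribute selection into a map/reduce pattern, not the authors' implementation. It assumes that a map step emits an (attribute, value, class) key per record, a reduce step sums the counts, and the gain ratio is then computed from those counts. The records, the names `map_record`, `reduce_counts`, and `gain_ratios`, and the in-process simulation (no Hadoop involved) are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

# Toy records (hypothetical, NOT the paper's data set).
RECORDS = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "true", "play": "yes"},
]
LABEL = "play"  # name of the class column (assumption)


def map_record(record):
    """Map step: emit ((attribute, value, class), 1) for each attribute."""
    for attr, value in record.items():
        if attr != LABEL:
            yield (attr, value, record[LABEL]), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def entropy(counts):
    """Shannon entropy (base 2) of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def gain_ratios(totals):
    """Compute the C4.5 gain ratio of every attribute from reduced counts."""
    # attribute -> value -> class -> count
    table = defaultdict(lambda: defaultdict(Counter))
    for (attr, value, cls), n in totals.items():
        table[attr][value][cls] += n

    ratios = {}
    for attr, by_value in table.items():
        class_totals = Counter()
        for cls_counts in by_value.values():
            class_totals.update(cls_counts)
        n = sum(class_totals.values())
        sizes = [sum(c.values()) for c in by_value.values()]
        # Information gain = class entropy minus weighted entropy per value.
        conditional = sum(
            size / n * entropy(c.values())
            for size, c in zip(sizes, by_value.values())
        )
        gain = entropy(class_totals.values()) - conditional
        split_info = entropy(sizes)
        ratios[attr] = gain / split_info if split_info else 0.0
    return ratios


pairs = [kv for record in RECORDS for kv in map_record(record)]
totals = reduce_counts(pairs)
ratios = gain_ratios(totals)
best_attribute = max(ratios, key=ratios.get)
```

On this toy data, `best_attribute` is `outlook`, since it separates the classes better than `windy`. In a real Hadoop job the map and reduce functions would run on separate nodes over file splits; here they are ordinary Python functions so the counting logic can be checked in isolation.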

Key words

Hadoop/MapReduce/data mining/C4.5 algorithm

Classification

Information Technology and Security Science

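Cite this article

孙媛 (Sun Yuan), 黄刚 (Huang Gang). Analysis and Study of the C4.5 Algorithm Based on the Hadoop Platform [J]. 计算机技术与发展 (Computer Technology and Development), 2014, (11): 83-86, 90, 5.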

Funding

Supported by the National Natural Science Foundation of China (61171053)

计算机技术与发展 (Computer Technology and Development)

OA, CSTPCD

ISSN 1673-629X
