Design of Web log mining scheme based on Hadoop

XU Kangzhen (许抗震), WU Yun (吴云)

Modern Electronics Technique (现代电子技术), 2017, Vol. 40, Issue 9: 115-120 (6 pages). DOI: 10.16652/j.issn.1004-373x.2017.09.031


Author information

  • 1. College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China

Abstract

An approach to mining exponentially growing Web log data is proposed, and a highly reliable Web log data mining scheme is designed. Using a publicly available Web log dataset, a MapReduce-based filtering algorithm is implemented in the data preprocessing stage so that service information supporting enterprise decision-making can be mined. The platform built with this scheme is optimized, which improves its performance by 3.26%. Three aspects are tested: the reliability of the scheme, the effect of the number of log files on the platform's I/O speed, and the query performance of the platform compared with a single machine. The results show that the scheme is reliable. As the number of log files doubles, the time cost of read operations increases by 52.58% on average and the time cost of write operations increases by 79.69%. As the log volume grows, the query time on the single machine rises rapidly, while the query time on the platform remains stable. As machine nodes are added, the computation time decreases by 8.87% on average.
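To make the idea of a MapReduce-based filter in the preprocessing stage concrete, here is a minimal in-process sketch in Python. It is not the authors' algorithm: the Apache Common Log Format input, the cleaning rules (drop malformed lines, static resources, and non-2xx responses), and the names `map_filter`, `reduce_counts`, and `run_job` are all assumptions made for illustration. A real Hadoop job would distribute the map and reduce phases across nodes instead of running them in one process.

```python
import re
from collections import Counter
from itertools import chain

# Assumed input format: Apache Common Log Format. The abstract does not
# state the dataset's actual format.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

# Hypothetical cleaning rule: requests for static resources are noise.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def map_filter(line):
    """Map phase: emit (url, 1) for records that survive cleaning."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return []  # malformed record, dropped
    url = match.group("url").split("?", 1)[0]  # strip the query string
    if url.lower().endswith(STATIC_SUFFIXES):
        return []  # static resource, dropped
    if not match.group("status").startswith("2"):
        return []  # failed or redirected request, dropped
    return [(url, 1)]


def reduce_counts(pairs):
    """Reduce phase: sum the counts emitted for each URL."""
    totals = Counter()
    for url, count in pairs:
        totals[url] += count
    return dict(totals)


def run_job(lines):
    """Run map then reduce in one process (no distribution)."""
    mapped = chain.from_iterable(map_filter(line) for line in lines)
    return reduce_counts(mapped)
```

Calling `run_job` on an iterable of raw log lines returns a dictionary that maps each surviving page URL to its hit count, the kind of per-page summary that later mining steps could build on.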

Key words

Web log/data mining/data filtering (data cleaning)/Hadoop/MySQL

Classification

Information Technology and Security Science

Cite this article

XU Kangzhen, WU Yun. Design of Web log mining scheme based on Hadoop[J]. Modern Electronics Technique, 2017, 40(9): 115-120.

Funding

National Natural Science Foundation of China (NSF61370161)

Science and Technology Foundation of Guizhou Province (黔科合J字[2010]2100)

Doctoral Foundation of Guizhou University (贵大人基合字(2009)029)

Modern Electronics Technique (现代电子技术)

OA / Peking University Core Journal (北大核心) / CSTPCD

ISSN 1004-373X
