Journal of Guilin University of Electronic Technology, 2012, Vol. 32, Issue (4): 307-312, 6.
Large-scale data index construction and cluster efficiency analysis based on Hadoop MapReduce
Abstract
To meet a search engine's time and space requirements and to build a distributed index efficiently, Hadoop is used to set up a distributed cluster environment, and an inverted index over large-scale data is built with the MapReduce programming model. The performance of the Hadoop cluster is evaluated under different network bandwidths, data volumes, and numbers of cluster nodes. Experimental results show that the greater the network bandwidth, the higher the cluster's processing efficiency; and the more nodes the cluster has, the stronger its ability to handle large data. Because the performance of a Hadoop cluster is affected by network communication bandwidth, high-speed cluster links can improve cluster performance.
Keywords
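MapReduce / inverted index / Hadoop cluster
Classification
Information technology and security science
Cite this article
谌超, 强保华, 石龙. Large-scale data index construction and cluster efficiency analysis based on Hadoop MapReduce [J]. Journal of Guilin University of Electronic Technology, 2012, 32(4): 307-312, 6.
Funding
National Natural Science Foundation of China (61163057)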
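Illustrative sketch
The abstract states that the inverted index is built with MapReduce but does not give the implementation. The following is a minimal in-memory sketch of how such a job is commonly structured, not the authors' code: it simulates the map, shuffle, and reduce steps in plain Python rather than calling any Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `build_inverted_index`), the tokenization rule, and the posting-list format (document id with term frequency) are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step (hypothetical): emit one (term, doc_id) pair per token."""
    for term in re.findall(r"[a-z0-9]+", text.lower()):
        yield term, doc_id


def shuffle(pairs):
    """Group doc ids by term, standing in for the framework's shuffle/sort."""
    grouped = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term, doc_ids):
    """Reduce step (hypothetical): build a posting list of (doc_id, frequency)."""
    counts = defaultdict(int)
    for doc_id in doc_ids:
        counts[doc_id] += 1
    return term, sorted(counts.items())


def build_inverted_index(docs):
    """Run map, shuffle, and reduce over a dict of {doc_id: text}."""
    pairs = [pair for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text)]
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())


if __name__ == "__main__":
    sample = {"d1": "Hadoop cluster index", "d2": "index of index"}
    for term, postings in sorted(build_inverted_index(sample).items()):
        print(term, postings)
```

On a real cluster the map and reduce functions would run on different nodes and the shuffle would move intermediate pairs over the network, which is consistent with the abstract's finding that network bandwidth affects cluster performance.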