Computer Engineering and Applications, 2016, Vol. 52, Issue 17: 79-83, 159, 6. DOI: 10.3778/j.issn.1002-8331.1601-0234
Study of a multi-keyword sorting method based on Hadoop
Abstract
Sorting big data by multiple keywords takes a long time on a single machine. To improve sorting efficiency, two multi-keyword sorting methods are proposed based on the MapReduce model of Hadoop. In the first method, the Reduce function uses a chain radix sort algorithm to sort big data by multiple keywords in parallel, which improves sorting efficiency by using multiple nodes. In the second method, a composite key and a comparator are defined so that records are compared on multiple keywords at the byte level, which saves the time otherwise spent deserializing objects. The performance of the two methods is tested experimentally. The results show that both methods achieve high sorting efficiency and good scalability.
Keywords
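Hadoop / MapReduce model / multi-keyword sort / radix sort
Classification
Information technology and security science
Cite this article
ZHOU Guojun (周国军). Study of a multi-keyword sorting method based on Hadoop [J]. Computer Engineering and Applications, 2016, 52(17): 79-83, 159, 6.
Funding
Guangxi Natural Science Foundation (No. 2014GXNSFBA118015); Yulin Normal University university-level research project (No. 2014YJYB03).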
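Illustrative sketch
The second method in the abstract compares composite keys byte by byte instead of deserializing them into objects. The following is a minimal standalone Java sketch of that idea, not the paper's implementation. It assumes a hypothetical composite key made of two 32-bit integers serialized in big-endian order, and the names CompositeKeyRawCompare, serialize, compareRaw, and sortPairs are invented for illustration. In a Hadoop job, logic of this kind would typically sit in a raw comparator whose compare method receives byte buffers and offsets; plain Java is used here so the example runs without the Hadoop libraries.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical example class; the key layout is an assumption, not taken from the paper.
class CompositeKeyRawCompare {
    // Assumed layout: first keyword (4 bytes) followed by second keyword (4 bytes), big-endian.
    static final int KEY_LEN = 8;

    // Serialize a two-field composite key into its byte form.
    static byte[] serialize(int first, int second) {
        return ByteBuffer.allocate(KEY_LEN).putInt(first).putInt(second).array();
    }

    // Decode a big-endian int straight from the buffer; no key object is created.
    static int readInt(byte[] b, int off) {
        return ((b[off] & 0xff) << 24) | ((b[off + 1] & 0xff) << 16)
             | ((b[off + 2] & 0xff) << 8) | (b[off + 3] & 0xff);
    }

    // Compare two serialized keys in place: first keyword, then second keyword on ties.
    static int compareRaw(byte[] b1, int s1, byte[] b2, int s2) {
        int c = Integer.compare(readInt(b1, s1), readInt(b2, s2));
        if (c != 0) {
            return c;
        }
        return Integer.compare(readInt(b1, s1 + 4), readInt(b2, s2 + 4));
    }

    // Sort (first, second) pairs by working only on their serialized form.
    static int[][] sortPairs(int[][] pairs) {
        byte[][] keys = new byte[pairs.length][];
        for (int i = 0; i < pairs.length; i++) {
            keys[i] = serialize(pairs[i][0], pairs[i][1]);
        }
        Arrays.sort(keys, (a, b) -> compareRaw(a, 0, b, 0));
        int[][] out = new int[pairs.length][];
        for (int i = 0; i < keys.length; i++) {
            // Decoding here is only for display; the sort itself never deserialized a key.
            out[i] = new int[] { readInt(keys[i], 0), readInt(keys[i], 4) };
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] sorted = sortPairs(new int[][] { {2, 5}, {1, 9}, {2, -3}, {1, 4} });
        System.out.println(Arrays.deepToString(sorted)); // [[1, 4], [1, 9], [2, -3], [2, 5]]
    }
}
```

The fields are decoded as signed integers rather than compared as unsigned bytes, because a plain lexicographic byte comparison would order negative values after positive ones.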