计算机工程与应用 (Computer Engineering and Applications), 2017, Issue (10): 1-7, 7. DOI: 10.3778/j.issn.1002-8331.1612-0497
Data random access method oriented to HDFS (一种面向HDFS的数据随机访问方法)
Abstract
To simplify its implementation, HDFS sacrifices random file access in favor of streaming access to large data sets. In practice, however, many applications require random access to files. Based on an in-depth analysis of how HDFS reads and writes data, a random data access method for HDFS is proposed. The idea is to add a data access interface for Blocks on the Datanode, so that a user program can read the Block files stored on the Datanode and write data to the Block storage directory. The user program writes the first file replica to the local Datanode; the remaining replicas are produced by copying the first replica and are stored on other Datanodes. In addition, permission management is added for Blocks: the file replicas stored on the Datanodes belong to the user, and when a file's permissions change in the namespace, the permissions of its Blocks change as well. Test results show that data read and write performance improve by about 10% and 20%, respectively, and that write performance can increase by 2.5 times under high concurrency.
Keywords
Hadoop Distributed File System / random access / permission management
Classification
Information technology and security science
Cite this article
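Li Qiang, Sun Zhenyu, Sun Gongxing (李强, 孙震宇, 孙功星). Data random access method oriented to HDFS [J]. Computer Engineering and Applications, 2017, (10): 1-7, 7.
Funding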
National Natural Science Foundation of China (No. 11375223, No. 11375221).
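
Illustrative sketch
The abstract describes three things: reading a Block file at an arbitrary offset on the local Datanode, writing data into the Block storage directory, and keeping Block permissions in step with the file's permissions in the namespace. The following is a minimal local-filesystem sketch of those ideas only. It is not the authors' implementation and uses no HDFS API: a temporary directory stands in for the Block storage directory, a plain file stands in for a Block file, and the names `read_block_range`, `write_block_range`, `sync_block_permissions`, and `blk_example` are hypothetical.

```python
import os
import stat
import tempfile


def write_block_range(block_path, offset, data):
    """Overwrite bytes at `offset` in a local block file (hypothetical helper)."""
    # "r+b" opens the file for in-place update without truncating it.
    with open(block_path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def read_block_range(block_path, offset, length):
    """Read `length` bytes starting at `offset` (hypothetical helper)."""
    with open(block_path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def sync_block_permissions(block_path, mode):
    """Apply the namespace file's permission bits to the block file.

    Assumption: the caller already knows the new mode of the file
    in the namespace and passes it in.
    """
    os.chmod(block_path, mode)


def demo():
    # A temporary directory stands in for a Datanode's block storage directory.
    with tempfile.TemporaryDirectory() as storage_dir:
        block_path = os.path.join(storage_dir, "blk_example")  # hypothetical name
        with open(block_path, "wb") as f:
            f.write(b"0123456789")

        # Random write in the middle of the block, then a random read around it.
        write_block_range(block_path, 4, b"XY")
        middle = read_block_range(block_path, 3, 4)

        # Mirror a permission change made in the namespace onto the block file.
        sync_block_permissions(block_path, 0o640)
        mode = stat.S_IMODE(os.stat(block_path).st_mode)
        return middle, mode
```

Calling `demo()` returns the bytes read back around the overwritten range together with the block file's permission bits after the change. Replication of the first replica to other Datanodes is deliberately left out, since the abstract does not say how the copy is triggered.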