Computer Engineering and Applications, 2018, Vol. 54, Issue 13: 84-87, 174, 5. DOI: 10.3778/j.issn.1002-8331.1706-0116
Research on scan scheduling in MPP databases on distributed file systems
Abstract
MPP (Massively Parallel Processing) databases over distributed file systems have become a current research hotspot. To improve the procedure that schedules execution units to read data blocks before a query scan is executed, this paper proposes NLS, a scheduling strategy that combines data locality with node workload. First, a data-locality allocation phase ensures that the scheduling result has good data locality. Then, the intermediate scheduling result is reallocated according to node workload, which reduces the makespan of scanning data. Experimental results show that, compared with the continuity strategy FCS, NLS keeps data locality above 90%. Moreover, across all nine test cases, the makespan improves by up to 32% and by 25% on average.
Keywords
distributed file system/database/query scheduling/workload optimization
Classification
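Information Technology and Security Science
Cite this article
Guo Kai, Gong Caixin, Gong Yili, Lei Yingchun. Research on scan scheduling in MPP databases on distributed file systems [J]. Computer Engineering and Applications, 2018, 54(13): 84-87, 174, 5.
Funding
National Natural Science Foundation of China, Youth Science Fund (No. 61100020)
National Natural Science Foundation of China, General Program (No. 61572373)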