Computer Engineering & Science, 2017, Vol. 39, Issue (5): 829-833, 5. DOI: 10.3969/j.issn.1007-130X.2017.05.002
Performance optimization of stencil computation on ARM64 multi-core microprocessor
Abstract
Stencil computation is an important class of computational kernels, widely used in fields ranging from image and video processing to large-scale scientific and engineering simulation. However, evaluations of stencil computation on ARM64 high-performance processors are rare. Based on the features of the AMCC X-GENE2 and the Phytium FT-1500A, we design an optimization method based on two-dimensional binding, which reduces the parallel overhead of thread scheduling and increases the cache hit rate by binding threads to CPUs and binding threads to data blocks. Experimental results show that this method improves the performance of stencil computation on the ARM64 architecture, and that our kernel scales well on the two ARM64 multi-core microprocessor platforms.
Keywords
stencil computation / ARM64 / AMCC X-GENE2 / FT-1500A / parallelization / thread binding
Classification
Information Technology and Security Science
Cite this article
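Feng Luxia, Li Chunjiang, Huang Yabin. Performance optimization of stencil computation on ARM64 multi-core microprocessor [J]. Computer Engineering & Science, 2017, 39(5): 829-833, 5.
Funding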
National Natural Science Foundation of China (61170046)
National 863 Program of China (2012AA010903)
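Illustrative sketch
The abstract describes binding along two dimensions: each thread is bound to a CPU, and each thread is bound to a data block. The Python sketch below shows one way those two mappings could fit together. It is not the authors' implementation. The 5-point Jacobi stencil, the row-block partition, and the names `block_rows`, `pin_to_cpu` and `jacobi_bound` are all assumptions made for illustration. Python's global interpreter lock means this code does not run faster with more threads; it only demonstrates the binding structure.

```python
import os
import threading


def block_rows(n_rows, n_threads, tid):
    """Return the fixed [start, end) range of interior rows owned by thread tid."""
    interior = n_rows - 2                      # rows 1 .. n_rows-2 are updated
    base, extra = divmod(interior, n_threads)
    start = 1 + tid * base + min(tid, extra)
    end = start + base + (1 if tid < extra else 0)
    return start, end


def pin_to_cpu(cpu):
    """Thread-CPU binding: restrict the calling thread to one core (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        return False                           # platform has no affinity call
    try:
        os.sched_setaffinity(0, {cpu})         # 0 means the calling thread
        return True
    except OSError:
        return False


def jacobi_bound(grid, steps, n_threads):
    """Run `steps` 5-point Jacobi sweeps with fixed thread-core and thread-block maps."""
    n_rows, n_cols = len(grid), len(grid[0])
    bufs = [[row[:] for row in grid], [row[:] for row in grid]]
    barrier = threading.Barrier(n_threads)
    if hasattr(os, "sched_getaffinity"):
        cpus = sorted(os.sched_getaffinity(0))
    else:
        cpus = [0]

    def worker(tid):
        pin_to_cpu(cpus[tid % len(cpus)])      # dimension 1: thread -> CPU
        lo, hi = block_rows(n_rows, n_threads, tid)  # dimension 2: thread -> block
        for step in range(steps):
            a, b = bufs[step % 2], bufs[(step + 1) % 2]
            for i in range(lo, hi):            # same rows in every sweep
                for j in range(1, n_cols - 1):
                    b[i][j] = 0.25 * (a[i - 1][j] + a[i + 1][j]
                                      + a[i][j - 1] + a[i][j + 1])
            barrier.wait()                     # all blocks done before buffers swap

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bufs[steps % 2]
```

Because every thread keeps the same core and the same rows across all sweeps, the rows it updated in one sweep are the ones it reads in the next, which is the cache-reuse effect the abstract attributes to the binding. In C, the thread-to-CPU step would more likely use `pthread_setaffinity_np` or OpenMP's `OMP_PROC_BIND` setting; which mechanism the paper uses is not stated in this record.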