Journal of Frontiers of Computer Science and Technology, Issue (10): 1153-1162, 10. DOI: 10.3778/j.issn.1673-9418.1412057
OpenMP performance analysis of CFD application on Intel multicore and manycore architectures
Abstract
Multicore and manycore processors are becoming mainstream architectures in high performance computing, and OpenMP programming is one of the primary ways to exploit their parallel computing capabilities. Using a systematic approach that combines hardware performance counter based measurement with model based analysis, this paper evaluates the OpenMP performance of a real-world CFD (computational fluid dynamics) application based on high order structured grids on two platforms: Xeon E5 Sandy Bridge, an Intel multicore processor, and Knights Corner, an Intel many integrated core coprocessor. The paper analyzes how the OpenMP library cost, the load balance among OpenMP threads, and the memory bandwidth affect the application's performance. The results show that the redundant computation introduced by OpenMP parallel programming is not significant, that the serial portion and the load imbalance significantly affect the parallel efficiency, and that memory access bandwidth significantly affects the achieved floating point performance. The paper also compares the performance of the two architectures and discusses directions for further performance tuning.
Key words
multicore/many integrated core/CFD application/OpenMP/performance analysis
Classification
Information technology and security science
Cite this article
Che Yonggang, Zhang Lilun, Wang Yongxian, Xu Chuanfu, Cheng Xinghua. OpenMP performance analysis of CFD application on Intel multicore and manycore architectures[J]. Journal of Frontiers of Computer Science and Technology, 2015, (10): 1153-1162, 10.
Funding
The National Natural Science Foundation of China under Grant Nos. 60603055, 11272352