

Parallel computing performance of distributed hydrological model accelerated by GPU


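To address the long computation times and slow simulation speed of physically based distributed hydrological models when applied to large basins and long time series, GPU-based parallel computing was introduced to parallelize the runoff generation process of the distributed hydrological model WEP-L (water and energy transfer processes in large river basins). The Poyang Lake basin was selected as the experimental area, and algorithm performance was tested on an NVIDIA RTX A4000 with compute capability 8.6. The results show that the proposed GPU-based parallel algorithm for the distributed hydrological model achieves a good acceleration effect: the closer the total number of threads is to the number of delineated sub-basins (the computational workload), the better the parallel performance. With 8,712 sub-basin units in the WEP-L model of the experimental basin, the maximum speedup reached about 2.5. As the computational workload increased, the speedup gradually increased; when the number of sub-basin units in the WEP-L model of the experimental basin increased to 24,897, the speedup reached 3.5, indicating that the GPU parallel algorithm has good potential for distributed hydrological modelling of large-scale basins.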

As distributed hydrological models have developed towards larger watersheds and finer granularity, computational efficiency has gradually become a bottleneck, and parallel computing has emerged as an effective solution to this challenge. Most existing studies on parallel computing for distributed hydrological models have focused on CPU-based techniques, with relatively limited research on GPU-based methods. Furthermore, investigations on distributed hydrological models that incorporate physical mechanisms remain scarce. This study centered on the physically based distributed hydrological model WEP-L (water and energy transfer processes in large river basins) and explored the use of GPU-based parallel computing. From a spatial perspective, the WEP-L model divides the watershed into numerous sub-basin units whose runoff calculations are independent of one another, which offers spatial parallelism. The interdependencies between simulation units were taken into account when allocating jobs to multiple computing units for parallel execution. Consequently, the runoff generation process of the model was parallelized by sub-basin: the Poyang Lake basin was divided into 8,712 sub-units, and GPU threads executed the computations in parallel through kernel functions.
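The sub-basin-to-thread mapping described above can be sketched in plain Python. This is a minimal CPU emulation under stated assumptions, not the authors' implementation: the block size of 256, the function names `launch_config`, `runoff_kernel`, and `run`, and the placeholder formula (rainfall times a runoff coefficient) are all hypothetical and stand in for the actual WEP-L runoff physics and CUDA kernel.

```python
import math


def launch_config(n_subbasins, threads_per_block=256):
    """Return (blocks, total_threads) so that every sub-basin gets one thread.

    threads_per_block=256 is an assumed value, not one reported in the study.
    """
    blocks = math.ceil(n_subbasins / threads_per_block)
    return blocks, blocks * threads_per_block


def runoff_kernel(thread_id, n_subbasins, rainfall, runoff_coeff, out):
    """Emulate one GPU thread handling one sub-basin.

    The formula is a placeholder; WEP-L's physically based runoff
    computation is far more involved.
    """
    # Bounds guard: surplus threads past the last sub-basin do nothing.
    if thread_id < n_subbasins:
        out[thread_id] = rainfall[thread_id] * runoff_coeff[thread_id]


def run(rainfall, runoff_coeff, threads_per_block=256):
    """Run the emulated kernel once for every thread in the launch grid."""
    n = len(rainfall)
    _, total_threads = launch_config(n, threads_per_block)
    out = [0.0] * n
    # On a GPU these iterations would execute concurrently; each one
    # writes only its own element, so no ordering is required.
    for thread_id in range(total_threads):
        runoff_kernel(thread_id, n, rainfall, runoff_coeff, out)
    return out


if __name__ == "__main__":
    blocks, total = launch_config(8712)
    print(blocks, total, total - 8712)  # blocks, threads launched, idle threads
```

With the assumed block size, the 8,712 sub-basins of the study need 35 blocks, that is 8,960 threads, of which 248 fall past the last sub-basin and exit at the bounds guard. Keeping the launched thread count close to the number of sub-basins keeps this idle remainder small.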
It was found that the proposed GPU-based parallel approach significantly accelerated the distributed hydrological model. As the GPU thread count increased, the parallel computing time steadily decreased. Parallel performance was best when the total thread count closely approached the number of delineated sub-basins. In the experimental Poyang Lake basin, with 8,712 sub-basin units in the WEP-L model, the maximum speedup reached around 2.5. In addition, the performance of GPU parallel computing was influenced not only by the degree of parallelism but also by the computational workload. As the workload increased, both serial and parallel computation times increased. However, because the parallel computation time grew more slowly than the serial time, the speedup gradually increased, albeit at a diminishing rate. When the number of sub-basin units in the experimental WEP-L model increased to 24,897, the speedup reached 3.5, indicating the considerable potential of GPU parallel algorithms for physically based distributed hydrological models of large-scale watersheds. In conclusion, the enhancement of parallel efficiency depended not only on the number of parallel threads activated but also on the size of the computational workload: the parallel computation time decreased as the number of GPU threads rose, and the speedup increased as the computing demand rose. GPU-based parallel computing represents the current trend in parallel computing. This study could provide valuable experience for other researchers exploring GPU parallel algorithms and contribute to interdisciplinary collaboration between computer science and water resources engineering.
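For reference, the speedup quoted above is conventionally the ratio of serial to parallel run time. The notation below is an assumption introduced here for illustration ($N$ for the number of sub-basin units, $T_{\mathrm{s}}$ and $T_{\mathrm{p}}$ for serial and parallel computation time); the abstract itself does not define symbols.

```latex
% Speedup as a function of workload N (number of sub-basin units)
S(N) = \frac{T_{\mathrm{s}}(N)}{T_{\mathrm{p}}(N)}

% S grows from N_1 to N_2 > N_1 exactly when the parallel time
% grows by a smaller factor than the serial time:
S(N_2) > S(N_1)
\iff
\frac{T_{\mathrm{p}}(N_2)}{T_{\mathrm{p}}(N_1)} < \frac{T_{\mathrm{s}}(N_2)}{T_{\mathrm{s}}(N_1)}

% Values reported in the abstract:
S(8712) \approx 2.5, \qquad S(24897) = 3.5
```

Read this way, the reported rise in speedup with workload follows directly from the parallel time growing by a smaller factor than the serial time.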


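School of Hydraulic and Electric Power, Heilongjiang University, Harbin 150080 || State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, China Institute of Water Resources and Hydropower Research, Beijing 100038; State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, China Institute of Water Resources and Hydropower Research, Beijing 100038; School of Hydraulic and Electric Power, Heilongjiang University, Harbin 150080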



GPU-based parallel algorithm; physical mechanism; distributed hydrological model; WEP-L model; computational performance

South-to-North Water Transfers and Water Science & Technology (《南水北调与水利科技(中英文)》) 2024 (001)

33-38 / 6


