PreNTT:面向zk-SNARK的数论变换计算并行加速方法OA北大核心CSTPCD
PreNTT:parallel acceleration method for number theory transformation calculations for zk-SNARK
简洁非交互式零知识证明(zk-SNARK)由于具备证明验证过程简捷快速的优点,已在加密货币等众多领域得到广泛应用.但其证明生成过程所需计算仍复杂耗时,影响了进一步的应用拓展.针对zk-SNARK证明生成过程中的主要计算瓶颈——数论变换(NTT),提出了一种基于GPU的NTT计算加速方法PreNTT.首先,提出了基于预计算的NTT并行计算方法,利用预计算与旋转因子次幂算法优化,减少NTT并行计算开销,并结合动态预计算,进一步提高NTT计算效率.其次,通过"动态自适应计算核调度",可以根据NTT输入规模自适应地分配GPU片上资源,提升了大规模NTT任务的计算能效.然后,通过核外整体数据混洗和核内局部数据混洗相结合的方式,避免了访存冲突.最后,使用CUDA多流技术执行数据传输和计算过程,对预计算时间进行了有效隐藏.实验结果表明:基于PreNTT实现的zk-SNARK系统,与目前业界最先进的系统Bellperson相比,NTT模块运行时间获得了全规模最低1.7倍的加速比,最高加速比为9倍.PreNTT能够有效提高NTT算法并行度,降低zk-SNARK运算时间开销.
Zero-knowledge succinct non-interactive proofs(zk-SNARK)have found extensive applications in various fields,in-cluding cryptocurrencies,due to their swift and efficient proof verification process.However,the computational intensity of the proof generation process poses a significant challenge,particularly at the number theoretic transform(NTT)stage.This paper proposed a GPU-based acceleration method for NTT computations,named PreNTT,to address this bottleneck.The method em-ployed precomputation and optimization of twiddle factor powers to reduce the parallel computation overhead in NTT.It also in-troduced dynamic precomputation to enhance the efficiency of these computations.The algorithm made use of dynamic adaptive kernel scheduling,which allocated GPU resources on-chip according to the NTT input size,thereby boosting the computational efficiency for large-scale tasks.Additionally,the approach combined external global data shuffling with internal local data shuff-ling to avoid memory access conflicts.The use of CUDA multi-stream technology allowed for effective concealment of precompu-tation times during data transfer and computation processes.Experimental results indicate that the zk-SNARK system utilizing PreNTT achieves a speed-up ratio ranging from 1.7x to 9x in NTT module running times compared to Bellperson,the industry-leading system.PreNTT effectively increases the parallelism of the NTT algorithm and reduces the computational overhead in zk-SNARK operations.
丁冬;李正权;柴志雷
江南大学物联网工程学院,江苏无锡 214122江南大学物联网工程学院,江苏无锡 214122||北京邮电大学 网络与交换技术全国重点实验室,北京 100876江南大学人工智能与计算机学院,江苏无锡 214122
计算机与自动化
简洁非交互式零知识证明数论变换GPU并行计算加速
zero-knowledge succinct non-interactive argument of knowledgenumber theoretic transformationGPUparal-lel computingacceleration
《计算机应用研究》 2024 (010)
3059-3067 / 9
国家自然科学基金资助项目(61972180);北京邮电大学网络与交换技术全国重点实验室开放课题资助项目(SKLNST-2023-1-13)
评论