Optimization Implementation Method of SM4 Based on Register
Original title: 基于寄存器的SM4软件优化实现方法
Indexing: OA; Peking University Core Journals (北大核心); CSTPCD
The implementation efficiency of the SM4 algorithm is a key issue that urgently needs to be solved in the adoption of national cryptographic algorithms, and many researchers have worked on improving its implementation speed. Bit-slicing is currently recognized as one of the fastest software implementation methods for SM4. It encrypts multiple groups of plaintext in parallel within a single encryption operation, which significantly improves throughput when processing large volumes of data. However, before each encryption operation, the same bit positions of all groups must be loaded into the CPU registers at once, which incurs the overhead of transferring data between the CPU registers and memory. To reduce the amount of data loaded into registers at one time, this study improves the data arrangement used in the bit-slicing method, so that only the data required for the current operation is loaded each time the CPU performs a computation. This reduces the interaction between memory and registers and further improves the overall encryption efficiency of bit-sliced SM4. Using the improved bit-slicing method, this study implements parallel encryption and decryption of 64 groups of data with SM4. The theoretical encryption and decryption speed of the method reaches 4.1 cycles/byte, and the measured encryption rate on an AMD Ryzen 7 5800H platform reaches 11162 Mb/s. The method is a useful reference for the optimized software implementation of other symmetric encryption algorithms designed around bit-slicing.
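As a rough illustration of the bit-slicing idea described in the abstract, the sketch below transposes 64 blocks of 128 bits so that slice i holds bit i of every block, then applies one XOR per slice to process all 64 blocks at once. It is only a minimal sketch under stated assumptions: the function names, the key value, and the test blocks are hypothetical, the XOR step stands in for a real SM4 round, and the improved data arrangement proposed in the paper is not reproduced here.

```python
LANES = 64        # number of blocks processed in parallel (figure taken from the abstract)
BLOCK_BITS = 128  # SM4 block size in bits


def to_slices(blocks):
    """Transpose blocks into bit slices: slice i holds bit i of every block."""
    slices = [0] * BLOCK_BITS
    for lane, block in enumerate(blocks):
        for i in range(BLOCK_BITS):
            slices[i] |= ((block >> i) & 1) << lane
    return slices


def from_slices(slices, lanes=LANES):
    """Inverse transpose: rebuild the original blocks from the bit slices."""
    blocks = [0] * lanes
    for i, word in enumerate(slices):
        for lane in range(lanes):
            blocks[lane] |= ((word >> lane) & 1) << i
    return blocks


def xor_key_bitsliced(slices, key):
    """XOR one 128-bit key into every lane.

    A key bit of 1 flips the whole slice, so a single 64-bit XOR
    updates the same bit position of all 64 blocks together.
    """
    all_lanes = (1 << LANES) - 1
    return [s ^ (all_lanes if (key >> i) & 1 else 0) for i, s in enumerate(slices)]


def demo():
    # Hypothetical test data: arbitrary 128-bit blocks and an arbitrary key.
    blocks = [(0x0123456789ABCDEFFEDCBA9876543210 * (n + 1)) % (1 << BLOCK_BITS)
              for n in range(LANES)]
    key = 0x00112233445566778899AABBCCDDEEFF
    sliced_result = from_slices(xor_key_bitsliced(to_slices(blocks), key))
    # The bit-sliced result must match the ordinary block-by-block XOR.
    return sliced_result == [b ^ key for b in blocks]
```

In this layout each slice corresponds to one 64-bit register word, which is why a full bit-sliced state (128 slices here) exceeds the register file and forces traffic between registers and memory, the overhead the abstract sets out to reduce.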
CHEN Chen (陈晨); GUO Hua (郭华); LIU Yuanhao (刘源灏); GONG Zirui (龚子睿); ZHANG Yuxuan (张宇轩)
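State Key Laboratory of Complex & Critical Software Environment, Beijing 100191 || School of Cyber Science and Technology, Beihang University, Beijing 100191; School of Cyber Science and Technology, Beihang University, Beijing 100191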
Computer Science and Automation
SM4 algorithm; bit-slicing; memory read operations; data arrangement
Journal of Cryptologic Research (《密码学报》), 2024, No. 2
pp. 427-440 (14 pages)
Natural Science Foundation of Beijing Municipality (4202022); State Key Laboratory of Complex & Critical Software Environment (CCSE-2024ZX-06); Innovation and Entrepreneurship Training Plan for College Students (X202210006242)