Abstract: Wide-width bit permutation is a very common operation in symmetric cryptographic algorithms. However, current word-oriented general-purpose microprocessors handle complex bit-level permutations inefficiently. To solve this problem, two schemes are proposed, one for 2N-2N permutations and one for kN-kN permutations, together with two extended instructions, BEX and BEX-ROT. Furthermore, the efficient hardware implementation of these instructions is studied, and a unified hardware circuit named RERS (Reconfigurable Extract and Rotation Shifter) is proposed along with a corresponding reconfigurable routing algorithm. RERS shares hardware resources to reduce area. Experimental results show that the proposed schemes reduce the number of instructions needed for an arbitrary wide-width bit permutation by a factor of 10, which greatly improves microprocessor performance. At the same time, the hardware resource and delay overhead introduced by the two extended instructions is very low and does not affect the normal operating frequency of the original microprocessor.
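To illustrate why bit-level permutation is costly on a word-oriented processor, the following minimal Python sketch models a software bit permutation that moves one bit at a time, plus a generic bit-extract (gather) operation. Both function names, `permute_bits` and `bit_gather`, are hypothetical. The semantics of `bit_gather` are an assumption chosen for illustration and are not taken from the paper's definition of BEX or BEX-ROT.

```python
from typing import Sequence


def permute_bits(value: int, perm: Sequence[int]) -> int:
    """Return a word whose bit i equals bit perm[i] of value (bit 0 = LSB).

    Each output bit costs a shift, a mask, another shift, and an OR,
    so an n-bit permutation needs on the order of 4n word operations.
    """
    result = 0
    for dst, src in enumerate(perm):
        bit = (value >> src) & 1  # isolate one source bit
        result |= bit << dst      # place it at its destination
    return result


def bit_gather(value: int, mask: int) -> int:
    """Pack the bits of value selected by mask into the low-order bits.

    Hypothetical software model of a generic bit-extract operation;
    assumed semantics, not the paper's BEX instruction.
    """
    result = 0
    out_pos = 0
    pos = 0
    while mask >> pos:
        if (mask >> pos) & 1:
            result |= ((value >> pos) & 1) << out_pos
            out_pos += 1
        pos += 1
    return result


# Usage: reverse the bits of an 8-bit value, then gather its high nibble.
reverse8 = [7, 6, 5, 4, 3, 2, 1, 0]
reversed_value = permute_bits(0b10110010, reverse8)
high_nibble = bit_gather(0b10110010, 0b11110000)
```

In this model the per-bit loop is what a plain instruction set has to execute, whereas a dedicated extract instruction would perform the whole gather in a single operation.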
DAI Zibin, MA Chao, LI Wei, NAN Longmei. Wide-width Bit Permutation Instructions for Accelerating Cryptographic Algorithms[J]. Journal of Electronics & Information Technology (JEIT), 2017, 39(9): 2119-2126.