IEEE INDICON 2015 1570201503
FPGA Implementation of Compact S-Box for AES Algorithm using Composite Field Arithmetic

Bhoopal Rao Gangadari and Shaik Rafi Ahamed
Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Assam-781039, India
Email: (bhoopal, rafiahamed)@iitg.ernet.in

Abstract: This paper presents a method for constructing the S-Box of the Advanced Encryption Standard (AES) algorithm using composite field arithmetic (CFA) in $GF((2^4)^2)$. Implementing CFA on a Field Programmable Gate Array (FPGA) is advantageous for the AES algorithm: an S-Box architecture based on CFA reduces the hardware, in terms of gate count, compared with the classical Look Up Table (LUT) based S-Box. The composite fields are constructed by a decomposition methodology. An isomorphic mapping function is needed to map the $GF(2^8)$ representation to its composite field $GF((2^4)^2)$, and eight such mappings exist for each construction. In hardware, the proposed CFA based S-Box occupies 50% less area and consumes less power than the classical S-Box.

Keywords: Composite Field Arithmetic (CFA), Look Up Table (LUT), Substitution Box (S-Box), Field Programmable Gate Array (FPGA).

I Introduction

Today, millions of secure data transmissions depend on encryption, which shows the vital role played by cryptography in the modern world. With the development of computing technology, stronger cryptographic algorithms are necessary for the secure transmission of information. The Data Encryption Standard (DES) was last reaffirmed as a Federal Information Processing Standard (FIPS) in 1999 [11]. However, it was vulnerable to cryptographic attacks [17]. In 2001, the Advanced Encryption Standard (AES) was standardized by the National Institute of Standards and Technology (NIST) [18]. AES is now widely accepted as a secure standard for encrypting almost all forms of electronic data, such as data used in banking, telecommunications, health care information systems and federal information. The AES algorithm is also widely used in ultra low power portable devices such as body implants, smart cards, cellphones, Bluetooth devices and RFID tags. The proposed scheme therefore emphasizes low hardware cost and high throughput for the AES algorithm.

Several architectures of the AES algorithm have been implemented on FPGA and their performance evaluated [9, 10, 13]. Among these designs, sub-pipelining and pipelining architectures have been implemented in hardware to enhance the throughput [6]. Various architectural constructions of the S-Box have been proposed for the AES algorithm [2, 5, 8, 16]. However, to the best of our knowledge, critical path and power consumption issues are not considered in the literature. Optimized constructions of the S-Box have been presented using various architectures [4, 14]. The proposed CFA architecture replaces the traditional LUT S-Box; in it, $GF(2^8)$ is decomposed into $GF(2^4)$ and $GF(2^2)$ [7]. The construction of the S-Box using $GF((2^4)^2)$ with a pipelining structure has been implemented in hardware [1, 15]. The proposed CFA architecture of the S-Box reduces the hardware in terms of gates and also reduces the power consumption of the AES algorithm.

The CFA scheme illustrated in this paper is used for computing the multiplicative inverse in $GF(2^8)$ for the S-Box. We propose 16 ways of constructing the S-Box using CFA in $GF((2^4)^2)$. Experimental results are also provided that demonstrate how the subfield operations are affected by the coefficients of the irreducible polynomial. Moreover, in each construction there are eight isomorphic mappings that map the elements from $GF(2^8)$ to $GF((2^4)^2)$.

The paper is organized as follows. Section II revisits the concept of the AES algorithm. Section III describes the different construction methods for the composite field $GF((2^4)^2)$. Section IV discusses the influence of the subfield operations on each block in the construction of the S-Box using CFA. Section V compares the hardware implementation of CFA on FPGA with that of the LUT based S-Box, and Section VI concludes the paper.

II Advanced Encryption Standard

The flow of encryption and decryption for the AES algorithm is shown in Fig. 1. The secret key is broken down into 16 bytes, where each byte is considered an element of $GF(2^8)$. The irreducible polynomial $x^8 + x^4 + x^3 + x + 1$ is the one prescribed for the AES algorithm in $GF(2^8)$ and is used for the construction of the LUT based S-Box. Three different cipher key lengths are used in the AES algorithm: for secret keys of 128, 192 and 256 bits, the number of rounds of encryption is 10, 12 and 14, respectively. The input bits are arranged in a $4 \times 4$ matrix of bytes known as the state array. Each column as well as each row is known as a word (W). Each round of processing has one substitution step using the S-Box, known as SubBytes (SB), a row-wise transformation known as ShiftRows (SR), a column-wise mixing known as MixColumns (MC) and the addition of the round key, known as AddRoundKey (ARK). The decryption process consists of the inverse SB, inverse SR, inverse MC and ARK.
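
Treating each byte as an element of $GF(2^8)$ can be illustrated in software. The following is a minimal sketch, not the hardware design of this paper: it multiplies two bytes modulo the AES polynomial $x^8 + x^4 + x^3 + x + 1$ and obtains the multiplicative inverse by exponentiation. The function names `gf256_mul` and `gf256_inv` are hypothetical and chosen only for illustration.

```python
AES_POLY = 0x11B  # bit pattern of x^8 + x^4 + x^3 + x + 1


def gf256_mul(a: int, b: int) -> int:
    """Multiply two bytes as polynomials over GF(2), reduced modulo AES_POLY."""
    result = 0
    for _ in range(8):
        if b & 1:            # add (XOR) the current multiple of a
            result ^= a
        a <<= 1              # multiply a by x
        if a & 0x100:        # degree 8 reached: subtract the modulus
            a ^= AES_POLY
        b >>= 1
    return result


def gf256_inv(a: int) -> int:
    """Return a^254, which equals a^-1 for nonzero a; 0 is mapped to 0."""
    result = 1
    for _ in range(254):
        result = gf256_mul(result, a)
    return result


if __name__ == "__main__":
    print(hex(gf256_mul(0x57, 0x83)))  # worked example from FIPS 197: 0xc1
    print(hex(gf256_inv(0x53)))
```

The 254 multiplications in `gf256_inv` show why a direct inversion in $GF(2^8)$ is costly and why implementations either store the result in a LUT or, as in this paper, move the inversion to smaller subfields.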
978-1-4673-6540-6/15/$31.00 ©2015 IEEE
Figure 1: Sequential Flow of AES Encryption

Each round of processing takes the input matrix and produces an output matrix using the key generated by the key expansion. The output state matrix produced by the last round is arranged as a 128-bit block, which is the encrypted output. Fig. 1 shows the sequential flow of AES encryption.

A. SubBytes

This is the most important step, in which a transformation is performed on each byte of the $4 \times 4$ state matrix using look-up tables (LUTs). Each byte of data is substituted with another byte from the LUT. The S-Box is designed by taking the multiplicative inverse of each element of the state in $GF(2^8)$ with the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$, followed by an affine transformation. S-Boxes based on LUTs are implemented using memory. This transformation provides confusion in the block cipher.

B. ShiftRows

This transformation is used to create diffusion in the cipher text. It shifts the elements in a circular manner. The bytes in the first row of the matrix are not shifted, while the second, third and fourth rows are shifted left by one, two and three bytes, respectively.

C. MixColumns

This transformation is also responsible for creating diffusion in the block cipher. It is a column operation, where each column is expressed as a four-term polynomial over $GF(2^8)$ and is multiplied by $\{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\}$ modulo $x^4 + 1$ to give a transformed state. For the AES algorithm with a 128-bit key, an initial ARK step is followed by rounds 1 to 9, each consisting of the SB, SR, MC and ARK transformations. The last (10th) round does not contain the MC transformation.

D. AddRoundKey

The round key generated by the key expansion is combined with the state by a bitwise XOR operation. After the key schedule, the expanded key can be divided into eleven groups of four words. The first four words are XORed directly with the plain data, and each of the 10 rounds uses the subsequent four words for the ARK process.

E. Key Expansion

The data in the AES algorithm is arranged in matrix form before the transformations are applied. For a 128-bit key, 44 words are derived in the key expansion phase from the 4 column words of the secret key, and each round of the key schedule uses 4 of them. Similarly, for key lengths of 128, 192 and 256 bits, the number of bytes in each row of the key matrix, denoted by $N_k$, is 4, 6 or 8, respectively.

III Construction of Composite Field Arithmetic

The S-Box used in the AES algorithm consists of two steps: multiplicative inversion over $GF(2^8)$ followed by an affine transformation. Mathematically, the S-Box is represented as

$$ S(x) = M\,x^{-1} \oplus C \tag{1} $$

where $M$ is an $8 \times 8$ binary matrix and $C$ is an 8-bit binary vector. Computing the multiplicative inverse over $GF(2^8)$ is a complex task that consumes a lot of hardware. It can be simplified by using composite field arithmetic, in which the $GF(2^8)$ field is broken down into lower-order fields using irreducible polynomials of degree two. This drastically reduces the gate count and thus the power consumption. The irreducible polynomials used in the composite field arithmetic are as follows:

$$ GF(2) \rightarrow GF(2^2): \quad x^2 + x + 1 \tag{2} $$

$$ GF(2^2) \rightarrow GF((2^2)^2): \quad x^2 + x + \phi \tag{3} $$

$$ GF((2^2)^2) \rightarrow GF(((2^2)^2)^2): \quad x^2 + x + \lambda \tag{4} $$

The values of the constants $\phi$ and $\lambda$ must be chosen carefully to ensure that the polynomials in (3) and (4), respectively, remain irreducible. There are two values of $\phi$ and eight values of $\lambda$ that make the respective polynomials irreducible. Altogether there are therefore 16 combinations of $(\phi, \lambda)$ with which the composite field $GF(((2^2)^2)^2)$ can be constructed for the S-Box. The values of $\phi$ and $\lambda$ are listed below:

$$ \phi = \{10\}_2: \quad \lambda = \{1000\}_2,\ \{1100\}_2,\ \{1001\}_2,\ \{1101\}_2 \tag{5} $$
$$ \phi = \{11\}_2: \quad \lambda = \{1010\}_2,\ \{1110\}_2,\ \{1011\}_2,\ \{1111\}_2 \tag{6} $$

Out of these 16 possible constructions, we have selected the 4 optimum values of $\phi$ and $\lambda$ that give a smaller gate count. These optimum pairs $(\phi, \lambda)$ are as follows:

$$ \phi = \{10\}_2,\ \lambda = \{1100\}_2; \qquad \phi = \{11\}_2,\ \lambda = \{1000\}_2 \tag{7} $$

$$ \phi = \{10\}_2,\ \lambda = \{1111\}_2; \qquad \phi = \{11\}_2,\ \lambda = \{1010\}_2 \tag{8} $$

To compute the multiplicative inverse, composite field arithmetic cannot be applied directly to a polynomial in $GF(2^8)$. Each element of $GF(2^8)$ must first be mapped to its composite field via an isomorphic mapping function $\delta$. An isomorphic mapping function ensures that each element of one field is uniquely mapped to the other field and vice versa, preserving the field operations. The matrix $\delta$ is determined by the irreducible polynomials of $GF(2^8)$ and its composite fields. The $\delta$ matrix is given below:

$$ \delta = \begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 1 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 1
\end{pmatrix} $$

When this matrix is multiplied with the eight bits of the input byte, the mapping reduces to an array of XOR gates. The multiplicative inversion is then carried out in the composite field. An element of the composite field $GF((2^4)^2)$ can be expressed as $bx + c$, and its multiplicative inverse, obtained using the Extended Euclidean Algorithm, is given by

$$ (bx + c)^{-1} = b\,\bigl(b^2\lambda + c(b + c)\bigr)^{-1} x + (c + b)\,\bigl(b^2\lambda + c(b + c)\bigr)^{-1} \tag{9} $$

where $b$ is the most significant nibble and $c$ is the least significant nibble. The overall architecture for calculating the multiplicative inverse in $GF(2^8)$ is shown in Fig. 2.

Figure 2: Composite Field Arithmetic

IV Hardware Construction of CFA in Galois Field for S-Box

A. Squarer in $GF(2^4)$ Block

This block is simplified by working on the four bits of an input element of $GF(2^4)$. For $\phi = \{10\}_2$, the computation of each output bit is shown in Fig. 3 and is represented mathematically by (10).

[Eq. (10): output-bit expressions of the squarer for $\phi = \{10\}_2$; equation body not recoverable]

The bit expressions for the same block with $\phi = \{11\}_2$ are given by (11).

[Eq. (11): output-bit expressions of the squarer for $\phi = \{11\}_2$; equation body not recoverable]

Figure 3: Squarer in $GF(2^4)$

B. Multiplier in $GF(2^4)$ Block

As can be seen in Fig. 5, this block itself contains two blocks, namely the $\times\phi$ block shown in Fig. 7 and the $GF(2^2)$ multiplier block. The complexity of this block therefore depends on the value of the constant multiplier $\phi$. The block is simplified by decomposing the field into $GF(2^2)$, which implies two-bit multiplications instead of four-bit ones. It is simplified further by decomposing the field to $GF(2)$, where one-bit multiplications are performed. These field reductions are done using the irreducible polynomials of (2) and (3).

C. $\times\lambda$ Block in $GF(2^4)$

This block depends on the values of both $\phi$ and $\lambda$. It can also be simplified by working in $GF(2^2)$ and then in $GF(2)$. The bit expressions for the optimum pairs of $\phi$ and $\lambda$ are given mathematically below and depicted in Fig. 6.

[Eq. (12): bit expressions for $\phi = \{10\}_2$, $\lambda = \{1100\}_2$; equation body not recoverable]
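[Eq. (13): bit expressions for $\phi = \{10\}_2$, $\lambda = \{1111\}_2$; equation body not recoverable]

[Eq. (14): bit expressions for $\phi = \{11\}_2$, $\lambda = \{1000\}_2$; equation body not recoverable]

[Eq. (15): bit expressions for $\phi = \{11\}_2$, $\lambda = \{1010\}_2$; equation body not recoverable]

D. $\times\phi$ Block

This block multiplies the 2-bit input, an element of $GF(2^2)$, by the constant $\phi$. The bit expressions of the two output bits for $\phi = \{10\}_2$ and $\phi = \{11\}_2$ are given by (16) and (17), respectively.

[Eq. (16): bit expressions for $\phi = \{10\}_2$; equation body not recoverable]

[Eq. (17): bit expressions for $\phi = \{11\}_2$; equation body not recoverable]

Figure 4: Multiplier in $GF(2^2)$
Figure 5: Multiplier in $GF(2^4)$
Figure 6: Constant multiplier ($\times\lambda$)
Figure 7: Constant multiplier ($\times\phi$)

E. Inversion in $GF(2^4)$ Block

There are several methods for implementing the inversion in $GF(2^4)$. This block depends only on $\phi$. The direct computation approach is used in this paper, as it is an efficient construction that requires low hardware complexity. The bit expressions for this approach, in terms of the four bits of the input element of $GF(2^4)$, are given by (18) and (19).

[Eq. (18): inversion bit expressions for $\phi = \{10\}_2$; equation body not recoverable]

[Eq. (19): inversion bit expressions for $\phi = \{11\}_2$; equation body not recoverable]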
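
The subfield blocks of this section (squarer, multiplier and inverter in $GF(2^4)$) can be cross-checked with a small software model. The sketch below is only an illustration under stated assumptions and is not the gate-level design of the paper: a $GF(2^2)$ element is encoded as a 2-bit integer $a_1 w + a_0$ with $w^2 = w + 1$, a $GF(2^4)$ element as a 4-bit integer $a_h x + a_l$ with $x^2 = x + \phi$, and inversion is done by exponentiation rather than by the direct bit expressions. All function names, and the parameter names `phi` and `lam`, are hypothetical.

```python
def gf4_mul(a: int, b: int) -> int:
    """Multiply two GF(2^2) elements (2-bit ints) modulo w^2 + w + 1."""
    a1, a0 = a >> 1, a & 1
    b1, b0 = b >> 1, b & 1
    hi = (a1 & b1) ^ (a1 & b0) ^ (a0 & b1)
    lo = (a0 & b0) ^ (a1 & b1)
    return (hi << 1) | lo


def gf16_mul(a: int, b: int, phi: int) -> int:
    """Multiply two GF(2^4) elements (4-bit ints) modulo x^2 + x + phi."""
    ah, al = a >> 2, a & 3
    bh, bl = b >> 2, b & 3
    hh = gf4_mul(ah, bh)                              # coefficient of x^2
    hi = hh ^ gf4_mul(ah, bl) ^ gf4_mul(al, bh)       # x^2 = x + phi adds hh to x
    lo = gf4_mul(al, bl) ^ gf4_mul(hh, phi)           # ... and hh*phi to the constant
    return (hi << 2) | lo


def gf16_square(a: int, phi: int) -> int:
    """Squarer block: a^2 in GF(2^4)."""
    return gf16_mul(a, a, phi)


def gf16_inv(a: int, phi: int) -> int:
    """Inverter block via a^14 = a^8 * a^4 * a^2 (0 is mapped to 0)."""
    a2 = gf16_square(a, phi)
    a4 = gf16_square(a2, phi)
    a8 = gf16_square(a4, phi)
    return gf16_mul(gf16_mul(a8, a4, phi), a2, phi)


def irreducible_phis() -> list:
    """Constants phi for which x^2 + x + phi has no root in GF(2^2)."""
    return [phi for phi in range(4)
            if all(gf4_mul(r, r) ^ r ^ phi for r in range(4))]


def irreducible_lambdas(phi: int) -> list:
    """Constants lam for which x^2 + x + lam has no root in GF(2^4)."""
    return [lam for lam in range(16)
            if all(gf16_mul(r, r, phi) ^ r ^ lam for r in range(16))]


if __name__ == "__main__":
    for phi in irreducible_phis():
        print(bin(phi), [bin(lam) for lam in irreducible_lambdas(phi)])
```

Under this encoding the search returns two usable values of `phi` and, for each of them, eight usable values of `lam`, which agrees with the 16 combinations counted in Section III. Squaring is additive over $GF(2)$, so `gf16_square(a ^ b, phi)` equals `gf16_square(a, phi) ^ gf16_square(b, phi)`; this is why the squarer can be built from XOR gates alone.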
Once the multiplicative inverse is calculated, it is mapped back to its respective element in $GF(2^8)$ by the inverse isomorphic function $\delta^{-1}$, and the corresponding matrix is given below.

$$ \delta^{-1} = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 & 1
\end{pmatrix} $$

The affine transformation is a multiplication by a matrix followed by the addition of a vector.

V Hardware Implementation of Composite Field Arithmetic

The CFA construction of the AES S-Box is implemented and verified with a number of test vectors on a Virtex-II Pro FPGA board (XC2VP30) using the Xilinx ISE tool. The proposed architecture of the CFA based S-Box is implemented in Verilog for the four optimum cases of $(\phi, \lambda)$ values, as shown in Table I. The CFA design consists of multiplication units, inversion units, the isomorphic mapping ($\delta$), a squarer, the inverse isomorphic mapping ($\delta^{-1}$) and the affine transformation, as shown in Fig. 2. Various architectural implementations of the S-Box [1, 3, 12, 15] are compared in Table I, which shows a reduction in the number of gates for the proposed design. Moreover, the area of the proposed design is 50% smaller than that of the LUT based S-Box, and hence the power consumption also decreases. The optimum pair $\phi = \{10\}_2$, $\lambda = \{1100\}_2$ has the lowest gate count in hardware and low power consumption compared with the remaining realizations.

Table I: Hardware Implementation of CFA for AES Algorithm

Platform                      Architecture                                  Area    Power (W)
Architectures using LUT S-Box (area in gates)
Standard cells (UMCL18G212)   Wong et al. [16]                              174     -
ASIC                          D. Canright [17]                              91      -
ASIC                          Satoh et al. [18]                             166     -
Proposed architecture using CFA on FPGA (area in slices/LUTs)
XC2VP30 (ours)                $\phi = \{10\}_2$, $\lambda = \{1100\}_2$     42      0.660
XC2VP30 (ours)                $\phi = \{11\}_2$, $\lambda = \{1000\}_2$     47      0.750
XC2VP30 (ours)                $\phi = \{11\}_2$, $\lambda = \{1010\}_2$     45      0.560
XC2VP30 (ours)                $\phi = \{10\}_2$, $\lambda = \{1111\}_2$     52      0.852

VI Conclusion

In this paper, a detailed study of the construction of the S-Box using CFA is proposed and validated on FPGA. Using composite field arithmetic, the design of each block of the S-Box is considered for different values of $\phi$ and $\lambda$. We proposed 16 ways of constructing the S-Box in CFA. The optimum values of $\phi$ and $\lambda$ were then considered for the construction of the S-Box, and low hardware utilization and low power consumption were achieved for $\phi = \{10\}_2$, $\lambda = \{1100\}_2$. We also observed that the coefficients of the irreducible polynomials influence the isomorphic mappings and the subfield operations, and that constructing the S-Box by CFA using irreducible polynomials reduces the gate count. The hardware realization achieved less area and lower power consumption than the classical S-Box of the AES algorithm.

References

[1] National Institute of Standards and Technology, FIPS PUB 46-3: Data Encryption Standard (DES), October 1999. Supersedes FIPS 46-2.
[2] H. D. Zodpe, P. W. Wani, and R. R. Mehta, "Design and implementation of algorithm for DES cryptanalysis," in 2012 12th International Conference on Hybrid Intelligent Systems (HIS), Dec 2012, pp. 278–282.
[3] Advanced Encryption Standard (AES), Federal Information Processing Standards Publication 197, November 26, 2001.
[4] K. Jarvinen, M. Tommiska, and J. Skytta, "Comparative survey of high performance cryptographic algorithm implementations on FPGAs," IEE Proceedings Information Security, vol. 152, no. 1, pp. 3–12, Oct 2005.
[5] K. Shesha Shayee, J. Park, and P. Diniz, "Performance and area modeling of complete FPGA designs in the presence of loop transformations," in 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2003), April 2003.
[6] C. Nalini, Nagaraj, P. V. Anandmohan, D. V. Poornaiah, and V. D. Kulkarni, "An FPGA Based Performance Analysis of Pipelining and Unrolling of AES Algorithm," in International Conference on Advanced Computing and Communications (ADCOM 2006), Dec 2006, pp. 477–482.
[7] A. Hodjat and I. Verbauwhede, "Area-throughput trade-offs for fully pipelined 30 to 70 Gbits/s AES processors," IEEE Transactions on Computers, vol. 55, no. 4, pp. 366–372, April 2006.
[8] J. M. Granado-Criado and M. A. Vegaz, "A new methodology to implement the AES algorithm using partial and dynamic reconfiguration," Integration, the VLSI Journal, vol. 43, no. 1, pp. 72–80, 2010.
[9] X. Zhang and K. Parhi, "High-speed VLSI architectures for the AES algorithm," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 12, no. 9, pp. 957–967, Sept 2004.
[10] L. Ali, I. Aris, F. S. Hossain, and N. Roy, "Design of an ultra-high speed AES processor for next generation IT security," Computers & Electrical Engineering, vol. 37, no. 6, pp. 1160–1170, 2011.
[11] K. Jankowski and P. Laurent, "Packed AES-GCM Algorithm Suitable for AES/PCLMULQDQ Instructions," IEEE Transactions on Computers, vol. 60, no. 1, pp. 135–138, Jan 2011.
[12] U. Waqas, S. Afzal, M. Mir, and M. Yousaf, "Generation of AES Like S-Boxes by Replacing Affine Matrix," in 2014 12th International Conference on Frontiers of Information Technology (FIT), Dec 2014.
[13] E. Ganesh, R. Velayutham, and D. Manimegalai, "A secure software implementation of nonlinear AES S-box with the enhancement of biometrics," in 2012 International Conference on Computing, Electronics and Electrical Technologies (ICCEET), March 2012, pp. 927–932.
[14] N. Iyer, P. Anandmohan, D. Poornaiah, and V. Kulkarni, "High Throughput, low cost, Fully Pipelined Architecture for AES Crypto Chip," in 2006 Annual IEEE India Conference, Sept 2006, pp. 1–6.
[15] S. Abdel-Hafeez, A. Sawalmeh, and S. Bataineh, "High Performance AES Design using Pipelining Structure over GF((2^4)^2)," in IEEE International Conference on Signal Processing and Communications (ICSPC 2007), Nov 2007, pp. 716–719.
[16] M. Wong and M. Wong, "New lightweight AES S-box using LFSR," in 2014 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), Dec 2014, pp. 115–120.
[17] D. Canright, "A Very Compact S-Box for AES," in Cryptographic Hardware and Embedded Systems CHES 2005, ser. Lecture Notes in Computer Science, J. Rao and B. Sunar, Eds. Springer Berlin Heidelberg, 2005, vol. 3659, pp. 441–455.
[18] A. Satoh, S. Morioka, K. Takano, and S. Munetoh, "A Compact Rijndael Hardware Architecture with S-Box Optimization," in Advances in Cryptology ASIACRYPT 2001, ser. Lecture Notes in Computer Science, C. Boyd, Ed. Springer Berlin Heidelberg, 2001, vol. 2248, pp. 239–254.
