
SIMULACIÓN Y APLICACIONES RECIENTES PARA CIENCIA Y TECNOLOGÍA
Y. González, E. Dávila, V. Duarte, M. Candal, O. Pelliccioni, J. Darias, M. Cerrolaza (Editors) © 2016 SVMNI
All rights reserved

DESIGN A GENERATION KEYS MODULE FOR R.S.A ENCRYPTION USING FPGA

Orellana, Rafael
Lacruz, Jesús
rafael.orellana@ula.ve
jlacruz@ula.ve
Departamento de Electrónica y Comunicaciones, Universidad de Los Andes. Mérida-Venezuela
Coronel, María
maria.coronel@ula.ve
Departamento de Circuitos y Medidas, Universidad de Los Andes. Mérida-Venezuela

Abstract. Cryptography is the art of keeping messages secure. The R.S.A algorithm is a
high-quality, secure, asymmetric key algorithm used to provide data protection services. It
requires two different keys, one for encryption (public key) and the other for decryption
(private key). Its computational load depends on the key size in bits, and it is typically
implemented in software. This paper presents the design and logic synthesis of a keys
generation module for the R.S.A algorithm using a hardware description language, specifically
Verilog. The design includes functional and structural specifications. The functional
specification gives the port list description and the functional core that covers the basic
functionality of the module. The structural description shows the proposed architecture for the
R.S.A keys generation module. The data-path shows the implemented sub-modules and the
interconnections between them. A modular exponentiation submodule is designed to find prime
numbers using the Fermat test. The keys of the R.S.A algorithm are computed using the Euclidean
extended algorithm and stored in two registers. Random numbers used in the algorithm are
generated with linear feedback shift registers. The module is parametrized to generate keys of
16, 64 and 128 bits. The correct data flow is enforced by the control unit, implemented as a
finite state machine. A test bench is designed to check the functionality of the R.S.A keys
generation module, simulated using Verilog and Xilinx ISE 14.7. The tests show the computed
public and private keys and correct encryption and decryption for different key sizes. For the
logic synthesis, a Spartan-3E FPGA kit board is used with a clock frequency of 50 MHz. The final
results show improvements in the proposed architecture in terms of timing and area, and the
advantages of using a parameterizable design.

Keywords: Cryptography, R.S.A, Public key, Private Key, FPGA


Memorias del Congreso Internacional de Métodos Numéricos en Ingeniería y Ciencias Aplicadas, CIMENICS

1. INTRODUCTION

Programmable hardware structures, especially Field Programmable Gate Array (FPGA) devices,
are useful for implementing prototypes, offering high-performance hardware at a reduced cost in
comparison to Application-Specific Integrated Circuits (ASIC). Currently, FPGA devices are
composed of complex functional blocks that are useful for implementing complex algorithms [1].
In cryptography applications, it is possible to change the algorithm in terms of hardware,
obtaining good performance and the ability to connect to high-speed peripheral devices that
provide special functions to systems with microprocessors that do not have these features.
Cryptography plays an important role in the security of data. It enables us to store sensitive
information or transmit it across insecure networks so that unauthorized persons cannot read it.
Encryption algorithms can be classified into two groups: symmetric key algorithms (also called
private key algorithms) and asymmetric key algorithms (also called public key algorithms) [2].
An asymmetric key algorithm requires two different keys, one for encryption and the other for
decryption.
Currently, the Rivest-Shamir-Adleman (R.S.A) algorithm is the most widely accepted and
implemented public key cryptosystem. It is based on two different keys, one key for encryption
(public key) and a different but related key for decryption (private key) [3]. However, the
R.S.A algorithm has a large computational load, since it operates on large integers (typically
thousands of bits long).
Several works have addressed the hardware implementation of the R.S.A encryption algorithm.
A hardware implementation of the R.S.A encryption scheme was proposed by Deng Yuliang and Mao
Zhigang in [4], where the Montgomery algorithm is used for modular multiplication. A similar
approach was used by C.N. Zhang and Y. Xu in [5]. That design focuses on the implementation of
an R.S.A cryptographic processor using a Bit-Serial Systolic Algorithm. Another work proposed
and modeled an R.S.A public key encryption/decryption system for 128-bit key sizes using an
FPGA [6]. All of these works were implemented assuming that the public key and the private key
are already known.
This paper presents an architecture to implement a keys generation module for the R.S.A
algorithm using an FPGA. It uses a modular exponentiation module to find the prime numbers of
the algorithm using the Fermat test. The R.S.A keys are computed using the Euclidean extended
algorithm. Random numbers used in the algorithm are generated with linear feedback shift
registers (LFSR). The module is parametrized to generate key sizes of 16, 64 and 128 bits. The
correct data flow is enforced by the control unit, implemented as a finite state machine.

2. R.S.A ALGORITHM

R.S.A is a block-based encryption algorithm. This means that both the plain text and the
cipher text are numbers between 0 and (n-1). A message longer than log2(n) bits is divided
into segments of appropriate length, called blocks, which are encrypted one by one. Moreover,
as a public key cryptographic algorithm, it is based on a mathematically related pair of keys,
a public key and a private key [7]. The R.S.A algorithm is summarized in three main steps: keys
generation, encryption and decryption.

2.1 Keys generation



In this step the private and public keys are generated. "Figure 1" shows a flow diagram used
to calculate them.

Figure 1- Diagram flow to calculate R.S.A keys.

Keys generation for R.S.A starts with the selection of two prime numbers (p and q), which are
then multiplied to produce the publicly visible modulus n. The strength of the R.S.A algorithm
is based on the difficulty of factoring n to discover the original prime numbers. Hence, the
larger these primes, the harder the factorization problem becomes. Then, the Euler function
$\varphi(n) = (p-1)(q-1)$ is calculated.
Next, an integer E that is relatively prime to $\varphi(n)$ is randomly chosen as the public
key. It must satisfy that the Greatest Common Divisor (GCD) between them is equal to "1", and
the public key lies in the range from 1 to $\varphi(n)$. The private key, D, is generated by
finding the multiplicative inverse of E modulo $\varphi(n)$ using the Euclidean extended
algorithm (EEA).
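The key generation steps above can be sketched in software as follows. This is a minimal Python reference model, not the authors' Verilog design: the function name `generate_keys`, the use of Python's `random` module in place of an LFSR, and the built-in modular inverse `pow(e, -1, phi)` (Python 3.8 or later) in place of the hardware EEA are all assumptions made for illustration. The primes 149 and 41 are the 16-bit values 0x95 and 0x29 reported in Table 3.

```python
import math
import random


def generate_keys(p, q, rng=None):
    """Return (n, phi, e, d) for two distinct primes p and q (software sketch)."""
    rng = rng or random.Random()
    n = p * q                      # publicly visible modulus
    phi = (p - 1) * (q - 1)        # Euler function of n
    while True:
        e = rng.randrange(2, phi)  # random candidate with 1 < e < phi
        if math.gcd(e, phi) == 1:  # must be relatively prime to phi
            break
    d = pow(e, -1, phi)            # multiplicative inverse of e modulo phi
    return n, phi, e, d


if __name__ == "__main__":
    n, phi, e, d = generate_keys(149, 41, random.Random(1))
    print(hex(n), hex(phi), hex(e), hex(d))
```

Because the public key is drawn at random, the pair (E, D) produced here will generally differ from the one in Table 4, while n and the Euler function value depend only on p and q.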

2.2 Encryption

In the R.S.A algorithm, both the plain text (M) and the cipher text (C) are blocks with a
length of less than log2(n) bits [7]. In encryption, the cipher text is generated by "Eq. (1)".

$$C = M^{E} \bmod n \tag{1}$$

where mod is the modulus operator applied to the exponentiation $M^{E}$ and the number n.

2.3 Decryption

In the R.S.A algorithm the clear message is recovered using the private key D by applying
"Eq. (2)", where mod is the modulus operator applied to $C^{D}$ and the number n.

$$M = C^{D} \bmod n \tag{2}$$
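As a quick check of Eq. (1) and Eq. (2), the following Python sketch encrypts and decrypts a block using the 16-bit values reported later in Tables 3 and 4 (n = 0x17DD, public key 0x05B5, private key 0x02BD). The helper names `encrypt` and `decrypt` and the sample message 0x0123 are illustrative assumptions; Python's built-in three-argument `pow` stands in for the hardware modular exponentiation.

```python
# 16-bit values taken from Tables 3 and 4 of the paper (hexadecimal).
N_MODULUS = 0x17DD
PUBLIC_KEY = 0x05B5
PRIVATE_KEY = 0x02BD


def encrypt(message, public_key, n):
    """Eq. (1): C = M^E mod n."""
    return pow(message, public_key, n)


def decrypt(cipher, private_key, n):
    """Eq. (2): M = C^D mod n."""
    return pow(cipher, private_key, n)


if __name__ == "__main__":
    message = 0x0123  # hypothetical plain-text block, must be smaller than n
    cipher = encrypt(message, PUBLIC_KEY, N_MODULUS)
    recovered = decrypt(cipher, PRIVATE_KEY, N_MODULUS)
    print(hex(message), hex(cipher), hex(recovered))
```

The round trip works because the product of the two keys leaves remainder 1 when divided by the Euler function value 0x1720.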

The next section presents the design of a hardware module that implements the keys
generation step of the R.S.A algorithm.

3. R.S.A KEYS GENERATION MODULE DESIGN

The R.S.A keys generation module is designed to obtain the public key and the private key. It
is parametrized for key sizes of "N" bits (16 bits, 64 bits and 128 bits). The description of
the port interface is shown in "Table 1".

Table 1. Port interfaces of R.S.A keys generation module

Port Name    Direction   Size (bits)   Description
clk_i        Input       1             Clock control signal. All signal timings are related to the rising edge of "clk".
reset_i      Input       1             Asynchronous reset signal. It is active LOW and resets the whole module.
start_i      Input       1             Used to start the module operation. When HIGH, the module calculates the R.S.A keys. LOW indicates that the module does not generate any key.
public_key   Output      N             Public key generated using the R.S.A algorithm. The size is parametrized for 16 bits, 64 bits and 128 bits.
n_number     Output      N             Product of the two prime numbers generated using the R.S.A algorithm. The size is parametrized for 16 bits, 64 bits and 128 bits.
valid_eu     Output      1             Flag to indicate that the module has generated valid keys.

3.1 Data-path design

"Figure 2" shows the data-path proposed for the R.S.A keys generation module. Control signals
come from the control unit to ensure correct data flow.

Figure 2- Data-path of R.S.A keys generation module.



Two LFSR modules are used in the design to generate random numbers. The first one generates
random candidates that are tested for primality. A Modular Exponentiation module is used to
implement the Fermat test [8]. When two prime numbers have been found, they are stored in two
registers (p_number and q_number). Then, the Euler function and the product of the prime
numbers are calculated and stored in registers (fi_number and n_number). The second LFSR is
used to generate random public keys for the Euclidean Extended module. When a random public key
satisfies the conditions of the Euclidean algorithm, the public and private keys are calculated
and stored in registers (public_key and private_key). The flag valid_eu is used to indicate
that the public key (n_number and public_key) and the private key have been computed
successfully.
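The primality search described above can be modeled in software as shown below. This is only a sketch under stated assumptions: the Fermat test checks whether $a^{c-1} \bmod c = 1$ for a candidate c, but the set of bases (2, 3, 5, 7), the early rejection of candidates divisible by a base, and the helper name `first_probable_primes` are choices made here for illustration and are not taken from the paper. A plain Python iterable stands in for the LFSR candidate stream.

```python
def fermat_is_probable_prime(candidate, bases=(2, 3, 5, 7)):
    """Fermat test: a prime c satisfies a^(c-1) mod c == 1 for every base a."""
    if candidate < 2:
        return False
    for a in bases:
        if candidate == a:
            return True
        if candidate % a == 0:                      # shares a factor with the base
            return False
        if pow(a, candidate - 1, candidate) != 1:   # Fermat witness found
            return False
    return True


def first_probable_primes(candidates, count=2):
    """Collect the first `count` distinct candidates that pass the Fermat test."""
    found = []
    for c in candidates:
        if c not in found and fermat_is_probable_prime(c):
            found.append(c)
            if len(found) == count:
                break
    return found


if __name__ == "__main__":
    print(first_probable_primes(range(140, 200)))
```

The Fermat test is probabilistic: some composite numbers (Carmichael numbers whose prime factors are all larger than the bases used) pass it for every base, so a passing candidate is only a probable prime.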

LFSR modules design. A linear feedback shift register is used to generate random numbers.
In theory, an N-bit LFSR can generate a pseudo-random sequence of length $2^{N}-1$ before
repeating [9]. This module is parametrized to implement an N-bit LFSR for the random public key
and an (N/2)-bit LFSR for the generation of prime numbers. Combinational logic using exclusive
OR gates, AND gates and a shift operator is implemented in the feedback loop of the LFSR. The
structure of the N-bit LFSR is shown in "Fig. 3".

Figure 3- LFSR module for N-bits
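A software model of such a register is sketched below. It assumes a Fibonacci-style LFSR in which a tap mask selects bits (the AND gates), their parity forms the feedback (the XOR gates) and the register shifts right by one position. The tap masks are assumptions, since the paper does not list its feedback polynomials: 0x002D corresponds to the widely used 16-bit polynomial $x^{16}+x^{14}+x^{13}+x^{11}+1$, and 0b0011 to $x^{4}+x^{3}+1$.

```python
def lfsr_step(state, tap_mask, width):
    """Advance a Fibonacci LFSR by one shift.

    The AND with tap_mask selects the tapped bits, the parity of the
    selection is the XOR feedback, and the feedback enters at the MSB.
    """
    feedback = bin(state & tap_mask).count("1") & 1
    return (state >> 1) | (feedback << (width - 1))


def lfsr_period(seed, tap_mask, width):
    """Count the shifts needed for the register to return to its seed."""
    state = lfsr_step(seed, tap_mask, width)
    steps = 1
    while state != seed:
        state = lfsr_step(state, tap_mask, width)
        steps += 1
    return steps


if __name__ == "__main__":
    print(lfsr_period(0b0001, 0b0011, 4))    # 4-bit register
    print(lfsr_period(0xACE1, 0x002D, 16))   # 16-bit register
```

With a maximal-length polynomial the register visits every non-zero state once per period, and the all-zero state must be avoided as a seed because it maps to itself.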

Modular exponentiation module. "Figure 4" shows the hardware structure used for the modular
exponentiation module.

Figure 4- Modular exponentiation module



Modular exponentiation for large numbers is considerably difficult to compute. However, this
operation can be simplified into a series of modular multiplications and squaring operations
[9][10]. In this algorithm the bits of the exponent are scanned either from Left to Right (LR)
or from Right to Left (RL). In the LR method, which is commonly used, if the scanned bit is
logic "zero" only a squaring operation is performed; if the scanned bit is logic "one", the
squaring is followed by a multiplication by the base number "A". This operation is performed
k times, where "k" is the modulus length [10][11]. The modulus operation is implemented using
restoring hardware dividers [12].
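The LR square-and-multiply scheme and the restoring reduction can be modeled in Python as follows. This is a behavioural sketch, not the hardware structure of Figure 4: the function names `restoring_mod` and `modexp_lr` are hypothetical, and the bit widths passed to the divider are assumptions chosen so that every intermediate product fits.

```python
def restoring_mod(value, modulus, width):
    """Remainder of value / modulus by bit-serial restoring division.

    `width` is the number of bits of `value` that are shifted in.
    """
    remainder = 0
    for i in reversed(range(width)):
        remainder = (remainder << 1) | ((value >> i) & 1)  # shift in next bit
        trial = remainder - modulus                        # trial subtraction
        if trial >= 0:
            remainder = trial                              # keep the difference
        # otherwise the previous remainder is "restored" (left unchanged)
    return remainder


def modexp_lr(base, exponent, modulus):
    """Left-to-right binary method for base^exponent mod modulus."""
    k = modulus.bit_length()
    bits = max(k, exponent.bit_length())   # scan at least k exponent bits
    base = restoring_mod(base, modulus, max(base.bit_length(), 1))
    result = 1 % modulus
    for i in reversed(range(bits)):
        result = restoring_mod(result * result, modulus, 2 * k)    # square
        if (exponent >> i) & 1:
            result = restoring_mod(result * base, modulus, 2 * k)  # multiply
    return result


if __name__ == "__main__":
    # Fermat check for the 16-bit prime 0x95 from Table 3, base 2.
    print(modexp_lr(2, 0x95 - 1, 0x95))
```

Leading zero bits of the exponent only square the initial value 1, so scanning a fixed k bits gives the same result as scanning from the first set bit.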

Euclidean extended module. The Euclidean extended algorithm is an extension of the Euclidean
algorithm that computes the GCD of two integer numbers (a and b) together with the coefficients
(x and z) shown in "Eq. (3)". When GCD(a,b) is equal to "1", the coefficient "z" represents the
multiplicative inverse of the number "b" modulo "a" [13].

$$a\,x + b\,z = \mathrm{GCD}(a,b) \tag{3}$$

This module takes as inputs the Euler function and a random public key generated by the LFSR.
When the conditions shown in "Fig. 1" are satisfied, the private key is computed as the
multiplicative inverse of the random public key and stored in registers. "Figure 5" shows the
pseudo code of the Euclidean extended algorithm implemented in hardware.

Figure 5- Euclidean extended algorithm for compute private key
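Since the pseudo code of Figure 5 is not reproduced in the text, the following Python sketch shows one common iterative formulation of the Euclidean extended algorithm that satisfies Eq. (3). It is an illustration rather than the authors' hardware algorithm, and the function names `extended_euclid` and `private_key` are hypothetical. The check uses the 16-bit Euler function value from Table 3 and the keys from Table 4.

```python
def extended_euclid(a, b):
    """Return (g, x, z) such that a*x + b*z == g == GCD(a, b)."""
    x0, x1 = 1, 0
    z0, z1 = 0, 1
    while b != 0:
        quotient, remainder = divmod(a, b)
        a, b = b, remainder
        x0, x1 = x1, x0 - quotient * x1
        z0, z1 = z1, z0 - quotient * z1
    return a, x0, z0


def private_key(phi, public_key):
    """Return the inverse of public_key modulo phi, or None if GCD != 1."""
    g, _, z = extended_euclid(phi, public_key)
    if g != 1:
        return None      # candidate rejected, a new random key is needed
    return z % phi       # map the coefficient into the range 0..phi-1


if __name__ == "__main__":
    print(hex(private_key(0x1720, 0x05B5)))
```

The coefficient z can be negative, so it is reduced modulo the Euler function value before being used as the private key.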

3.2 Control unit design

The control unit is implemented as a finite state machine (FSM) that controls the data flow
shown in "Fig. 2". A state diagram of the FSM is shown in "Fig. 6". "Table 2" shows the port
interface of the designed FSM.

Figure 6- State diagram of Control Unit FSM

When the input reset_i is asserted, all registers and modules of the data-path go to their
initial conditions and the FSM stays in the IDLE state. When reset_i is deasserted, the input
start_i starts the operation of the module. The GENERATE_PRIMES state is used to calculate two
prime numbers. An internal counter checks whether two prime numbers have been generated using
the Fermat test in the data-path, and their product is then stored in registers. Next, in the
CALCULATE_EULER state, the Euler function is stored and then used to calculate the public and
private keys. The ENABLE_LFSR_KEY state generates a random number to be used as a possible
public key, and the CHECK_KEY state verifies whether the random number falls within the range
of possible values for the public key. Next, the GENERATE_KEYS state enables the module that
calculates the public and private keys using the Euclidean extended algorithm. Finally, in the
END state the keys are stored in registers and a flag is set HIGH to indicate that the task has
ended.
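The state sequence described above can be sketched as a next-state function. The state names come from the text, but the exact transition conditions are assumptions, since the state diagram of Figure 6 is not reproduced here: in particular, the return to ENABLE_LFSR_KEY when a candidate key is rejected, the `key_in_range` input and the END state holding until reset are guesses made for illustration.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    GENERATE_PRIMES = 1
    CALCULATE_EULER = 2
    ENABLE_LFSR_KEY = 3
    CHECK_KEY = 4
    GENERATE_KEYS = 5
    END = 6


def next_state(state, reset_i=1, start_i=0, counter=0,
               key_in_range=False, done_eu=0, valid_eu=0):
    """Return the next FSM state (transition conditions are assumed)."""
    if reset_i == 0:                       # active-LOW reset forces IDLE
        return State.IDLE
    if state is State.IDLE:
        return State.GENERATE_PRIMES if start_i else State.IDLE
    if state is State.GENERATE_PRIMES:     # wait until two primes are found
        return State.CALCULATE_EULER if counter == 2 else State.GENERATE_PRIMES
    if state is State.CALCULATE_EULER:
        return State.ENABLE_LFSR_KEY
    if state is State.ENABLE_LFSR_KEY:
        return State.CHECK_KEY
    if state is State.CHECK_KEY:           # out-of-range candidates are redrawn
        return State.GENERATE_KEYS if key_in_range else State.ENABLE_LFSR_KEY
    if state is State.GENERATE_KEYS:
        if not done_eu:                    # Euclidean module still running
            return State.GENERATE_KEYS
        return State.END if valid_eu else State.ENABLE_LFSR_KEY
    return State.END                       # assumed: END holds until reset


if __name__ == "__main__":
    s = State.IDLE
    for inputs in ({"start_i": 1}, {"counter": 2}, {}, {},
                   {"key_in_range": True}, {"done_eu": 1, "valid_eu": 1}):
        s = next_state(s, **inputs)
        print(s.name)
```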

Table 2. Port interface of Control Unit FSM

Port Name      Direction   Size (bits)   Description
result         Input       N             Indicates the result of modular exponentiation. It is used with the Fermat test.
done_exp       Input       1             When HIGH indicates the end of modular exponentiation.
start_primes   Input       1             It is used to enable the FSM to calculate prime numbers.
counter        Input       2             Indicates when two prime numbers have been computed using the Fermat test.
fi_n_reg       Input       N             Register used to store the Euler function.
random_key     Input       N             Random number used as a possible public key in the Euclidean extended algorithm.
valid_eu       Input       1             When HIGH indicates that a valid private key has been computed.
done_eu        Input       1             When HIGH indicates the end of the Euclidean extended module task.
enable_prime   Output      1             When HIGH enables the LFSR for random numbers used to compute prime numbers.
enable_exp     Output      1             When HIGH enables the modular exponentiation module.
enable_key     Output      1             When HIGH enables the LFSR for random numbers used as a possible public key.
enable_eu      Output      1             When HIGH enables the Euclidean extended module.
store_fi_n     Output      1             When HIGH stores the Euler function in a register.
valid_prime    Output      1             When HIGH indicates that a prime number has been computed.

4. RESULTS

This section shows prime numbers computed using Fermat test, Euler function number and
Euclidean extended module performance to calculate public key and private key. The data-path
of all modules, control unit and test bench are described using Verilog language and simulated
using Xilinx ISE 14.7. A Spartan-3E board is used to synthesize the code. Results are shown in
hexadecimal numbers.

4.1 Prime numbers and Euler function simulation

The Fermat test is implemented using the modular exponentiation module. "Figure 7" shows the
test bench simulation for the 128-bit size, computing the prime numbers (p_number and
q_number), the Euler function number (fi_n_reg) and the product of the prime numbers
(n_number). "Table 3" presents these values for the 16-bit and 64-bit sizes.

Figure 7- Time diagram for test bench to calculate prime numbers

Table 3. Prime numbers calculated for 16-bits and 64-bits size

N-bits size   p_number   q_number   n_number           fi_n_reg
16            95         29         17DD               1720
64            6CE74E9D   4EBD374D   21715B557FBF6039   21715B54C43ADA50

4.2 Euclidean extended module simulation

"Figure 8" shows the test bench for the 128-bit public key and private key using the Euclidean
extended algorithm. When the GCD of the fi_n_reg and random_key registers is "1", a private key
is calculated and the valid_eu flag is high. The flag done_eu is high to indicate that the task
of the Euclidean extended module is finished. "Table 4" shows the calculated 16-bit and 64-bit
keys.

Figure 8- Time diagram for test bench to calculate public and private key

Table 4. Public and private key calculated for 16-bits and 64-bits size

N-bits size   Public key         Private key
16            05B5               02BD
64            0F07C000FDDA8A25   0AA9A4E7CB35E33D

4.3 Synthesis results

"Table 5" shows the area (slices) and timing (maximum clock frequency) summary with the
estimated synthesis values for the R.S.A keys generation module. The FPGA chip used is a Xilinx
Spartan-3E xc3s500e-5fg320 with a clock frequency of 50 MHz.

Table 5. Summary with estimated values for R.S.A keys generation module

N-bits size   Number of Slices   Number of Slice Flip Flops   Maximum Frequency
16            858                1103                         79.18 MHz
64            7628               4167                         40.09 MHz
128           30998              7917                         28.49 MHz

It is clear that as the number of bits increases, the maximum achievable frequency decreases,
because obtaining prime numbers and encryption/decryption keys becomes more complex with the
proposed hardware architecture. For the 16-bit and 64-bit key sizes, the number of slices
allows this module to be implemented on the selected Xilinx FPGA; however, for the 128-bit key
size the slices used exceed those available, so another FPGA chip must be selected.

5. CONCLUSIONS

In this paper, a hardware architecture to generate the public and private keys of the R.S.A
algorithm is presented. A fast modular exponentiation module is used to implement a primality
test, in this case the Fermat test, to obtain two prime numbers. The Euclidean extended
algorithm is implemented in hardware to solve a Diophantine equation involving the GCD of the
Euler function number and the public key, where the multiplicative inverse of the public key
represents the private key. The R.S.A keys generation module is parametrized for key sizes of
16, 64 and 128 bits.
The proposed hardware architecture is implemented in Verilog, targeting a Xilinx Spartan-3E
xc3s500e-5fg320. The whole design is tested using the Xilinx ISE 14.7 tool, showing correct
functionality in obtaining the prime numbers, the Euler function number and the R.S.A keys.
Synthesis results show that the 128-bit implementation requires more than 100% of the resources
of the selected FPGA board, so another FPGA board with more resources is necessary for hardware
implementation; however, the 64-bit and 16-bit key sizes are implemented satisfactorily using
the selected kit board.
Nowadays, for security reasons, commercial implementations require key sizes of 1024 and 2048
bits, so the parameterizable nature of the proposed hardware architecture could be used to
implement them on advanced FPGA kit boards with specialized blocks that reduce area and improve
maximum frequency.

Acknowledgements

We would like to express our special thanks and appreciation to the Electronics and
Communications Department of the Electrical Engineering School at the University of Los Andes
for their support throughout this work.

REFERENCES

[1].Huffmire, T., Irvine, C., Nguyen, T., Levin, T., Kastner, R., & Sherwood, T., FPGA
Updates and Programmability, pp. 87-96. Springer Science+Business Media, 2010.

[2].Schneier, B., Applied Cryptography: Protocols, Algorithms, and Source Code in C, pp.
200-210. John Wiley & Sons, 1996.

[3].Rivest, R., Shamir, A., & Adleman, L., A Method for Obtaining Digital Signatures and
Public-Key Cryptosystems. Communications of the ACM, vol. 21, n. 2, pp. 120-126,
1978.

[4].Deng Y., Mao Z., & Ye Y., Implementation of RSA Crypto-Processor Based on
Montgomery Algorithm. Fifth International Conference on Solid-State and Integrated
Circuit Technology, pp. 524-526, 1998.

[5].Zhang, C., Xu, Y., & Wu, C., A Bit-Serial Systolic Algorithm and VLSI Implementation
for RSA. IEEE Communications, Computers and Signal Processing, vol. 2, pp. 523-
526, 1997.

[6].Sushanta, K., & Manoranjan, P., FPGA Implementation of RSA Encryption System.
International Journal of Computer Applications, vol. 19, n. 9, pp. 10-12, 2011.

[7].Prasu, G., Malabika, B., & Biswas, M., Hardware Implementation of TDES Crypto
System with On Chip Verification in FPGA. Journal of Telecommunications, vol. 1, n.
1, pp. 113-117, 2010.

[8].Henk, C., & Sushil, J., Encyclopedia of Cryptography and Security, pp. 455-456.
Springer, 2011.

[9].Chiranth, E., Chakravarthy, H., Nagamohanareddy, P., Umesh, T., & Chethan, M.,
Implementation of RSA Cryptosystem Using Verilog. International Journal of
Scientific & Engineering Research, vol. 2, n. 5, pp. 1-7, 2011.

[10].Muhammad, I., Mamun, B., Reaz, H., & Sazzad, H., FPGA Implementation of RSA
Encryption Engine with Flexible Key Size. International journal of communication, vol.
1, n. 3, pp. 107-113, 2007.

[11].Vibhor, G., Aruna, V., Architectural analysis of RSA crypto system on FPGA.
International Journal of Computer Applications, vol. 26, n. 8, pp. 30-34, 2011.

[12].Soderquist, P. & Leeser, M., Division and square root: choosing the right
implementation. IEEE Micro, vol. 17, n. 4, pp. 56-66, 1997.

[13].Cormen, T., Leiserson, C., Rivest, R., & Stein, C., Introduction to Algorithms, pp. 859-
861. MIT Press and McGraw-Hill, 2001.
