
Cryptography and Network Security

Spring 2006
http://www.abo.fi/~ipetre/crypto/

Lecture 2: Classical encryption

Ion Petre
Academy of Finland and
Department of IT, Åbo Akademi University

March 23, 2006 1


Part I. Cryptography

• Will cover more than half of this course

• I.1 Secret-key cryptography
  - Also called symmetric or conventional cryptography
  - Five ingredients
    • Plaintext
    • Encryption algorithm: runs on the plaintext and the encryption key to yield the ciphertext
    • Secret key: an input to the encryption algorithm whose value is independent of the plaintext; different keys yield different outputs
    • Ciphertext: the scrambled text produced as output by the encryption algorithm
    • Decryption algorithm: runs on the ciphertext and the key to recover the plaintext
  - Requirements for secure conventional encryption
    • Strong encryption algorithm
      - An opponent who knows one or more ciphertexts should not be able to find the plaintexts or the key
      - Ideally, even if he knows one or more plaintext-ciphertext pairs, he should not be able to find the key
    • Sender and receiver must share the same key. Once the key is compromised, all communications using that key are readable
    • It must be impractical to decrypt a message given only the ciphertext and knowledge of the encryption algorithm → the encryption algorithm does not need to be secret



Cryptography – some notations

• Notation for relating the plaintext, the ciphertext, and the key
  - C = E_K(P) denotes that C is the encryption of the plaintext P using the key K
  - P = D_K(C) denotes that P is the decryption of the ciphertext C using the key K
  - Then D_K(E_K(P)) = P



Caesar Cipher

• A typical substitution cipher and the oldest known one, attributed to Julius Caesar
• Simple rule: replace each letter of the alphabet with the letter standing 3 places further down the alphabet
• Example:
    plaintext:  MEET ME AFTER THE TOGA PARTY
    ciphertext: PHHW PH DIWHU WKH WRJD SDUWB
• Here the key is 3; choosing another key gives a different substitution
• The alphabet wraps around, so that A follows Z:
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    D E F G H I J K L M N O P Q R S T U V W X Y Z A B C



Caesar cipher

• Mathematically, give each letter a number:

     a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

• The key k is a number from 1 to 25 (a key of 0 would leave the text unchanged)
• With p the number of a plaintext letter and C the number of a ciphertext letter, the Caesar cipher can now be given as
  - E(p) = (p + k) mod 26
  - D(C) = (C - k) mod 26
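
The two formulas translate almost literally into code. The following is a minimal sketch in Python; the function names `caesar_encrypt` and `caesar_decrypt` are illustrative choices, and the decision to pass blanks and punctuation through unchanged is an assumption of this sketch, not part of the cipher's definition.

```python
import string

ALPHABET = string.ascii_uppercase  # A=0, B=1, ..., Z=25


def caesar_encrypt(plaintext: str, k: int) -> str:
    """Shift every letter k places down the alphabet: E(p) = (p + k) mod 26."""
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            p = ALPHABET.index(ch)
            out.append(ALPHABET[(p + k) % 26])
        else:
            out.append(ch)  # assumption: blanks and punctuation stay as they are
    return "".join(out)


def caesar_decrypt(ciphertext: str, k: int) -> str:
    """Undo the shift: D(C) = (C - k) mod 26."""
    return caesar_encrypt(ciphertext, -k)


print(caesar_encrypt("MEET ME AFTER THE TOGA PARTY", 3))
# PHHW PH DIWHU WKH WRJD SDUWB
```

Decryption reuses the encryption routine with the negated key, which works because Python's `%` operator always returns a non-negative result for a positive modulus.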



Attacking Caesar

• Caesar can be broken if we know just one pair (plain letter, encrypted letter)
  - The difference between them (mod 26) is the key
• Caesar can be broken even if we only have the encrypted text and no knowledge of the plaintext
  - A brute-force attack is easy: there are only 25 possible keys
  - Try all 25 keys and check which one gives an intelligible message
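
Both attacks fit in a few lines. The sketch below is only an illustration: the helper names `shift`, `brute_force` and `key_from_known_pair` are made up here, and recognizing the intelligible candidate is left to the human reader.

```python
def shift(text: str, k: int) -> str:
    """Shift the letters A-Z of text by k positions (mod 26)."""
    return "".join(
        chr((ord(ch) - ord("A") + k) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text.upper()
    )


def brute_force(ciphertext: str) -> dict[int, str]:
    """Return the candidate plaintext for each of the 25 possible keys."""
    return {k: shift(ciphertext, -k) for k in range(1, 26)}


def key_from_known_pair(plain_letter: str, cipher_letter: str) -> int:
    """A single (plain letter, encrypted letter) pair reveals the key."""
    return (ord(cipher_letter.upper()) - ord(plain_letter.upper())) % 26


candidates = brute_force("PHHW PH DIWHU WKH WRJD SDUWB")
for k, text in candidates.items():
    print(f"{k:2d}  {text}")
```

Scanning the 25 printed lines, only the one for key 3 reads as English.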



Why is Caesar easy to break?

• Only 25 keys to try
• The language of the plaintext is known and easily recognizable
  - What if the language is unknown?
  - What if the plaintext is a binary file of an unknown format?

Figure from Stallings, “Cryptography and Network Security”



Strengthening Caesar: monoalphabetic ciphers

• Caesar has only 25 possible keys: far from secure
• Idea: instead of shifting the letters by a fixed amount, allow any permutation of the alphabet

    Plain:  abcdefghijklmnopqrstuvwxyz
    Cipher: DKVQFIBJWPESCXHTMYAUOLRGZN

    Plaintext:  if we wish to replace letters
    Ciphertext: WI RF RWAJ UH YFTSDVF SFUUFYA

• This is called a monoalphabetic substitution cipher: a single cipher alphabet is used
• The increase in the number of keys is dramatic: 26!, i.e., more than 4 x 10^26 possible keys
• Compare: DES only has on the order of 10^16 possible keys (2^56 ≈ 7.2 x 10^16)
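
A monoalphabetic cipher is a lookup table, so a sketch in Python can lean on `str.maketrans` and `str.translate`. The permutation below is the one from the slide; the names `mono_encrypt` and `mono_decrypt` are illustrative.

```python
import math
import string

PLAIN = string.ascii_lowercase
CIPHER = "DKVQFIBJWPESCXHTMYAUOLRGZN"  # the key: a permutation of the alphabet

ENC = str.maketrans(PLAIN, CIPHER)
DEC = str.maketrans(CIPHER, PLAIN)


def mono_encrypt(plaintext: str) -> str:
    """Replace each plaintext letter by its fixed substitute."""
    return plaintext.lower().translate(ENC)


def mono_decrypt(ciphertext: str) -> str:
    """Apply the inverse permutation."""
    return ciphertext.upper().translate(DEC)


print(mono_encrypt("if we wish to replace letters"))
# WI RF RWAJ UH YFTSDVF SFUUFYA
print(math.factorial(26))  # size of the key space, 26!
```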



How large is large?
Reference                                 Order of magnitude
----------------------------------------  ----------------------
Seconds in a year                         ≈ 3 x 10^7
Age of our solar system (years)           ≈ 6 x 10^9
Seconds since creation of solar system    ≈ 2 x 10^17
Clock cycles per year, 3 GHz computer     ≈ 9.6 x 10^16
Binary strings of length 64               2^64 ≈ 1.8 x 10^19
Binary strings of length 128              2^128 ≈ 3.4 x 10^38
Binary strings of length 256              2^256 ≈ 1.2 x 10^77
Number of 75-digit prime numbers          ≈ 5.2 x 10^72
Electrons in the universe                 ≈ 8.37 x 10^77

Adapted from Handbook of Applied Cryptography (A. Menezes, P. van Oorschot, S. Vanstone), 1996



Monoalphabetic ciphers

• Having more than 4 x 10^26 possible keys appears to make the system challenging: brute-force attacks are impractical
• There is, however, another line of attack that easily defeats the system even when only a relatively short ciphertext is known
  - If the cryptanalyst knows the nature of the text, e.g., uncompressed English text, then he can exploit the regularities of the language



Language redundancy and cryptanalysis

• Human languages are redundant
• Letters are not equally commonly used
  - In English, E is by far the most common letter
  - Then follow T, R, N, I, O, A, S
  - Other letters are fairly rare
    • See Z, J, K, Q, X
• Tables of single, double and triple letter frequencies exist
  - The most common digram in English is TH
  - The most common trigram in English is THE



English Letter Frequencies



Cryptanalysis of monoalphabetic ciphers

• Key concept: monoalphabetic substitution ciphers do not change relative letter frequencies
  - Discovered by Arab scholars in the 9th century
• Calculate letter frequencies for the ciphertext
• Compare the counts/plots against known values
  - The most frequent letter in the ciphertext may well encrypt E
  - The next one could encrypt T or A
  - After relatively few tries the system is broken
  - If the ciphertext is relatively short (so that its frequencies are less reliable), more guesses may be needed
• Powerful tool: look at the frequency of two-letter combinations (digrams)



Example of cryptanalysis

• Ciphertext:
    UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZVUEPHZHMDZSHZOWSFPAPPDTSVPQUZ
    WYMXUZUHSXEPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
• Count relative letter frequencies: P is the most frequent (13.33%), followed by Z (11.67%), S (8.33%), U (8.33%), O (7.5%), M (6.67%), H (5.83%), etc.
  - Guess that P and Z stand for E and T, but the order is not clear because of the small difference in frequency
  - The next set of letters {S, U, O, M, H} may stand for letters from {A, H, I, N, O, R, S}, but again it is not completely clear which is which
  - One may try a guess and see how the text translates
  - Also, a good guess is that ZW, the most common digram in the ciphertext, is TH, the most common digram in English: thus ZWP is THE
• Proceed with trial and error and, after inserting the proper blanks, finally get:
    it was disclosed yesterday that several informal but direct contacts have
    been made with political representatives of the viet cong in moscow
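
The counting step of this analysis is mechanical and can be sketched with `collections.Counter`. The function names `letter_frequencies` and `digram_counts` are illustrative; the guessing that follows the counts is not automated here.

```python
from collections import Counter

# The 120-letter ciphertext of the example, joined into one string.
CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZVUEPHZHMDZSHZOWSFPAPPDTSVPQUZ"
    "WYMXUZUHSXEPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)


def letter_frequencies(text: str) -> list[tuple[str, float]]:
    """Relative letter frequencies in percent, most frequent first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    counts = Counter(letters)
    return [(ch, 100 * n / len(letters)) for ch, n in counts.most_common()]


def digram_counts(text: str) -> Counter:
    """Count all overlapping two-letter combinations."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return Counter(a + b for a, b in zip(letters, letters[1:]))


for letter, percent in letter_frequencies(CIPHERTEXT)[:7]:
    print(f"{letter}: {percent:.2f}%")
print(digram_counts(CIPHERTEXT).most_common(3))
```

The first two printed lines are P at 13.33% and Z at 11.67%, which is the starting point for the guesses E and T.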



Some conclusions after this cryptanalysis

• Monoalphabetic ciphers are easy to break because they reflect the letter frequencies of the original alphabet
  - It is essential to know the original alphabet
• Countermeasure: provide multiple substitutes for a given letter
  - Highly frequent letters such as E could be encrypted using a larger number of symbols than less frequent letters such as Z: to encrypt E one could choose any one of, say, 15 fixed symbols, and to encrypt Z any one of, say, 2 fixed symbols
  - The number of substitutes for a letter may be proportional to its frequency in the original language (English)
  - This should (intuitively) hide the frequency information
  - Wrong: multiple-letter patterns (digrams, trigrams, etc.) survive in the ciphertext, providing a tool for cryptanalysis
    • Each element of the plaintext only affects one element of the ciphertext
    • Longer text is needed for breaking the system



Measures to hide the structure of the plaintext

1. Encrypt multiple letters of the plaintext at once
2. Use more than one substitution in encryption and decryption (polyalphabetic ciphers)

• We consider both approaches in the following



Playfair Cipher

• The Playfair cipher is an example of multiple-letter encryption
• Invented by Sir Charles Wheatstone in 1854, but named after his friend Baron Playfair, who championed the cipher at the British Foreign Office
• Based on a 5x5 matrix in which the letters of the alphabet are written (I and J are treated as the same letter)
  - This is called the key matrix



Playfair key matrix

• A 5x5 matrix of letters based on a keyword
• Fill in the letters of the keyword (dropping duplicates)
  - Left to right, top to bottom
  - Fill the rest of the matrix with the remaining letters in alphabetical order
• E.g., using the keyword MONARCHY, we obtain the following matrix (I stands for both I and J):
    M O N A R
    C H Y B D
    E F G I K
    L P Q S T
    U V W X Z
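
The filling procedure can be sketched as follows. The function name `playfair_matrix` is an illustrative choice; merging J into I follows the convention stated for the key matrix.

```python
import string


def playfair_matrix(keyword: str) -> list[list[str]]:
    """Build the 5x5 key matrix: keyword letters first (no duplicates),
    then the remaining letters in alphabetical order. J is merged into I."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return [seen[row * 5:row * 5 + 5] for row in range(5)]


for row in playfair_matrix("MONARCHY"):
    print(" ".join(row))
```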



Encrypting and decrypting with Playfair

• The plaintext is encrypted two letters at a time:
  1. Break the plaintext into pairs of consecutive letters
  2. If a pair would consist of a repeated letter, insert a filler such as 'X' between the two, e.g. "balloon" is treated as "ba lx lo on"
  3. If both letters fall in the same row of the key matrix, replace each with the letter to its right (wrapping around from the end of the row to its start), e.g. "AR" encrypts to "RM"
  4. If both letters fall in the same column, replace each with the letter below it (again wrapping around from bottom to top), e.g. "MU" encrypts to "CM"
  5. Otherwise, each letter is replaced by the letter in its own row and in the column of the other letter of the pair, e.g. "HS" encrypts to "BP", and "EA" to "IM" or "JM" (as desired)
• Decryption works in the reverse direction
• The examples above are based on this key matrix:
    M O N A R
    C H Y B D
    E F G I K
    L P Q S T
    U V W X Z
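
The five rules above can be sketched as follows. All function names are illustrative. Padding a single leftover letter at the end with the filler is an assumption of this sketch (the rules above do not cover odd-length input), and a plaintext containing a doubled filler letter such as "xx" is not handled.

```python
import string


def build_matrix(keyword: str) -> list[str]:
    """Key matrix as a flat list of 25 letters, row by row (J merged into I)."""
    cells = []
    for ch in keyword.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch in string.ascii_uppercase and ch not in cells:
            cells.append(ch)
    return cells


def split_pairs(plaintext: str, filler: str = "X") -> list[str]:
    """Rules 1-2: split into pairs, inserting a filler between repeated letters."""
    letters = [("I" if ch == "J" else ch) for ch in plaintext.upper()
               if ch in string.ascii_uppercase]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != a:
            pairs.append(a + letters[i + 1])
            i += 2
        else:
            # Repeated letter, or one letter left over at the end
            # (padding the final letter is an assumption of this sketch).
            pairs.append(a + filler)
            i += 1
    return pairs


def encrypt_pair(cells: list[str], pair: str) -> str:
    """Rules 3-5 for a single pair of letters."""
    (r1, c1), (r2, c2) = (divmod(cells.index(ch), 5) for ch in pair)
    if r1 == r2:      # same row: take the letter to the right
        c1, c2 = (c1 + 1) % 5, (c2 + 1) % 5
    elif c1 == c2:    # same column: take the letter below
        r1, r2 = (r1 + 1) % 5, (r2 + 1) % 5
    else:             # rectangle: keep the row, take the other letter's column
        c1, c2 = c2, c1
    return cells[r1 * 5 + c1] + cells[r2 * 5 + c2]


def playfair_encrypt(plaintext: str, keyword: str) -> str:
    cells = build_matrix(keyword)
    return "".join(encrypt_pair(cells, p) for p in split_pairs(plaintext))


print(split_pairs("balloon"))                   # ['BA', 'LX', 'LO', 'ON']
print(playfair_encrypt("balloon", "MONARCHY"))  # IBSUPMNA
```

Decryption would use the same structure with the shifts in rules 3 and 4 reversed (left and up instead of right and down); rule 5 is its own inverse.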



Security of Playfair

• Security much improved over monoalphabetic ciphers
  - There are 26 x 26 = 676 digrams
    • A 676-entry digram frequency table is needed for the analysis (vs. 26 entries for a monoalphabetic cipher), and correspondingly more ciphertext
• Widely used for many years (e.g., by the US and British military in WW I and by other Allied forces in WW II)
• Can be broken, given a few hundred letters
  - The ciphertext still retains much of the plaintext structure





Polyalphabetic substitution ciphers

• Idea: use different monoalphabetic substitutions as one proceeds through the plaintext
• Makes cryptanalysis harder: there are more alphabets (substitutions) to guess, and the frequency distribution is flattened
• A key determines which particular substitution is used at each step
• Example: the Vigenère cipher



Vigenère Cipher

• Proposed by Giovan Battista Bellaso (1553) and reinvented by Blaise de Vigenère (1586); called “le chiffre indéchiffrable” for 300 years
• Effectively multiple Caesar ciphers
• The key is a word K = k_1 k_2 ... k_d
• Encryption
  - Read one letter t from the plaintext and one letter k from the key
  - t is encrypted with the Caesar cipher whose key is k (a = 0, b = 1, ..., z = 25)
  - When the end of the key word is reached, continue reading the key from its beginning
• Decryption works in reverse
• Example: the key is “bcde”; “testing” is encrypted as “ugvxjpj”
  - Note that the two ‘t’ are encrypted by different letters: ‘u’ and ‘x’
  - The two ‘j’ in the cryptotext come from different plain letters: ‘i’ and ‘g’
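
The encryption rule can be sketched in a few lines of Python. The function name `vigenere` and its `decrypt` flag are illustrative; the sketch assumes that text and key consist only of the letters a to z.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Apply to each letter of text the Caesar shift given by the
    corresponding key letter; the key is repeated cyclically."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text.lower()):
        shift = ord(key[i % len(key)].lower()) - ord("a")
        out.append(chr((ord(ch) - ord("a") + sign * shift) % 26 + ord("a")))
    return "".join(out)


print(vigenere("testing", "bcde"))                # ugvxjpj
print(vigenere("ugvxjpj", "bcde", decrypt=True))  # testing
```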



Vigenère tableau

Plaintext letters run across the top, key letters down the left side.

    ABCDEFGHIJKLMNOPQRSTUVWXYZ

A   ABCDEFGHIJKLMNOPQRSTUVWXYZ
B   BCDEFGHIJKLMNOPQRSTUVWXYZA
C   CDEFGHIJKLMNOPQRSTUVWXYZAB
D   DEFGHIJKLMNOPQRSTUVWXYZABC
E   EFGHIJKLMNOPQRSTUVWXYZABCD
F   FGHIJKLMNOPQRSTUVWXYZABCDE
G   GHIJKLMNOPQRSTUVWXYZABCDEF
H   HIJKLMNOPQRSTUVWXYZABCDEFG
I   IJKLMNOPQRSTUVWXYZABCDEFGH
J   JKLMNOPQRSTUVWXYZABCDEFGHI
K   KLMNOPQRSTUVWXYZABCDEFGHIJ
L   LMNOPQRSTUVWXYZABCDEFGHIJK
M   MNOPQRSTUVWXYZABCDEFGHIJKL
N   NOPQRSTUVWXYZABCDEFGHIJKLM
O   OPQRSTUVWXYZABCDEFGHIJKLMN
P   PQRSTUVWXYZABCDEFGHIJKLMNO
Q   QRSTUVWXYZABCDEFGHIJKLMNOP
R   RSTUVWXYZABCDEFGHIJKLMNOPQ
S   STUVWXYZABCDEFGHIJKLMNOPQR
T   TUVWXYZABCDEFGHIJKLMNOPQRS
U   UVWXYZABCDEFGHIJKLMNOPQRST
V   VWXYZABCDEFGHIJKLMNOPQRSTU
W   WXYZABCDEFGHIJKLMNOPQRSTUV
X   XYZABCDEFGHIJKLMNOPQRSTUVW
Y   YZABCDEFGHIJKLMNOPQRSTUVWX
Z   ZABCDEFGHIJKLMNOPQRSTUVWXY

Example
• write the plaintext out
• write the keyword repeated above it
• use each key letter as a Caesar cipher key
• encrypt the corresponding plaintext letter
• e.g., using the keyword deceptive:

    key:    deceptivedeceptivedeceptive
    plain:  wearediscoveredsaveyourself
    cipher: ZICVTWQNGRZGVTWAVZHCQYGLMGJ


Security of Vigenère Ciphers

• Its strength lies in the fact that each plaintext letter has multiple possible ciphertext letters
  - Letter frequencies are obscured (but not totally lost)
• Breaking Vigenère
  - If we need to decide whether the text was encrypted with a monoalphabetic cipher or with Vigenère:
    • Start with the letter frequencies of the ciphertext
    • See whether it “looks” monoalphabetic: the frequency profile should then be that of letters in English texts (only assigned to different letters)
    • If not, then it is likely Vigenère



Breaking Vigenère: the Kasiski Method (cryptotext only)

• Method developed by Babbage (1854) / Kasiski (1863)
  - Famous incident with breaking the Zimmermann telegram (Jan 16, 1917)
• We need to find the key word, and for this we first find its length
  - Idea: if the length is N, then the letters at positions 1, N+1, 2N+1, 3N+1, etc. are all encrypted with the same Caesar cipher; the same holds for the letters at positions i, N+i, 2N+i, 3N+i, etc., where i runs from 1 to N
  - Clearly, once we deduce the length of the key word, breaking the system is easy: break N Caesar systems
• Finding the length of the key word
  - If the plaintext starts with “the” (encrypted, say, as “XYZ”) and “the” also occurs starting at position N+1, then the 2nd occurrence of “the” will also be encrypted as “XYZ”
  - Idea: repetitions in the ciphertext give clues to the period
  - Approach: find a piece of ciphertext that is repeated several times (say, at distances 6, 9, 18, 9 from each other)
  - If the repetitions really come from the same piece of plaintext, then the length of the key word is a divisor of all those distances (in our example, the length of the key word must be 3)

Example (the ciphertext piece VTW is repeated at distance 9, the length of the key word)
    plain:  wearediscoveredsaveyourself
    key:    deceptivedeceptivedeceptive
    cipher: ZICVTWQNGRZGVTWAVZHCQYGLMGJ
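
The search for repetitions can be sketched as follows. The function names are illustrative, and taking the greatest common divisor of all distances is a simplification: in practice some repetitions are accidental and have to be discarded before the divisors are compared.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeated_ngram_distances(ciphertext: str, n: int = 3) -> dict[str, list[int]]:
    """Map each n-gram occurring more than once to the distances between
    its consecutive occurrences."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    return {
        gram: [b - a for a, b in zip(pos, pos[1:])]
        for gram, pos in positions.items()
        if len(pos) > 1
    }


def candidate_key_length(distances: list[int]) -> int:
    """The key length should divide every distance, hence also their gcd."""
    return reduce(gcd, distances)


def caesar_columns(ciphertext: str, length: int) -> list[str]:
    """Split the ciphertext into `length` streams, each a plain Caesar cipher."""
    return [ciphertext[i::length] for i in range(length)]


print(repeated_ngram_distances("ZICVTWQNGRZGVTWAVZHCQYGLMGJ"))  # {'VTW': [9]}
print(candidate_key_length([6, 9, 18, 9]))                      # 3
```

Once a candidate length N is found, each of the N strings returned by `caesar_columns` can be attacked as an ordinary Caesar cipher.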



Improvement on Vigenère: autokey system

• If the key were as long as the message, then the system would be protected against the previous attack
• Vigenère proposed the autokey cipher
  - The keyword is followed by the message itself (see the example below)
• Decryption
  - Knowing the keyword, one can recover the first few letters of the plaintext
  - These letters serve in turn as the key for the next part of the message, and so on
• Weakness: plaintext and key share the same statistical distribution of letters
  - The system thus still has frequency characteristics to attack and can be rather easily defeated
• Example: the key is deceptive

    plaintext:  wearediscoveredsaveyourself
    key:        deceptivewearediscoveredsav
    ciphertext: ZICVTWQNGKZEIIGASXSTSLVVWLA
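
A possible sketch of the autokey system, assuming a lower-case plaintext without blanks. The function names are illustrative. Note how decryption has to proceed letter by letter, since each recovered plaintext letter becomes key material for a later position.

```python
A = ord("a")


def autokey_encrypt(plaintext: str, keyword: str) -> str:
    """Vigenère encryption with the running key keyword + plaintext."""
    key = (keyword + plaintext)[:len(plaintext)]
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plaintext, key)
    ).upper()


def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Recover the plaintext letter by letter, extending the key on the way."""
    key = list(keyword)
    plain = []
    for i, c in enumerate(ciphertext.lower()):
        p = chr((ord(c) - ord(key[i])) % 26 + A)
        plain.append(p)
        key.append(p)  # each recovered letter becomes key material
    return "".join(plain)


print(autokey_encrypt("wearediscoveredsaveyourself", "deceptive"))
# ZICVTWQNGKZEIIGASXSTSLVVWLA
```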



One-Time pad

• The idea of the autokey system can be extended to create an unbreakable system: the one-time pad
• Idea: use a (truly) random key as long as the plaintext
• It is unbreakable since the ciphertext bears no statistical relationship to the plaintext
• Moreover, for any plaintext and any ciphertext of the same length there exists a key mapping one to the other
  - Thus, a ciphertext can be decrypted to any plaintext of the same length
  - The cryptanalyst is in an impossible situation
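
The claim that a key exists for every (plaintext, ciphertext) pair can be checked with a small letter-based sketch. The function names and the two sample messages are made up for illustration. Python's `secrets` module draws from the operating system's cryptographic random source, which stands in here for the truly random key the scheme requires; the sketch also assumes that message and key have the same length and use only the letters a to z.

```python
import secrets
import string

LETTERS = string.ascii_lowercase


def random_key(length: int) -> str:
    """Draw a key of the given length from the OS randomness source."""
    return "".join(secrets.choice(LETTERS) for _ in range(length))


def otp_encrypt(plaintext: str, key: str) -> str:
    return "".join(LETTERS[(LETTERS.index(p) + LETTERS.index(k)) % 26]
                   for p, k in zip(plaintext, key))


def otp_decrypt(ciphertext: str, key: str) -> str:
    return "".join(LETTERS[(LETTERS.index(c) - LETTERS.index(k)) % 26]
                   for c, k in zip(ciphertext, key))


def key_mapping(plaintext: str, ciphertext: str) -> str:
    """The unique key that encrypts plaintext to ciphertext (same length)."""
    return otp_decrypt(ciphertext, plaintext)


key = random_key(12)
ciphertext = otp_encrypt("attackatdawn", key)
fake_key = key_mapping("defendatnoon", ciphertext)
print(otp_decrypt(ciphertext, key))       # attackatdawn
print(otp_decrypt(ciphertext, fake_key))  # defendatnoon
```

Both decryptions are equally valid from the cryptanalyst's point of view, which is exactly why the ciphertext alone reveals nothing about the plaintext.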



Security of the one-time pad

• The security is entirely given by the randomness of the key
  - If the key is truly random, then the ciphertext is random
  - A key can only be used once if the cryptanalyst is to be kept in the “dark”
• Problems with this “perfect” cryptosystem
  - Making large quantities of truly random characters is a significant task
  - Key distribution is enormously difficult: for any message to be sent, a key of equal length must be available to both parties



Other technique of encryption: transpositions

• So far we have considered substitutions to hide the plaintext: each letter is mapped to a letter according to some substitution
• Different idea: perform some sort of permutation on the plaintext letters
  - Hide the message by rearranging the letter order without altering the actual letters used
• The simplest such technique: the rail fence technique



Rail Fence cipher

• Idea: write the plaintext letters diagonally over a number of rows, then read off the cipher row by row
• E.g., with a rail fence of depth 2, to encrypt the text “meet me after the toga party”, write the message out as:
    m e m a t r h t g p r y
     e t e f e t e o a a t
• The ciphertext is read from the above row by row:
    MEMATRHTGPRYETEFETEOAAT
• Attack: this is easily recognized because it has the same letter frequency distribution as the original text
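
A minimal sketch of the rail fence technique for an arbitrary depth. The function name is illustrative; for depths above 2 the sketch writes the letters in a zigzag (down, then up again), which is one common reading of "diagonally" and an assumption here.

```python
def rail_fence_encrypt(plaintext: str, depth: int = 2) -> str:
    """Write the letters in a zigzag over `depth` rows, then read row by row."""
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    if depth < 2:
        return "".join(letters)
    rows = [[] for _ in range(depth)]
    row, step = 0, 1
    for ch in letters:
        rows[row].append(ch)
        if row == 0:
            step = 1            # at the top rail: move downwards
        elif row == depth - 1:
            step = -1           # at the bottom rail: move upwards
        row += step
    return "".join("".join(r) for r in rows)


print(rail_fence_encrypt("meet me after the toga party", depth=2))
# MEMATRHTGPRYETEFETEOAAT
```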



Row transposition ciphers

• More complex scheme: row transposition
  - Write the letters of the message out in rows over a specified number of columns
  - Read off the cryptotext column by column, with the columns permuted according to some key
• Example: “attack postponed until two am” with key 4312567: first read the column marked by 1, then the one marked by 2, etc. (the last row is padded with x, y, z)

    Key:        4 3 1 2 5 6 7
    Plaintext:  a t t a c k p
                o s t p o n e
                d u n t i l t
                w o a m x y z
    Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ

• If we number the letters in the plaintext from 1 to 28, then the result of the first encryption is the following permutation of the plaintext letters:
    03 10 17 24 04 11 18 25 02 09 16 23 01 08 15 22 05 12 19 26 06 13 20 27 07 14 21 28
  - Note the regularity of that sequence!
  - Easily recognized!
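
The scheme can be sketched as follows. The function names are illustrative. The sketch assumes a key of at most 9 distinct digits given as a string, and pads the last row with x, y, z to mirror the example.

```python
def row_transposition_encrypt(plaintext: str, key: str, pad: str = "xyz") -> str:
    """Write the text in rows under the key digits, then read the columns
    in the order given by the key (the column marked 1 first)."""
    letters = [ch for ch in plaintext.lower() if ch.isalpha()]
    width = len(key)
    i = 0
    while len(letters) % width:          # pad the last row
        letters.append(pad[i % len(pad)])
        i += 1
    rows = [letters[r:r + width] for r in range(0, len(letters), width)]
    order = sorted(range(width), key=lambda col: key[col])
    return "".join(row[col] for col in order for row in rows).upper()


def read_order(length: int, key: str) -> list[int]:
    """1-based plaintext positions in the order they appear in the ciphertext."""
    width = len(key)
    order = sorted(range(width), key=lambda col: key[col])
    return [r + col + 1 for col in order for r in range(0, length, width)]


print(row_transposition_encrypt("attack postponed until two am", "4312567"))
# TTNAAPTMTSUOAODWCOIXKNLYPETZ
print(read_order(28, "4312567")[:8])
# [3, 10, 17, 24, 4, 11, 18, 25]
```

The constant step of 7 inside each group of four positions is the regularity that makes a single transposition easy to recognize.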



Iterating the encryption makes it more secure

• Idea: use the same scheme once more to increase security

    Key:    4 3 1 2 5 6 7
    Input:  T T N A A P T
            M T S U O A O
            D W C O I X K
            N L Y P E T Z
    Output: NSCYAUOPTTWLTMDNAOIEPAXTTOKZ

• After the second transposition we get the following sequence of letters:
    17 09 05 27 24 16 12 07 10 02 22 20 03 25 15 13 04 23 19 14 11 01 26 21 18 08 06 28
  - This is far less structured and so more difficult to cryptanalyze
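
The double transposition can be replayed with a short sketch that applies the same column read-out twice, once to the letters and once to the position numbers 1 to 28. The helper name `transpose` is illustrative, and the input is assumed to be already padded to a multiple of the key length.

```python
def transpose(items: list, key: str) -> list:
    """One pass of row transposition on a list whose length is a
    multiple of len(key): read the columns in key order."""
    width = len(key)
    order = sorted(range(width), key=lambda col: key[col])
    return [items[r + col] for col in order for r in range(0, len(items), width)]


KEY = "4312567"
first = transpose(list("attackpostponeduntiltwoamxyz"), KEY)
second = transpose(first, KEY)
print("".join(second).upper())  # NSCYAUOPTTWLTMDNAOIEPAXTTOKZ

# Track where each plaintext position (1..28) ends up after two passes.
positions = transpose(transpose(list(range(1, 29)), KEY), KEY)
print(" ".join(f"{p:02d}" for p in positions))
```

Since a transposition only rearranges letters, every position from 1 to 28 must appear exactly once in the printed sequence, which is a quick sanity check on any hand-computed version of it.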



Product Ciphers

• Ciphers using only substitutions or only transpositions are not secure because of language characteristics
• Idea: using several ciphers in succession increases security
• However:
  - two substitutions only make another (more complex?) substitution
  - two transpositions only make another (more complex?) transposition
  - a substitution followed by a transposition makes a new, much harder cipher
• This is the bridge from classical to modern ciphers
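
The first point under "However" is easy to see for the simplest substitution, the Caesar cipher: applying two shifts in succession is the same as applying a single shift by their sum. A tiny sketch, with an illustrative helper `caesar` that assumes upper-case letters only:

```python
def caesar(text: str, k: int) -> str:
    """Caesar shift by k on a string of upper-case letters A-Z."""
    return "".join(chr((ord(ch) - 65 + k) % 26 + 65) for ch in text)


message = "ATTACKPOSTPONED"
twice = caesar(caesar(message, 3), 5)
once = caesar(message, 8)
print(twice == once)  # True: two substitutions collapse into one
```

The same collapse happens for general monoalphabetic substitutions (the composition of two permutations of the alphabet is again a permutation), so stacking them adds no key material an attacker has to recover separately.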



Rotor Machines

• Before modern ciphers, rotor machines were the most common product cipher
• Widely used in WW2
  - German Enigma, Allied Hagelin, Japanese Purple
• Implemented a very complex, varying substitution cipher
• Principle: the machine has a set of independently rotating cylinders through which electrical impulses flow
  - Each cylinder has 26 input pins and 26 output pins, with internal wiring that connects each input pin to a unique, fixed output pin (one cylinder thus defines a monoalphabetic substitution cipher)
  - The output pins of one cylinder are connected to the input pins of the next cylinder
  - After each keystroke, the last cylinder rotates one position and the others remain still
  - After a complete rotation of the last cylinder (26 keystrokes), the cylinder before it rotates one position, etc.
• 3 cylinders have a period of 26^3 = 17 576
• 4 cylinders have a period of 26^4 = 456 976
• 5 cylinders have a period of 26^5 = 11 881 376
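
The stepping rule and the resulting period can be illustrated with a toy model. This is only a sketch of the principle described above, not a model of any real machine: the class name `RotorMachine`, the randomly generated wirings and the fixed seed are all assumptions made for illustration.

```python
import random
import string

ALPHABET = string.ascii_uppercase


class RotorMachine:
    """Toy rotor machine: n cylinders, the last one stepping on every keystroke."""

    def __init__(self, n_rotors: int, seed: int = 0):
        rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
        # Each wiring is a permutation of 0..25, i.e. a monoalphabetic substitution.
        self.wirings = [rng.sample(range(26), 26) for _ in range(n_rotors)]
        self.positions = [0] * n_rotors

    def step(self) -> None:
        """Odometer-style stepping, starting from the last cylinder."""
        for i in reversed(range(len(self.positions))):
            self.positions[i] = (self.positions[i] + 1) % 26
            if self.positions[i] != 0:  # no full rotation yet: stop carrying
                break

    def press(self, letter: str) -> str:
        """Pass one letter through all cylinders, then advance them."""
        x = ALPHABET.index(letter)
        for wiring, pos in zip(self.wirings, self.positions):
            x = (wiring[(x + pos) % 26] - pos) % 26
        self.step()
        return ALPHABET[x]


def period(n_rotors: int) -> int:
    """Count keystrokes until the cylinders return to their start position."""
    machine = RotorMachine(n_rotors)
    start, count = list(machine.positions), 0
    while True:
        machine.step()
        count += 1
        if machine.positions == start:
            return count


print(period(3))  # 17576
```

Until the period is exhausted, every keystroke is encrypted under a different combination of cylinder positions, which is what makes the substitution "varying".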



The Enigma machine (pictures from Wikipedia)



