Cryptography
Prepared by Kristian Guillaumier
Department of Intelligent Computer Systems
Cryptography
Note: All content in this section from An Introduction to Cryptography, Linz
and corresponding Wikipedia articles (references checked).
Cryptography
Cryptographer: someone who studies
cryptography.
Cryptanalysis: the mathematical study of
defeating cryptographic methods.
Steganography: concealing the very existence of a
message in the first place
(e.g. false-bottomed suitcase, invisible ink)
Substitution cipher.
Shift cipher.
Monographic substitution.
Shift (of 3) as in original version.
Caesar Cipher
Implementing the algorithm using modulo
arithmetic.
Assign a number to each letter such that A=0, B=1,
C=2, and so on.
Know the size of the alphabet. In English, this
would be 26.
To encrypt a letter x by some shift n:
En(x) = (x+n) % 26.
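The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are made up for this example, and non-letter characters are simply passed through unchanged (an assumption, the slides do not say how to treat them).

```python
import string

ALPHABET = string.ascii_uppercase  # A=0, B=1, ..., Z=25

def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Apply E_n(x) = (x + n) % 26 to every letter; keep other characters."""
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Decryption is encryption with the opposite shift."""
    return caesar_encrypt(ciphertext, -shift)

print(caesar_encrypt("XYZ ABC", 3))  # ABC DEF
```

Because decryption is just another shift, trying every possible shift is enough to break the cipher.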
Example
PT Letter   PT Number   Function (shift 3)   CT Number   CT Letter
A           0           (0+3)%26             3           D
B           1           (1+3)%26             4           E
C           2           (2+3)%26             5           F
X           23          (23+3)%26            0           A
Y           24          (24+3)%26            1           B
Z           25          (25+3)%26            2           C
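Composing two shifts gives another shift: $E_n(E_m(x)) = E_{n+m}(x)$.
The Caesar shift (CS) can be broken trivially: there are only 25 non-trivial shifts to try.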
Website
US Army Field Manual for Basic Cryptanalysis:
http://www.umich.edu/~umich/fm-34-40-2/
Frequency Analysis
The frequency of occurrence of letters (or groups of
letters as in a digraph) in a language.
Example: E, T, A and O most common in English, while
Z, Q and X are least common.
TH, ER, ON, and AN are the most common digraphs.
SS , EE , TT , and FF are the most common repeats.
In some cryptographic systems these patterns in the
plain text manifest themselves in the cypher text as
well (as in the Caesar cypher) and can be exploited.
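Counting letters and digraphs is easy to automate. The sketch below is one possible way to do it; the helper name ngram_counts is hypothetical, and it assumes that spaces and punctuation are dropped before counting, so n-grams may span word boundaries.

```python
from collections import Counter

def ngram_counts(text: str, n: int = 1) -> Counter:
    """Count overlapping n-grams of letters (n=1 letters, n=2 digraphs, ...)."""
    letters = [c for c in text.upper() if c.isalpha()]
    return Counter("".join(letters[i:i + n]) for i in range(len(letters) - n + 1))

sample = "THE THEORY THAT THE EARTH IS FLAT"
print(ngram_counts(sample, 1).most_common(3))  # most frequent single letters
print(ngram_counts(sample, 2).most_common(3))  # most frequent digraphs
```

Comparing these counts with a frequency chart for the assumed language gives the first guesses for a substitution key.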
Example
(see full at http://en.wikipedia.org/wiki/Frequency_analysis)
LIVITCSWPIYVEWHEVSRIQMXLEYVEOIEWHRXEXIPFE
MVEWHKVSTYLXZIXLIKIIXPIJVSZEYPERRGERIM
WQLMGLMXQERIWGPSRIHMXQEREKIETXMJTPRGEVEKE
ITREWHEXXLEXXMZITWAWSQWXSWEXTVEPMRXRSJ
GSTVRIEYVIEXCVMUIMWERGMIWXMJMGCSMWXSJOMIQ
XLIVIQIVIXQSVSTWHKPEGARCSXRWIEVSWIIBXV
IZMXFSJXLIKEGAEWHEPSWYSWIWIEVXLISXLIVXLIR
GEPIRQIVIIBGIIHMWYPFLEVHEWHYPSRRFQMXLE
PPXLIECCIEVEWGISJKTVWMRLIHYSPHXLIQIMYLXSJ
XLIMWRIGXQEROIVFVIZEVAEKPIEWHXEAMWYEPP
XLMWYRMWXSGSWRMHIVEXMSWMGSTPHLEVHPFKPEZIN
TCMXIVJSVLMRSCMWMSWVIRCIGXMWYMX
Example
(see full at http://en.wikipedia.org/wiki/Frequency_analysis)
Convention: we use upper case for cypher text characters and lower
for plain text characters.
In cryptogram:
I is the most common single letter.
XL most common bigram.
XLI is the most common trigram.
In English:
e is the most common single letter.
th is the most common bigram.
the is the most common trigram.
Example
(see full at http://en.wikipedia.org/wiki/Frequency_analysis)
In cryptogram:
E is the second most common single letter.
In English:
t is the second most common single letter but t is
already accounted for in our previous guesses so we
use a which is third most common single letter.
heVeTCSWPeYVaWHaVSReQMthaYVaOeaWHRtatePFa
MVaWHKVSTYhtZetheKeetPeJVSZaYPaRRGaReM
WQhMGhMtQaReWGPSReHMtQaRaKeaTtMJTPRGaVaKa
eTRaWHatthattMZeTWAWSQWtSWatTVaPMRtRSJ
GSTVReaYVeatCVMUeMWaRGMeWtMJMGCSMWtSJOMeQ
theVeQeVetQSVSTWHKPaGARCStRWeaVSWeeBtV
eZMtFSJtheKaGAaWHaPSWYSWeWeaVtheStheVtheR
GaPeRQeVeeBGeeHMWYPFhaVHaWHYPSRRFQMtha
PPtheaCCeaVaWGeSJKTVWMRheHYSPHtheQeMYhtSJ
theMWReGtQaROeVFVeZaVAaKPeaWHtaAMWYaPP
thMWYRMWtSGSWRMHeVatMSWMGSTPHhaVHPFKPaZeN
TCMteVJSVhMRSCMWMSWVeRCeGtMWYMt
heVe == here?? gives V~r
Rtate == state?? gives R~s
atthattMZe == at that time?? gives M~i, Z~m
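Each round of guessing amounts to applying a partial key to the cryptogram. A minimal sketch follows, using the slides' convention of upper case for cypher text and lower case for plain text; the function name apply_partial_key is hypothetical.

```python
def apply_partial_key(ciphertext: str, key: dict[str, str]) -> str:
    """Replace the letters we have guessed; leave the rest in upper case."""
    return "".join(key.get(ch, ch) for ch in ciphertext)

first_line = "LIVITCSWPIYVEWHEVSRIQMXLEYVEOIEWHRXEXIPFE"

# Guesses so far: I~e, X~t, L~h, E~a
guesses = {"I": "e", "X": "t", "L": "h", "E": "a"}
print(apply_partial_key(first_line, guesses))
# heVeTCSWPeYVaWHaVSReQMthaYVaOeaWHRtatePFa

# Adding V~r, R~s, M~i, Z~m
guesses.update({"V": "r", "R": "s", "M": "i", "Z": "m"})
print(apply_partial_key(first_line, guesses))
# hereTCSWPeYraWHarSseQithaYraOeaWHstatePFa
```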
hereTCSWPeYraWHarSseQithaYraOeaWHstatePFa
iraWHKrSTYhtmetheKeetPeJrSmaYPassGasei
WQhiGhitQaseWGPSseHitQasaKeaTtiJTPsGaraKa
eTsaWHatthattimeTWAWSQWtSWatTraPistsSJ
GSTrseaYreatCriUeiWasGieWtiJiGCSiWtSJOieQ
thereQeretQSrSTWHKPaGAsCStsWearSWeeBtr
emitFSJtheKaGAaWHaPSWYSWeWeartheStherthes
GaPesQereeBGeeHiWYPFharHaWHYPSssFQitha
PPtheaCCearaWGeSJKTrWisheHYSPHtheQeiYhtSJ
theiWseGtQasOerFremarAaKPeaWHtaAiWYaPP
thiWYsiWtSGSWsiHeratiSWiGSTPHharHPFKPameN
TCiterJSrhisSCiWiSWresCeGtiWYit
remarA == remark?? gives A~k
hereTCSWPeYraWHarSseQithaYraOeaWHstatePFa
iraWHKrSTYhtmetheKeetPeJrSmaYPassGasei
WQhiGhitQaseWGPSseHitQasaKeaTtiJTPsGaraKa
eTsaWHatthattimeTWkWSQWtSWatTraPistsSJ
GSTrseaYreatCriUeiWasGieWtiJiGCSiWtSJOieQ
thereQeretQSrSTWHKPaGksCStsWearSWeeBtr
emitFSJtheKaGkaWHaPSWYSWeWeartheStherthes
GaPesQereeBGeeHiWYPFharHaWHYPSssFQitha
PPtheaCCearaWGeSJKTrWisheHYSPHtheQeiYhtSJ
theiWseGtQasOerFremarkaKPeaWHtakiWYaPP
thiWYsiWtSGSWsiHeratiSWiGSTPHharHPFKPameN
TCiterJSrhisSCiWiSWresCeGtiWYit
Possible words: with? (Qith), which? (QhiGh)
Example
(see full at http://en.wikipedia.org/wiki/Frequency_analysis)
hereuponlegrandarosewithagraveandstatelya
irandbroughtmethebeetlefromaglasscasei
nwhichitwasencloseditwasabeautifulscaraba
eusandatthattimeunknowntonaturalists
Hereupon Legrand arose, with a grave and stately air, and brought me the beetle from a glass case in which it was enclosed. It was a beautiful scarabaeus, and, at that time, unknown to naturalists - of course a great prize in a scientific point
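Vigenère cipher
Key:         cryptocryptocrypt
Plain text:  whatanicedaytoday
Add mod 26 (letters numbered a=1, ..., z=26):
Cypher text: zzzjucludtunwgcqs
Interesting property: the first encryption of plain text a is z and the second encryption
of a is u. The substitution is not fixed.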
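The addition of key and plain text letters can be sketched as follows. The function name and the offset parameter are assumptions made for this illustration: offset=0 gives the usual a=0 numbering, while offset=1 corresponds to numbering the letters a=1, ..., z=26, which is the numbering under which plain text a encrypts to z and u in the slide's example. The input is assumed to contain letters only.

```python
def vigenere(text: str, key: str, offset: int = 0, decrypt: bool = False) -> str:
    """Add (or subtract) the repeating key to the text, letter by letter, mod 26."""
    out = []
    for i, ch in enumerate(text.lower()):
        p = ord(ch) - ord("a")
        k = ord(key[i % len(key)].lower()) - ord("a") + offset
        shifted = p - k if decrypt else p + k
        out.append(chr(shifted % 26 + ord("a")))
    return "".join(out)

ct = vigenere("whatanicedaytoday", "crypto", offset=1)
print(ct)                                                 # zzzjucludtunwgcqs
print(vigenere(ct, "crypto", offset=1, decrypt=True))     # whatanicedaytoday
```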
Polygraphic ciphers
As opposed to monographic ciphers.
Digraph substitution.
Monograph (a letter), Digraph (a pair of
letters).
Frequency analysis is much harder:
We'd need the frequency analysis of 600 digraphs
(in a Playfair cypher) rather than 26 monographs.
Explanation of 600 on next slide.
Number of digraphs
Consider an alphabet with 4 characters a, b, c and d.
How many pair combinations could we have? $4^2 = 16$: aa, ab, ac, ad, ba, bb, bc, bd, ..., dd.
What if we don't allow duplicates? We have to remove 4 possibilities: aa, bb, cc, dd.
So the number of pairs without duplicates in an alphabet containing n characters is $n^2 - n$.
What if we combine two characters into one to give us: a, b, c/d?
How many pairs do we get in an n-letter alphabet if we omit/combine m letters? $(n-m)^2$.
Again, if we wish to omit duplicate pairs we'd get: $(n-m)^2 - (n-m)$.
In Playfair, we combine 2 letters of the 26-letter alphabet into one, so we get: $(26-1)^2 - (26-1) = 25^2 - 25 = 600$.
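The counting argument can be cross-checked by brute force on the 4-letter alphabet and then evaluated for Playfair. The function name digraph_count is hypothetical.

```python
from itertools import product

def digraph_count(n: int, merged: int = 0, allow_doubles: bool = False) -> int:
    """Number of digraphs over n letters after omitting/combining `merged` of them."""
    size = n - merged
    return size * size if allow_doubles else size * size - size

# Enumerate all pairs over a, b, c, d and compare with the formulas.
pairs = ["".join(p) for p in product("abcd", repeat=2)]
no_doubles = [p for p in pairs if p[0] != p[1]]
print(len(pairs), len(no_doubles))       # 16 12
print(digraph_count(26, merged=1))       # 600
```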
A  Z  I  WX D
E  U  T  G  Y
O  N  K  Q  M
H  F  J  L  S
V  R  P  B  C
Rule 1
If 2 letters are on the same row, their cypher text is
immediately to their right (use wrap-around).
E.g. VC is RV
Rule 2
If two letters are on the same column the cypher text is the
letters below (careful to apply wrap-around too if necessary).
E.g. ZF is UR.
Rule 3
If the two letters are on opposite corners of a rectangle (i.e. on neither the same row nor the
same column), each is replaced by the letter on the other corner of the rectangle that lies on
its own row.
E.g. UL = GF and SZ = FD.
Rule 4+5
If the same letter appears as a pair in the plain
text, separate them with a Z before
encrypting.
If a single letter appears at the end of the
plain text when encrypting (there is an odd
number of letters to encrypt), pad with a Z.
Example
BUTTON
BUTZTON
BU TZ TO N
BU TZ TO NZ
Becomes
RG UI EK FU
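The five rules can be put together in a short sketch. The square layout below is an assumption: it is the arrangement under which the rule examples (VC, ZF, UL, SZ) and BUTTON all come out as stated. The function names are hypothetical, the shared WX cell is arbitrarily emitted as W, and edge cases such as a double Z in the plain text are not handled.

```python
SQUARE = [
    ["A", "Z", "I", "WX", "D"],
    ["E", "U", "T", "G", "Y"],
    ["O", "N", "K", "Q", "M"],
    ["H", "F", "J", "L", "S"],
    ["V", "R", "P", "B", "C"],
]
# Position of every letter; W and X share one cell.
POS = {ch: (r, c) for r, row in enumerate(SQUARE)
       for c, entry in enumerate(row) for ch in entry}

def cell(r: int, c: int) -> str:
    """Letter at (r, c) with wrap-around; the WX cell is emitted as W."""
    return SQUARE[r % 5][c % 5][0]

def prepare(plaintext: str) -> list[tuple[str, str]]:
    letters = [ch for ch in plaintext.upper() if ch in POS]
    spaced = []
    for ch in letters:
        if spaced and spaced[-1] == ch:
            spaced.append("Z")          # rule 4: separate a doubled letter
        spaced.append(ch)
    if len(spaced) % 2:
        spaced.append("Z")              # rule 5: pad an odd-length text
    return [(spaced[i], spaced[i + 1]) for i in range(0, len(spaced), 2)]

def encrypt_pair(a: str, b: str) -> str:
    (ra, ca), (rb, cb) = POS[a], POS[b]
    if ra == rb:                        # rule 1: same row, take the letters to the right
        return cell(ra, ca + 1) + cell(rb, cb + 1)
    if ca == cb:                        # rule 2: same column, take the letters below
        return cell(ra + 1, ca) + cell(rb + 1, cb)
    return cell(ra, cb) + cell(rb, ca)  # rule 3: other corners, same rows

def encrypt(plaintext: str) -> str:
    return "".join(encrypt_pair(a, b) for a, b in prepare(plaintext))

print(encrypt("BUTTON"))  # RGUIEKFU
```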
Cryptanalysis
The cypher text will never contain double letter digraphs
(pairs). If this observation is made over a suitably long stream
of cypher text (to make it statistically significant) we can infer
that Playfair is used.
If both the plain text and the cypher text are available then
finding the key is straightforward (assuming we have enough
text).
Try out: BU TZ TO NZ CT UT VK SU
Cryptanalysis
In Playfair, a digraph and its reverse (e.g. AB and
BA) will decrypt to plain text in reverse (e.g. RE
and ER).
In English, there are many words which contain
these reversed digraphs such as REceivER and
DEpartED.
Identifying nearby reversed digraphs in the
ciphertext and matching the pattern to a list of
known plaintext words containing the pattern is
an easy way to generate possible plaintext strings
with which to begin constructing the key.
Cryptanalysis
Random-restart hill climbing.
Start with random square of letters.
Create mutation operations (swap letters,
swap rows/columns, reflecting).
Score the obtained plaintext with some fitness
function, e.g. comparing digraphs to a
frequency chart.
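A rough sketch of the search loop is given below. Everything here is illustrative: the function names are made up, and the score argument stands for whatever fitness function is chosen (in a real attack it would decrypt the cypher text with the candidate square and compare digraph frequencies against a chart, which is not shown).

```python
import random

def mutate(square, rng):
    """Return a mutated copy of a 5x5 square (a list of 5 lists of letters)."""
    s = [row[:] for row in square]
    op = rng.choice(["swap_letters", "swap_rows", "swap_cols", "reflect"])
    if op == "swap_letters":
        r1, c1, r2, c2 = (rng.randrange(5) for _ in range(4))
        s[r1][c1], s[r2][c2] = s[r2][c2], s[r1][c1]
    elif op == "swap_rows":
        i, j = rng.sample(range(5), 2)
        s[i], s[j] = s[j], s[i]
    elif op == "swap_cols":
        i, j = rng.sample(range(5), 2)
        for row in s:
            row[i], row[j] = row[j], row[i]
    else:                                   # reflect left-right
        s = [row[::-1] for row in s]
    return s

def hill_climb(score, start, steps, rng):
    """Keep a mutation only if it improves the score."""
    best, best_score = start, score(start)
    for _ in range(steps):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score

def random_restart(score, letters, restarts, steps, seed=0):
    """Run hill climbing from several random squares and keep the best result."""
    rng = random.Random(seed)
    results = []
    for _ in range(restarts):
        shuffled = rng.sample(letters, len(letters))
        start = [shuffled[i:i + 5] for i in range(0, 25, 5)]
        results.append(hill_climb(score, start, steps, rng))
    return max(results, key=lambda result: result[1])
```

Restarting from several random squares reduces the chance of ending in a poor local maximum.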
Cryptanalysis: an example
Taken from a forum challenge http://s13.zetaboards.com/Crypto/topic/123730/1/
Challenge
TM
EK
BH
AV
CN
RU
ES
NX
TV
AN
HX
RX
IS
BP
LR
GV
KT
IQ
MI
EU
VA
QG
SU
GI
NK
AS
LE
HU
CR
GZ
VO
IS
HV
VA
RE
XE
KH
VA
EU
AS
SF
IR
EW
IC
SF
LE
HB
GZ
XE
EG
NH
VA
BM
CI
KN
TY
VK
NB
AR
HA
HY
HG
AB
GS
TM
BV
LX
BM
GC
IU
MH
SA
NI
VC
AR
RC
XM
VS
VE
BF
BU
IK
EV
KN
IV
ST
NX
BS
The crib
"turkey eating title".
Possible ways to split in pairs:
?t ur ke ye at in gt it le
tu rk ey ea ti ng ti tl e?
Interesting pattern in 2nd option. Let's try that.
tu rk ey ea ti ng ti tl e?
Plain:  tu rk ey ea ti ng ti tl e?
Cipher: ST CN RX MI AS HV AS HB CI
[Figures: step-by-step reconstruction of the Playfair square from the crib pairs, e.g. ey=RX.]
Four-square cipher
[Figure: four 5x5 squares; the upper-left and lower-right ones are plain text squares.]
Encryption
Split the plain text in digraphs:
THE SLICK BIRD
TH ES LI CK BI RD
Find the first letter of the digraph (T in TH) in the upper-left PT matrix.
Find the second letter of the digraph (H in TH) in the lower-right PT matrix.
Encryption
The first cypher letter in the digraph is on the same row as the first plain text letter and the same column of the second plain text letter.
Encryption
The second cypher letter in the digraph is on the same row as the second plain text letter and the same column of the first plain text letter.
th becomes RB
Encryption
The slick bird =
Plain text: TH ES LI CK BI RD
Encrypted:  RB AS JD EH MD SE
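The two lookup rules translate directly into code. The key squares are not readable in this text, so the sketch below makes assumptions: a 25-letter alphabet with Q omitted, and cypher squares built from the keywords EXAMPLE (upper right) and KEYWORD (lower left). These choices happen to reproduce the digraphs quoted in the slides, but they remain a guess; the function names and the padding letter are also hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPRSTUVWXYZ"  # 25 letters, Q omitted (assumption)

def keyed_square(keyword: str = "") -> list[list[str]]:
    """5x5 square: keyword letters first (no repeats), then the rest of the alphabet."""
    seen = []
    for ch in keyword.upper() + ALPHABET:
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    return [seen[i:i + 5] for i in range(0, 25, 5)]

PLAIN = keyed_square()                  # upper-left and lower-right squares
UPPER_RIGHT = keyed_square("EXAMPLE")   # assumed keyword
LOWER_LEFT = keyed_square("KEYWORD")    # assumed keyword
POS = {PLAIN[r][c]: (r, c) for r in range(5) for c in range(5)}

def encrypt(plaintext: str) -> str:
    letters = [ch for ch in plaintext.upper() if ch in POS]
    if len(letters) % 2:
        letters.append("X")             # padding letter: an assumption
    out = []
    for a, b in zip(letters[0::2], letters[1::2]):
        (ra, ca), (rb, cb) = POS[a], POS[b]
        # first letter: row of a, column of b; second letter: row of b, column of a
        out.append(UPPER_RIGHT[ra][cb] + LOWER_LEFT[rb][ca])
    return "".join(out)

print(encrypt("THE SLICK BIRD"))  # RBASJDEHMDSE
```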
Cryptanalysis
Similar ideas to Playfair if both plain text and
cipher text are known.
Consider what happens when we encrypt
MI/LI/TA/RY, for MI we get JA and for LI
we get JD.
Notice the repetition of J in the cipher text.
This happens because M and L are on the
same row in the top left plain text square and the
I is the same. This is an exploitable pattern.
Cryptanalysis
Difference from Playfair:
Four-Square will not show reversed cypher
text digraphs for reversed plain text digraphs.
Consider DEpartED becomes
PWnksmMO
PW and MO are not reversals of each other.
This makes four-square stronger than Playfair.
Polybius square
Left hand/right hand raising of torches.
Knock-knock.
ADFGVX cipher
The ADFGVX cypher (originally ADFGX; the V was added later to make
the square 6x6, which includes digits and shortens transmissions)
starts off with a Polybius square indexed by ADFGVX.
Each letter in ADFGVX sounds very distinct in Morse code.
ADFGVX cipher
SEND 20 is fractionated as:
S  E  N  D  2  0
FF FG XA GF DX DG
ADFGVX cipher
The British have landed would be enciphered
as XF FX FG AA AG AX XF AX FF FX FX DA GX FG AV
DA XA GF FG GF.
This is the fractionated text or transitional cypher
text.
So far this is a simple one-to-one substitution
(cryptographically useless by itself).
The next step involves a key, e.g. the word
German.
ADFGVX cipher
Create a table with the key (German) as the heading and place all the
characters of the transitional cypher text in it horizontally, filling it row by row.
XF FX FG AA AG AX XF AX FF FX FX DA GX FG AV DA XA GF FG GF
G E R M A N
X F F X F G
A A A G A X
X F A X F F
F X F X D A
G X F G A V
D A X A G F
F G G F
ADFGVX cipher
Now sort the table by the letters in the
keyword (moving the columns with the letter
of the keyword).
A E G M N R
F F X X G F
A A A G X A
F F X X F A
D X F X A F
A X G G V F
G A D A F X
  G F F   G
ADFGVX cipher
Now read the values column wise grouping in, say, 5 letter
blocks (for convenience when reading).
FAFDA GFAFX XAGXA XFGDF XGXXG AFGXF AVFFA AFFXG
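Both stages can be sketched together. The full Polybius square is not recoverable from this text, so the PAIRS table below contains only the cells that appear in the examples above; it is a partial, illustrative table. The function names are hypothetical.

```python
# Only the letter-to-pair cells that appear in the worked examples.
PAIRS = {
    "t": "XF", "h": "FX", "e": "FG", "b": "AA", "r": "AG", "i": "AX",
    "s": "FF", "a": "DA", "v": "GX", "l": "AV", "n": "XA", "d": "GF",
    "2": "DX", "0": "DG",
}

def fractionate(plaintext: str) -> str:
    """Substitution step: each character becomes a pair of ADFGVX letters."""
    return "".join(PAIRS[ch] for ch in plaintext.lower() if ch != " ")

def transpose(text: str, keyword: str) -> str:
    """Fill a table row by row under the keyword, sort the columns by key letter,
    then read the columns top to bottom."""
    n = len(keyword)
    columns = [text[i::n] for i in range(n)]
    order = sorted(range(n), key=lambda i: keyword[i])
    return "".join(columns[i] for i in order)

def adfgvx_encrypt(plaintext: str, keyword: str) -> str:
    return transpose(fractionate(plaintext), keyword)

ct = adfgvx_encrypt("the british have landed", "GERMAN")
print(" ".join(ct[i:i + 5] for i in range(0, len(ct), 5)))
# FAFDA GFAFX XAGXA XFGDF XGXXG AFGXF AVFFA AFFXG
```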
Hint on ADFGVX
If we notice that the ciphertext only consists
of 6 letters and has an even number of letters
then we could assume a 6x6 board and that
we're dealing with digraphs.
Frequency analysis matches (indicates)
plaintext for the language being assumed but
performing it will not give the plaintext. This is
a hint that transposition is used.
MY NAME IS KRIS
FF VV FG AA FF AV DG GG FA GF DG GG
MY NAME IS JOHNNY
FF VV FG AA FF AV DG GG DV FV DF FG FG VV
Sort key
MY NAME IS JOHNNY
MY NAME IS KRIS
Pick the longest ones, say we pick the longest 3 (in fact 3 is the right
guess).
Now remember what the tables were:
Now we can guess the column length.
And now we have the key
Finally
Read row-wise and use frequency analysis to
beat the substitution.
ADFGVX was cryptanalyzed for special cases
by Georges Painvin in 1918.
Cryptanalysis for the general case was found
by William Friedman in 1933.
More info at:
http://www.nku.edu/~christensen/section%2010%20ADFGVX.pdf
What is a cipher?
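A cipher is a pair of efficient algorithms E and
D over the triple ($\mathcal{K}$: key space, $\mathcal{M}$: message
space, $\mathcal{C}$: ciphertext space).
$E: \mathcal{K} \times \mathcal{M} \to \mathcal{C}$
$D: \mathcal{K} \times \mathcal{C} \to \mathcal{M}$
Consistency:
$\forall m \in \mathcal{M}, k \in \mathcal{K}: D(k, E(k, m)) = m$
Efficient means runs in polynomial time.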
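One-time pad: $E(k, m) = k \oplus m$ and $D(k, c) = k \oplus c$.
$D(k, E(k, m)) = D(k, k \oplus m) = k \oplus (k \oplus m) = (k \oplus k) \oplus m = 0 \oplus m = m$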
Note
If I know the cipher text c and the message m, it
is easy to get the key:
$k = m \oplus c$
One time pad is very, very fast.
Problem in practice: the key must be as
long as the message.
If Alice has a secure method to communicate the
key with Bob before sending the message, then
Alice might as well use that method to send the
message.
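The one-time pad and the key-recovery remark fit in a few lines. This is a sketch for illustration only; the helper name xor_bytes is made up, and the key is drawn from Python's secrets module, which is intended for cryptographic randomness.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

message = b"attack at dawn"
key = secrets.token_bytes(len(message))   # the key is as long as the message

ciphertext = xor_bytes(key, message)      # E(k, m) = k XOR m
recovered = xor_bytes(key, ciphertext)    # D(k, c) = k XOR c
leaked_key = xor_bytes(message, ciphertext)  # known m and c give away k

print(recovered == message, leaked_key == key)  # True True
```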
Security of OTP
What is a secure cipher?
Cipher text should not reveal anything about the plaintext.
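A cipher (E,D) over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$ has perfect secrecy (Claude Shannon) if:
$\forall m_0, m_1 \in \mathcal{M}$ s.t. $|m_0| = |m_1|$, and $\forall c \in \mathcal{C}$:
$\Pr[E(k, m_0) = c] = \Pr[E(k, m_1) = c]$, where $k$ is chosen uniformly at random from $\mathcal{K}$.
i.e. for a random key k, the probability of getting a ciphertext c from $m_0$ is
the same as having got it from any other message $m_1$.
In other words, if I only have the ciphertext c, I can never tell if the message
was $m_0$ or $m_1$.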
There is no ciphertext-only attack (other attacks may be possible).
Issues
OTP has long keys.
Is there another cipher that has perfect secrecy?
Shannon has proven that perfect secrecy requires
$|\mathcal{K}| \ge |\mathcal{M}|$; in other words, the length of the key
must be at least the length of the message (bad news).
Perfect secrecy ciphers have practical issues.
The key needs to be truly random (cannot be
predictable).
Many-time pad attacks.
Stream ciphers
The idea is to replace the random key with a
pseudorandom key.
A pseudorandom number generator (PRG) is a
function:
$G: \{0,1\}^s \to \{0,1\}^n$ s.t. $n \gg s$
i.e. a function that takes a value called the seed (say,
128 bits) and gives you a much larger output stream
(e.g. Gb long).
G is deterministic. The only random thing is the
seed.
The output should look random.
Predictable
Many random number generators in
frameworks/libraries are predictable (e.g.
random() in C).
Do not use for crypto.
Use specific cryptographically secure PRGs.
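The same distinction exists in Python, used here only as an illustration: the random module is a Mersenne Twister generator meant for simulations, whereas the secrets module draws from the operating system's cryptographic randomness source. The variable names are arbitrary.

```python
import random
import secrets

# A general-purpose generator is fully determined by its seed:
a = random.Random(1234)
b = random.Random(1234)
same_stream = [a.getrandbits(32) for _ in range(3)] == [b.getrandbits(32) for _ in range(3)]
print(same_stream)  # True: anyone who learns the seed can reproduce the output

# For keys and seeds, use a cryptographically secure source instead:
seed = secrets.token_bytes(16)  # 128 random bits
print(len(seed) * 8)            # 128
```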
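Attacks on an OTP
So in order to make the OTP practical, our cipher
is:
$E(k, m) = m \oplus G(k)$
$D(k, c) = c \oplus G(k)$
If the same pad is used for two messages $m_1$ and $m_2$:
$c_1 \oplus c_2 = (m_1 \oplus G(k)) \oplus (m_2 \oplus G(k))$
$= 0 \oplus m_1 \oplus m_2 = m_1 \oplus m_2$
$m_1$ and $m_2$ can be recovered from $m_1 \oplus m_2$.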
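The cancellation of the pad is easy to demonstrate. In the sketch below, G is only a stand-in for a pseudorandom generator (it expands the key with SHAKE-256 from hashlib); it is an assumption chosen to keep the example self-contained, not a recommendation of a stream cipher. The key and messages are made up.

```python
import hashlib

def G(seed: bytes, n: int) -> bytes:
    """Stand-in PRG: expand the seed to n bytes (illustrative only)."""
    return hashlib.shake_256(seed).digest(n)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(k: bytes, m: bytes) -> bytes:
    return xor(m, G(k, len(m)))          # E(k, m) = m XOR G(k)

k = b"hypothetical key"
m1, m2 = b"attack at dawn", b"retreat at six"
c1, c2 = encrypt(k, m1), encrypt(k, m2)  # same pad used twice

print(xor(c1, c2) == xor(m1, m2))        # True: the pad has cancelled out
print(xor(xor(c1, c2), m1))              # b'retreat at six' once m1 is guessed
```

An attacker who sees only c1 and c2 therefore holds the XOR of the two plain texts, and a correct guess for part of one message reveals the matching part of the other.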
Server side:
Sends messages $s_1, s_2, s_3, \ldots$
Encryption is $(s_1 \| s_2 \| s_3 \| \ldots) \oplus G(k)$
Where $\|$ is concatenation, i.e. all server messages are one stream.
The problem is that the client and server are communicating using
the same key, so the pad is used twice.
Never use the same key in both directions.
The proper key should be a pair, one key per direction: $(k_{C \to S}, k_{S \to C})$.
A better alternative
Treat all the frames as one long stream
$m_1 \| m_2 \| m_3 \| \ldots$ and XOR it with $G(k)$.
First segment of pad encrypts $m_1$, second
segment of pad encrypts $m_2$, etc.
Each segment of the pad is random and
unrelated.
Attacker:
Sees $c = m \oplus k$.
Designs a $p$ such that $c \oplus p = (m \oplus p) \oplus k$.
Receiver:
Gets $c \oplus p$.
Decrypts it to $m \oplus p$.
"Bob" = 42 6F 62
"Eve" = 45 76 65
"Bob" XOR "Eve" = 07 19 07
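The hex values above are the ASCII codes of "Bob" and "Eve" and their XOR, which suggests the following illustration of how an attacker can change a message without knowing the pad. The pad bytes below are made up for the example.

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

pad = bytes.fromhex("a1b2c3")     # hypothetical pad segment, unknown to the attacker
c = xor(b"Bob", pad)              # what the attacker sees on the wire

p = xor(b"Bob", b"Eve")           # attacker's modification
print(p.hex(" "))                 # 07 19 07

tampered = xor(c, p)              # attacker forwards c XOR p
print(xor(tampered, pad))         # b'Eve' is what the receiver decrypts
```

The attacker only needs to know (or guess) the original text at that position; the cipher offers no integrity protection.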