
CONTENTS

CHAPTER 1 INTRODUCTION
CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2 Modern Cryptosystem
2.3 Information Theoretic Approach
CHAPTER 3 SPURIOUS KEYS AND UNICITY DISTANCE
3.1 Introduction
3.2 Spurious Keys Analysis
3.3 Spurious Keys Analysis Using Natural Language Models
3.3.1 Introduction
3.3.2 Code-Point Mapping
CHAPTER 4 TEXT-SPACE BOUNDARY IN STREAM CIPHERS
4.1 Introduction
4.2 Mathematical Modeling
4.3 Code-Point Multimapping
4.3.1 Analysis Using Natural Language Model
4.4 Index Mapping
4.5 Model of Encryption for Numeric-Strings
CHAPTER 5 CONCLUSIONS

CHAPTER 1

INTRODUCTION

1. INTRODUCTION

Cryptography is used everywhere for secure communication. It implements algorithms to encrypt and decrypt data using a secret key. The difficulty of retrieving the original message from its corresponding cryptogram without knowing the secret key determines the level of security offered by the system. The strength of cryptography can be measured by estimating the computational effort needed to break the algorithm and through an information theoretic approach. The complexity of the algorithm and the size of the key are the two factors determining the strength of cryptography, but both trade off against the performance of the system. The information theoretic approach uses different statistical parameters to measure the strength of cryptography on the basis of the properties of the text-space.

The characteristics of the encrypted and decrypted texts are a matter of concern for modern cryptosystems. A plain text encrypted with a modern cryptographic algorithm produces a cipher text which has no language property or readability and can therefore easily be marked. Similarly, when a cipher text is decrypted with an incorrect key, the decrypted output simply does not look like text. This characteristic helps adversaries mark the secret encrypted text and proceed comfortably toward the unique solution by eliminating incorrect keys. However, there exists a certain set of keys within the key-space which gives text-like-text as decrypted output; these keys are spurious keys. Spurious keys cannot easily be eliminated by cryptanalysis: the attacker requires background information or other means to deal with them. The set of spurious keys associated with a security transaction thus raises the level of secrecy in a secure communication.

The number of spurious keys is larger for small message texts. The set of spurious keys gradually narrows as the size of the text increases. There is a text size at which the probability of spurious keys is zero; this size, obtained by a probabilistic approach, is defined as the unicity distance. The unicity distance gives the size of text for which there is likely to be a single intelligible plain-text decryption when a brute-force attack is attempted. If the unicity distance tends to infinity, the cipher is practically unbreakable by the brute-force technique. The unicity distance is inversely proportional to the redundancy of the text-space. The redundancy of a text-space covers those elements which, when eliminated from the space, do not affect the informational value of the texts belonging to it. For example, for alphanumeric texts all 1-byte values other than those associated with alphanumeric characters are considered redundant. Redundancy also relates to the characteristics of the language: some of the content, when filtered from a message, still allows the complete information of the message to be retained.

The number of spurious keys remains a strong factor for strengthening cryptography for short messages. According to Shannon's theory, if the cipher text is equiprobable over all message texts then the system is a perfect system. This means that if the number of spurious keys is fairly large, the system tends toward perfection. The use of natural language models to strengthen cryptography has been discussed in many papers. A natural language encrypted using the UTF-8 encoding standard may, however, suffer a depreciation of spurious keys because of the 3-byte weight of the character code-points. A fairer implementation of the language model is the code-point mapping technique. A code-point is a unique number assigned to an atomic character of a script; the number carries information about the type of the language and about the specific character. The part of the number dealing with the script, say the language tag, can be extracted out, since it remains common to all characters of the text; the part dealing with the specificity of the character is mapped to a 1-byte value, which is encrypted. The cipher text is obtained by inserting the language tag into the encrypted data, and the process is reverted for decryption. This approach allows a fair analysis of the natural language model and traces out the strength of cryptography. The unicity distance is comparatively larger for such natural language models than for English.


The plain-text space is generally limited to the characters associated with the language, but the encrypted and decrypted texts can float through any 1-byte value as far as a modern cryptosystem is concerned. This leads to a depreciation of spurious keys. If a boundary could be set on the encrypted-decrypted text space, the possibility of obtaining text-like-texts at random encryption and decryption would fairly increase. Cryptanalysis of the One-Time Pad using a number of plaintext-ciphertext pairs gives a vision of how to shrink the boundary of the encrypted-decrypted text space for stream ciphers. This yields a fairly longer unicity distance for the cryptosystem. The possibility of bounding text spaces can lift modern cryptosystems to a level beyond cipher-text-only attack.

CHAPTER 2

LITERATURE REVIEW

2. LITERATURE REVIEW

2.1 INTRODUCTION

Several issues are arising regarding the secrecy of the information that travels in huge volumes over the internet throughout the world. Security of information has become a public demand, driven by the issues of mass surveillance and attacks that have been revealed to the world. Cryptography is a major tool for security over the internet; it exists everywhere that secrecy is a concern.

There has been explosive growth in unclassified research on strengthening cryptography. Different approaches are available to measure the strength of both block ciphers and stream ciphers. Cryptanalysis techniques are used to estimate the strength of an algorithm. Many cryptosystems that were thought to be secure have been broken, and a variety of tools useful in cryptanalysis have been developed. The language used to describe security systems relies on discrete probability.

2.2 MODERN CRYPTOSYSTEM

Most of the ciphers that have been examined are not really cryptographically secure, although they may have been adequate prior to the widespread availability of computers. A cryptosystem is called secure if a good cryptanalyst, armed with knowledge of the details of the cipher (but not the key used), would require a prohibitively large amount of computation to decipher a plaintext. This idea (that the potential cryptanalyst knows everything but the key) is called Kerckhoffs's law, or sometimes Shannon's maxim.

Kerckhoffs's law at first seems an unreasonably strong rule; given a mass of encrypted data, how is the cryptanalyst to know by what means it was encrypted? Today, most encryption is done by software or hardware that the user did not produce himself. One can reverse engineer a piece of software (or hardware) and make the cryptographic algorithm apparent, so we cannot rely on the secrecy of the method alone as a good measure of security. For example, when you make a purchase over the internet, the encryption method used between your browser and the seller's web server is public knowledge. The security of the transaction depends on the security of the keys.

There are several widely used ciphers which are believed to be fairly secure. For a long time the most commonly used cipher was DES (the Data Encryption Standard), which was developed in the 1970s and adopted as a standard by the US government. The standard implementation of DES operates on 64-bit blocks (that is, it uses an alphabet of size 2^64: each "character" is 8 bytes long) and uses a 56-bit key. Unfortunately, the 56-bit key means that DES, while secure enough to thwart the casual cryptanalyst, is attackable with special hardware by governments, major corporations, and probably well-heeled criminal organizations: one merely needs to try a large number of the 2^56 possible keys, a formidable but not insurmountable computational effort. A common variation of DES, called Triple-DES, uses three rounds of regular DES with three different keys (so the key length is effectively 168 bits) and is considerably more secure. DES was originally expected to be used for 'only a few years' when it was first designed; due to its surprising resistance to attack, it remained generally unassailable for nearly 25 years. The powerful methods of linear and differential cryptanalysis were developed to attack block ciphers like DES.

In January 1997, the National Institute of Standards and Technology (NIST) issued a call for a new encryption standard to be developed, called AES (the Advanced Encryption Standard). The requirements were that the cipher operate on 128-bit blocks and support key sizes of 128, 192, and 256 bits. Five ciphers advanced to the second round: MARS, RC6, Rijndael, Serpent, and Twofish. All five were stated to have "adequate security"; Rijndael was adopted as the standard in October 2000.

Other commonly used ciphers are IDEA (the International Data Encryption Algorithm, developed at ETH Zurich in Switzerland), which uses 128-bit keys, and Blowfish (developed by Bruce Schneier), which uses variable-length keys of up to 448 bits. Both of these are currently believed to be secure. Another common cipher, RC4 (developed by RSA Data Security), can use variable-length keys and, with sufficiently long keys, is believed to be secure. Some versions of the Netscape web browser used RC4 with 40-bit keys for secure communications. A single encrypted session was broken in early 1995 in about 8 days using 112 networked computers; later that year a second session was broken in under 32 hours. Given the speed increases in computing since then, it is reasonable to believe that a 40-bit key can be cracked in a few hours. Notice, however, that both of these attacks were essentially brute force, trying a large fraction of the 2^40 possible keys. Increasing the key size resolves that problem; nearly all browsers these days use at least 128-bit keys.

2.3 INFORMATION THEORETIC APPROACH

A cryptosystem has perfect secrecy if for any message x and any encipherment y, p(x|y) = p(x). This implies that for any message-cipher pair there must be at least one key that connects them. According to Shannon's theorem, suppose a cryptosystem has |K| = |C| = |P|. The cryptosystem has perfect secrecy if and only if:
- each key is used with equal probability 1/|K|, and
- for every plaintext x and ciphertext y there is a unique key k such that e_k(x) = y.
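As a quick sanity check of the theorem, the following minimal sketch (not from the thesis) enumerates a 2-bit one-time pad with uniform keys and, assuming uniform messages, verifies p(x|y) = p(x) for every pair:

    from collections import Counter
    from fractions import Fraction

    X = K = range(4)                          # |P| = |K| = |C| = 4 (2-bit blocks)
    joint = Counter()
    for x in X:                               # uniform messages, p(x) = 1/4
        for k in K:                           # uniform keys, p(k) = 1/4
            joint[(x, x ^ k)] += Fraction(1, 16)

    for (x, y), p_xy in joint.items():
        p_y = sum(p for (_, yy), p in joint.items() if yy == y)
        assert p_xy / p_y == Fraction(1, 4)   # p(x|y) == p(x): perfect secrecy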
Certain issues have to be addressed theoretically concerning the strength of a cryptosystem, as listed below:

- the immunity of a system to cryptanalysis when the cryptanalyst has unlimited time and manpower available for the analysis of cryptograms;
- whether a cryptogram has a unique solution (even though it may require an impractical amount of work to find it);
- how much text in a given system must be intercepted before the solution becomes unique;
- whether there are systems for which no information whatever could be extracted by the enemy, no matter how much text is intercepted.
In the analysis of these problems, the concepts of entropy, redundancy, unicity distance and the like developed in A Mathematical Theory of Communication are used.

Shannon's entropy represents the amount of information the experimenter lacks prior to learning the outcome of a probabilistic process. According to Shannon's formula, a message's entropy is maximized when the occurrence of each of its individual parts is equally probable. The entropy of a natural language is a statistical parameter that measures how much information is produced on average for every letter of a text in the language. The amount of information in a message is the minimum number of bits required to encode all its possible meanings.

In cryptography, the unicity distance is the length of ciphertext needed to break the cipher by reducing the number of possible spurious keys to zero in a brute-force attack. That is, after trying every possible key, there should be just one decipherment that makes sense; equivalently, it is the expected amount of ciphertext needed to determine the key completely, assuming the underlying message has redundancy.
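A standard back-of-the-envelope estimate (not computed in the thesis, included here for orientation) is Shannon's formula U = H(K)/D, where H(K) is the key entropy and D the per-character redundancy of the language:

    import math

    H_K = 56.0                      # key entropy in bits (e.g. a DES key)
    # redundancy of English, assuming ~1.5 bits/char of actual content
    D = math.log2(26) - 1.5         # about 3.2 bits of redundancy per character
    U = H_K / D                     # roughly 17 characters of ciphertext
    print(round(U, 1))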

CHAPTER 3

SPURIOUS KEYS
&
UNICITY DISTANCE

3. SPURIOUS KEYS AND UNICITY DISTANCE

3.1 INTRODUCTION

Shannon proposed the information theoretic approach in his paper Communication Theory of Secrecy Systems, which explains different parameters related to the strength of cryptography. Resistance to brute-force attack, or cipher-text-only attack, increases if a large number of text-like-texts correspond to a cipher text. The keys which give text-like-text are spurious keys. A modern cryptosystem provides a fair number of spurious keys for short texts of fewer than about 10 characters, which indicates that such systems are more resistant to cipher-text-only attack on short messages. The probability of meeting a spurious key during random decryptions depends on the distribution of text-like-texts in the text-space. Text-like-texts are limited by the number of characters associated with the language script. An English text has alphanumeric characters with some special characters, but an encrypted/decrypted text may map a character to any 1-byte value, resulting in a transformation which does not look like text. The set of spurious keys filters out invalid decryptions and leaves only text-like-texts from all possible decryptions. The difficulty of eliminating keys and sorting out the unique solution lies with the spurious keys, which cannot be resolved by simple attack mechanisms.

The number of spurious keys gradually decreases and reaches a negligible point as the text size grows. The threshold at which the probability of having spurious keys becomes negligible is called the unicity distance. Consider a random 1-byte key (k in K, 256 possible keys) that encrypts (XORs) 1 letter from a 26-letter alphabet. The probability that a decrypted byte is valid (an alphabet letter) is 26/256. The probability that one particular key is spurious would be 1/256. This bound is attained when a minimum of 3 characters is considered, i.e. (26/256)^3 < 1/256. This implies that if the size of the text is 3, then on average none of the 256 keys gives an all-alphabet decrypted text; this is the unicity distance. The 1-byte values other than the 26 alphabet letters are the redundancy of the text-space considered above. This redundancy is inversely related to the unicity distance: had there been no redundancy in the text-space, the unicity distance would be infinite.
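A minimal sketch of this toy calculation (assuming, as above, uniform 1-byte keys and a 26-letter alphabet) finds the smallest text size at which the expected number of spurious keys drops below one:

    n = 1
    while 256 * (26 / 256) ** n >= 1:   # expected spurious keys among 256 keys
        n += 1
    print(n)                            # prints 3, the toy unicity distance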

3.2 SPURIOUS KEYS ANALYSIS

The analysis of spurious keys proposed in this thesis relies on discrete probability. Observations from random decryptions with random keys, and from brute-forcing with numeric keys, provide the Proof of Concept for these statistics. The discrete probability approach assumes a uniform distribution and requires definitions of the text-spaces and of the desirable events. The plain texts are limited by the number of characters allowed. Let us consider the English alphabet (n_A = 26) as the allowed character set for plain text, while the encrypted/decrypted text can take any 1-byte value (n_U = |{0,1}^8| = 256). The event is that a decrypted text is a text-like-text. The probability of the event for a 1-character text is simply 26/256, as there are 26 favourable outcomes out of 256 possibilities. The probability of the event for different text sizes is presented in Table 3.2.a.

S.No.   Text-Size   P.E
1       1           0.1016
2       2           0.0103
3       4           1.06*10^-4
4       8           1.13*10^-8
5       16          1.28*10^-16

Table 3.2.a: Probability of Event for different English text-sizes

The probability of the event occurring during random decryption is fairly large for small text sizes and decreases gradually with size, as Table 3.2.a clearly illustrates. For text sizes greater than 25 the probability of the event is less than 2^-80, a negligible value that will not occur in a lifetime; this point gives the unicity distance. The event discussed here is directly associated with the probability of meeting spurious keys during random decryptions. The Proof of Concept is based on millions of decryptions performed with different modern cryptographic algorithms. The decryptions followed two approaches: random decryption over intervals of time using random keys (alphabet keys, numeric keys, alphanumeric keys), and brute-forcing using all possible numeric keys. The observations hold up strongly as a proof of concept for the proposed statistics, with minor variation within a tolerable range.
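The following sketch outlines such an experiment (a minimal stand-in for the thesis's harness, with an inline pure-Python RC4 so it is self-contained); it decrypts a fixed ciphertext under random keys and counts how often the output stays inside the alphanumeric text-space:

    import os
    import string

    def rc4(key: bytes, data: bytes) -> bytes:
        # textbook RC4/ARC4; decryption is the same operation as encryption
        S = list(range(256))
        j = 0
        for i in range(256):                            # key scheduling
            j = (j + S[i] + key[i % len(key)]) % 256
            S[i], S[j] = S[j], S[i]
        out, i, j = [], 0, 0
        for b in data:                                  # keystream generation
            i = (i + 1) % 256
            j = (j + S[i]) % 256
            S[i], S[j] = S[j], S[i]
            out.append(b ^ S[(S[i] + S[j]) % 256])
        return bytes(out)

    ALNUM = set((string.ascii_letters + string.digits).encode())

    def estimate_event_probability(text_size=8, key_size=8, trials=10**6):
        ciphertext = os.urandom(text_size)
        hits = 0
        for _ in range(trials):
            key = os.urandom(key_size)                  # random key per trial
            if all(b in ALNUM for b in rc4(key, ciphertext)):
                hits += 1                               # decryption is text-like
        return hits / trials                            # approx (62/256)**text_size

For an 8-byte text the expected value is (62/256)^8, about 1.2*10^-5, so on the order of millions of trials are needed to observe it, matching the scale of the experiments reported here.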

The Proof of Concept (PoC), based on observing spurious keys with the random decryption approach on the ARC4 algorithm for different key sizes, is illustrated in Table 3.2.b. The observation is based on plain texts with a text-space of 62 alphanumeric characters. The decryption is carried out on a number of random alphanumeric texts, with random alphanumeric keys, random alphabet keys and random numeric keys of sizes 8 bytes and 16 bytes. A similar Proof of Concept is presented in Table 3.2.c, based on observations of 8-byte block ciphers with a text size of 8 bytes; this observation was performed using random decryption and brute-forcing over all possible numeric keys.

S.No.   Text-Size   P.E          8-byte key   16-byte key
1       1           0.2421       0.2423       0.2426
2       4           3.44*10^-3   3.46*10^-3   3.47*10^-3
3       8           1.18*10^-5   1.09*10^-5   1.06*10^-5

Table 3.2.b: Proof of Concept based on analysis of the ARC4 algorithm

S.No.   Algorithm                       P.E (x10^-5)
1       Expected event distribution     1.18
2       ARC4                            1.08
3       DES                             1.12
4       DES3                            0.99
5       Blowfish                        1.20

Table 3.2.c: PoC based on analysis of different algorithms for 8-byte text size

[Figure: bar graph, "PoC (Text Size = 8 bytes)", plotting P.E (x10^-5) for the expected event distribution and for ARC4, DES, DES3 and Blowfish; the variance lies in the 10^-5 range.]

Figure 3.2.a: Variance of the analysis across different algorithms for 8-byte text size

3.3 SPURIOUS KEYS ANALYSIS USING NATURAL LANGUAGE MODELS

3.3.1 INTRODUCTION

Natural languages need an encoding standard to be fed into a modern cryptosystem for encryption. The UTF-8 encoding standard is widely popular, as it accommodates more than 1 million characters and leaves the existing ASCII characters unchanged. The character code-points of Indian languages weigh 3 bytes in UTF-8. This heavier character weight gives the encrypted-decrypted texts a much larger universe, and thus a lower probability of the event occurring. The probability of the event for different text sizes of the Devanagari script is shown in Table 3.3.a. It is clear that the unicity distance for Devanagari text under UTF-8 is approximately 5, which is far less than that of English text.

S.No.   Text-Size   P.E
1       1           7.56*10^-6
2       2           5.73*10^-11
3       4           3.28*10^-21
4       8           1.08*10^-41

Table 3.3.a: Probability of Event for the Devanagari script implementing UTF-8 encoding
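These magnitudes can be reproduced directly (a sketch assuming roughly 127 usable Devanagari code-points, each 3 bytes in UTF-8, against the 2^24 possible 3-byte values):

    p_char = 127 / 2**24        # random 3-byte block hits a Devanagari character
    for n in (1, 2, 4, 8):
        print(n, p_char**n)     # matches the magnitudes in Table 3.3.a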

The code-points associated with the characters can instead be mapped to a 1-byte value for encryption and decryption, which is discussed in sub-topic 3.3.2.

3.3.2 CODE-POINT MAPPING

Code-point mapping is a method proposed in this paper for fair implementation of natural
language models. Basically, code-point mapping techniques works for the languages which has a
maximum of 256 character codepoints associated to the script. The codepoint of a character is a
number in hexadecimal. This codepoint basically carries two information; information of the
language and specificity of the character. For example, the codepoint value of a character 'ka' ' '
is 0x0915. Here '0x09' part of the number remains through out the script and '15' specifies the
character. So, the part ( '15') of the number can be mapped to a 1-byte value ('\x15'). The mapped
value is encrypted and the extracted part ('0x09') can be appended back to the encrypted data to
obtain the cipher. The procedure of encryption and decryption is clearly illustrated in figure 3.3.2a.

Figure 3.3.2a Implementation of Code-point Mapping for Encryption and Decryption
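A minimal sketch of the mapping step (assuming the Devanagari block U+0900-U+097F; the helper names are illustrative, not from the thesis):

    LANG_TAG = 0x0900                      # common high part of every code-point

    def map_to_bytes(text: str) -> bytes:
        # keep only the character-specific low byte of each code-point
        return bytes(ord(ch) - LANG_TAG for ch in text)

    def unmap_from_bytes(data: bytes) -> str:
        # re-attach the language tag to recover the characters
        return ''.join(chr(LANG_TAG + b) for b in data)

    text = '\u0915\u0916'                  # 'ka' (U+0915) and 'kha' (U+0916)
    assert map_to_bytes(text) == b'\x15\x16'
    assert unmap_from_bytes(b'\x15\x16') == text

The byte string is what is fed to the cipher; the tag 0x09 is re-inserted when forming the cipher text, as figure 3.3.2a shows.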

The probability of the event for Devanagari text after the implementation of code-point mapping is fairly larger than for English. Table 3.3.2a provides the proof of concept for the probability of the event for the Devanagari script. The unicity distance at which the probability of the event becomes negligible (less than 2^-80) is 81, which is fairly large compared to English. Table 3.3.2b shows the comparison of the number of spurious keys for Devanagari, Bengali and English (alphanumeric text) with respect to different text sizes. Figure 3.3.2.b is a 3-D bar graph representation of the comparison, which clearly provides two observations: (a) the probability of having spurious keys at random decryption is fairly high for the Devanagari script, and (b) the probability of the event decreases gradually with text size.

S.No.   Text-Size   P.E           ARC4         DES
1       4           0.0606        0.0603       __
2       8           3.7*10^-3     3.6*10^-3    3.69*10^-3
3       16          1.34*10^-5    1.3*10^-5    1.62*10^-5
4       32          1.81*10^-10   __           __

Table 3.3.2a: PoC of P.E for Devanagari texts based on analysis using the ARC4 and DES algorithms

S.No.   Text-Size   Devanagari    Bengali      English
1       4           0.0606        0.0167       3.4*10^-3
2       8           3.7*10^-3     2.8*10^-4    1.2*10^-5
3       16          1.3*10^-5     7.7*10^-8    1.4*10^-10
4       32          1.8*10^-10    6.0*10^-15   3.3*10^-20

Table 3.3.2.b: Comparative analysis of spurious keys between Devanagari, Bengali and English (alphanumeric)

[Figure: 3-D bar graph, "Analysis of Spurious Keys: Devanagari vs English", plotting the probability of the event against text size (4, 8, 16, 32) for the two scripts.]

Figure 3.3.2.b: Comparative analysis of spurious keys between Devanagari and English

CHAPTER 4

TEXT-SPACE BOUNDARY
IN
STREAM CIPHERS

4. TEXT-SPACE BOUNDARY IN STREAM CIPHERS

4.1 INTRODUCTION
Stream ciphers is the practical application of one-time-pad. It consist of a Pseudo Random
Generator (PRG) which takes the supplied key as a seed to the pseudo random generator. The PRG
generates a long bit sequence which is equal to the text to be encrypted. The encryption is simply
the one-time-pad of the text with generated bit stream. The one-time-pad holds a property with
respect to the text space which can be illustrated with simplicity by considering a text-space with
only two element '\x00' and '\x01' ie. U= {\x00,\x01}n where n is the size of text. The encrypted and
decrypted texts can also be bounded to the same space-limit by applying MOD-2 operation to the
code-points after XOR. This scenario is clearly shown with comparative illustration on existing
model in figure 4.1.a and figure 4.1.b. We can clearly see that the event is likely to occur at every
decryption irrespective of text size for the proposed model. The property holds true for text space
of with set of elements of size 2x where x= 1 to 8 where MOD-2x. is applied to bound the encrypteddecrypted texts.

[Figure: plain text \x00\x01\x00... -> ARC4ENC -> arbitrary bytes -> ARC4DEC -> \x00\x01\x00..., with key K shared by both blocks.]

Figure 4.1.a: Simple block diagram of a stream cipher for the {\x00,\x01}^n text-space

[Figure: plain text \x00\x01\x00... -> ARC4ENC -> %2 -> cipher \x01\x01\x00... -> ARC4DEC -> %2 -> \x00\x01\x00..., with key K shared by both blocks.]

Figure 4.1.b: Stream cipher with a MOD-2 operation block for the {\x00,\x01}^n text-space

4.2 MATHEMATICAL MODELING:

Let us define a simple model simple encryption and decryption model for stream cipher by
equation (1) and (2) where m,c,d,k are respectively message, cipher, decrypted text, key and E(),
D() be the encryption and decryption algorithm function which takes two arguments as given.
c = E(m,k) ---------------(1)
d= D(c,k) ------------------(2)
For stream cipher E(m,k) and D(c,k) may be defined as below
E(m,k) = mb OTP Kb --------(3)
where Kb =G(k) and G() is a Pseudo Random Generator which takes key k and generates Kb

D(c,k) = cb OTP Kb -------------- (4)

This implies,
D(E(m,k),k) = m -------------------(5)

A model is proposed in this thesis which is mathematically defined with limitation of boundary to
text space. If a plain text of size n-byte be defined by
m, where m U={\x00,\x01 . }n and size of the set {\x00,\x01 . } be given by 2p , p
{1,2,...8} , then

c=E'(m,k) = [ (x MOD 2p) : x = byte-value for each byte in E(m,k) ] --------(6)


D'(c,k) = [ (x MOD 2p) : x = byte-value for each byte in D(c,k) ] -----------(7)
case-1:
For p =8, E'(m,k) = E(m,k) ----(8)
D'(m,k)=D(m,k) ------(9) and hence, the model replicates the original stream cipher
system with p=8
case-2:
For p=0, E'(m,k) = D'(m,k) -------(10) and hence, this does not follow general rule of crpytography

The probability of collision increases with smaller value of p along with the condition that
with larger text size (n) the probability of collision is depreciated. Thus, with possible tradeoffs this
model can be implemented to attain large spurious keys and unicity distance.
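A minimal sketch of equations (6) and (7) follows (reusing the rc4() helper from the sketch in section 3.2; since MOD 2^p masks each byte to its low p bits, XOR commutes with the bound and decryption still recovers the message):

    def bounded_encrypt(m: bytes, key: bytes, p: int) -> bytes:
        mask = (1 << p) - 1                            # MOD 2**p keeps the low p bits
        return bytes(x & mask for x in rc4(key, m))    # equation (6)

    def bounded_decrypt(c: bytes, key: bytes, p: int) -> bytes:
        mask = (1 << p) - 1
        return bytes(x & mask for x in rc4(key, c))    # equation (7)

    m = bytes([0, 1, 0, 1, 1, 0, 0, 1])                # message in {\x00,\x01}^8, p = 1
    c = bounded_encrypt(m, b'secret-key', p=1)
    assert all(b < 2 for b in c)                       # cipher stays inside the space
    assert bounded_decrypt(c, b'secret-key', p=1) == m # equation (5) still holds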

4.3 CODE-POINT MULTIMAPPING:

The code-point mapping approach was described in chapter 3 for the implementation of natural language models. Code-point multimapping combines that approach with a text-space boundary in stream ciphers. Code-point multimapping is effective for language models with fewer than 128 characters, and so is applicable to English and the Indic languages. It is found that the language characters map either to the upper half or to the lower half of the 256 1-byte values. The idea of code-point multimapping is therefore to map the code-points twice, symmetrically into both halves, and then to apply encryption with the text-space boundary limited to half by a MOD-128 operation on the stream cipher, as explained in topic 4.2. The universe shrinks to half, raising the probability of obtaining text-like-texts at random decryptions. The effect on the text-space of the encryption and decryption models of code-point multimapping is illustrated in figures 4.3.a and 4.3.b respectively.

Figure 4.3.a Encryption Model with Code-point Multimapping

Figure 4.3.b Decryption model with code-point Multimapping
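A minimal sketch of the folding idea (assuming the Devanagari mapping from sub-topic 3.3.2; each mapped byte b and its image b+128 denote the same character, so any byte bounded by MOD 128 folds back to a valid code-point):

    import random

    LANG_TAG = 0x0900

    def multimap(text: str) -> bytes:
        # map each character to one of its two symmetric byte images at random
        return bytes((ord(ch) - LANG_TAG) + 128 * random.randint(0, 1) for ch in text)

    def demultimap(data: bytes) -> str:
        # both images fold to the same character under MOD 128
        return ''.join(chr(LANG_TAG + (b % 128)) for b in data)

    text = '\u0915\u0916'
    assert demultimap(multimap(text)) == text
    # encryption then uses bounded_encrypt(..., p=7), i.e. the MOD-128 bound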

Table 4.3.a illustrates the comparison of the probability of the event for the general ARC4 stream cipher and for ARC4 with the boundary limited by the MOD-128 operation. It is noticed that the number of spurious keys increases as the boundary of the encrypted-decrypted text-space shrinks to half.

S.No.   Text-Size   S.C. without MOD 128   S.C. with MOD 128
1       8           1.2*10^-5              3.0*10^-3
2       16          1.4*10^-10             9.2*10^-6
3       32          1.96*10^-20            8.4*10^-11
4       64          __                     7.1*10^-21

Table 4.3.a: Comparative analysis of the Probability of Event in a stream cipher with and without the MOD-128 operation

4.3.1 ANALYSIS USING NATURAL LANGUAGE MODEL:

The implementation of code-point multimapping in stream ciphers results in a fairly larger unicity distance. It is observed that the probability of the event remains fair even for large text sizes such as 256 and 512 characters. Table 4.3.1.a presents a comparative analysis of the probability of the event for code-point mapping and code-point multimapping in a stream cipher. The table is illustrated with a 3-D bar graph in figure 4.3.1.b, where the leverage in the probability of the event in the case of code-point multimapping is easily seen.

S.No.   Text-Size   Codepoint Mapping (ARC4)   Codepoint Multimapping (ARC4)
1       8           3.7*10^-3                  0.94
2       16          1.3*10^-5                  0.88
3       32          1.8*10^-10                 0.78
4       64          3.3*10^-20                 0.60
5       128         __                         0.37
6       256         __                         0.13

Table 4.3.1.a: Comparative analysis between the encryption models implementing code-point mapping and code-point multimapping for Devanagari text

[Figure: 3-D bar graph, "Stream cipher with MOD-128 for Devanagari", plotting P.E against text-size (8, 16, 32, 64) for the stream cipher with and without the MOD-128 boundary.]

Figure 4.3.1.b: Comparative analysis between the encryption models implementing code-point mapping and code-point multimapping for Devanagari text

4.4 INDEX MAPPING

Index mapping is an approach to implementing a stream cipher with a text-space boundary. The boundary implementation in the stream cipher allows only the first 2^p byte values, p in {1,2,...,8}, to appear in the text-space. For example, for p = 2, only the first 2^2 byte values, i.e. '\x00', '\x01', '\x02', '\x03', are allowed elements of the text-spaces. These byte values can be mapped to the indices of an equally sized set of desirable characters, over which encryption is implemented. An example for p = 3 is presented here. Let the set of desirable characters be A = ['1','2','3','4','5','6','7','8']; the index mapping of the elements to the first 2^3 = 8 byte values bounded to the text-space is shown below:


'1' '\x00' ,

'2' '\x01' ,

'3' '\x02 ' ,

'4' '\x03' ,

'5' '\x04'

'6' '\x05' ,

'7' '\x06' ,

'8 '\x07'

Now the mapped byte values are encrypted and decrypted using the stream cipher with the text-space boundary, and the cipher texts and decrypted texts are simply the characters corresponding to the indices obtained by reverse-mapping the bytes received after encryption and decryption with the MOD operation. The implementation is illustrated in figure 4.4.a. In figure 4.4.b, the implementation of index mapping with a MOD-32 operation and a 32-character set is presented.

Figure 4.4.a Index Mapping with text space of 8 characters

Figure 4.4.b: Index Mapping with text space of 32 characters
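A minimal sketch of index mapping for p = 3 (reusing bounded_encrypt/bounded_decrypt from topic 4.2; the function names are illustrative):

    A = ['1', '2', '3', '4', '5', '6', '7', '8']       # desirable character set

    def index_encrypt(text: str, key: bytes) -> str:
        idx = bytes(A.index(ch) for ch in text)        # character -> index
        return ''.join(A[b] for b in bounded_encrypt(idx, key, p=3))

    def index_decrypt(cipher: str, key: bytes) -> str:
        idx = bytes(A.index(ch) for ch in cipher)
        return ''.join(A[b] for b in bounded_decrypt(idx, key, p=3))

    c = index_encrypt('3141', b'secret-key')   # cipher text uses the same 8 characters
    assert index_decrypt(c, b'secret-key') == '3141'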

Index mapping bounds the encrypted and decrypted text-spaces so that every cipher text or decrypted text has elements belonging to the character space. This increases the possibility of having a number of meaningful messages for one cipher text, as illustrated in figure 4.4.c.

4.5 MODEL OF ENCRYPTION FOR NUMERIC-STRINGS

A numeric string or numeric text is composed of the 10 characters belonging to the set string.digits, i.e. ['0','1','2','3','4','5','6','7','8','9']. An encryption scheme for an 8-character numeric set with a text-space boundary was described in the earlier topic. There are in total 10C8 = 45 combinations of 8 characters from the numeric character set, so the text-space string.digits can be realized as 45 unique text-spaces of 8 numeric characters each. Implementing 45 encryptions bounded to those 45 text-spaces results in an encryption of numeric strings whose encrypted-decrypted texts are bounded to string.digits. Figure 4.5.a shows the implementation of numeric-string encryption with this boundary. The plain text applied to the encryption is a list of phone numbers reported in March 2014.

Figure 4.5.a: Numeric String Encryption with Text Space Boundary
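The text-space count is easy to check (a small sketch; string.digits is the standard Python constant used above):

    from itertools import combinations
    import string

    subsets = list(combinations(string.digits, 8))   # all 8-character digit sets
    assert len(subsets) == 45                        # 10C8 = 45 bounded text-spaces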

The encryption model described is now very difficult to mark or brute-force, as each decryption leads to a text-like-text as output. Partial encryption of texts can also help in leveraging confusion, as illustrated in figures 4.5.b and 4.5.c.

Figure 4.5.b: Partial Encryption of text using Numeric-String Encryption Model

Figure 4.5.c: Data Frame Encryption using the Numeric-String Encryption Model: (i) Plain Text, (ii) Cipher Text, (iii) Decrypted Text

CHAPTER 5

CONCLUSIONS

5. CONCLUSIONS

The analysis of spurious keys provides a vision of strengthening cryptography by taking cryptosystems beyond the brute-force bound. The following points were observed, with statistics driven by random and brute-forced decryptions as a proof of concept in this research work. The analysis was done on modern cryptographic algorithms, including both block ciphers (DES, Blowfish) and a stream cipher (ARC4).

- The probability of having spurious keys during random decryptions depends on the text-size and the text-space (the character set associated with the language used).
- Taking as spurious keys those keys whose decrypted texts contain only elements of the plain-text character set, the probability of a spurious key occurring at random decryption of an 8-byte text over the 26 English alphabets is about 10^-8, whereas for an 8-syllable Devanagari text it is about 3*10^-3 when code-point mapping is implemented for encryption.
- The probability of the event (a text-like-text at random decryption) decreases gradually with text size and becomes negligible at a certain point, which gives the unicity distance. Taking 2^-80 as a negligible probability, the unicity distance for 26-alphabet text is approximately 27 characters, while that for Devanagari text rounds off to 81 characters.
- The implementation of code-point multimapping is very effective for the Devanagari script, providing a fair probability of spurious keys for longer texts of 256, 512 and even more characters.
- The cryptanalysis of the One-Time Pad using multiple plaintext-ciphertext pairs indicated that it is possible to bound the encrypted-decrypted texts to a certain character space in stream ciphers. An encryption model based on this property of the OTP is designed for numeric strings, for which the unicity distance tends to infinity, with the limitation that only the 10 numeric characters are valid for encryption.
