hashing algorithm

Brian Murray

ISSM 533

Table of Contents

Executive Summary
Summary of MD5
Definitions
Input data
Step 1: Data padding
Step 2: Appending length
Step 3: Initialize the Message Digest buffer
Step 4: Process input
Step 5: Output
T Value Table
Technical Vulnerabilities
Signature attack with a hidden message
X.509 Certificate Signature attack
Conclusions
References

Executive Summary

In this paper, the MD5 hashing algorithm is discussed. MD5 is a common hashing algorithm used in many cryptographic schemes. It is also used to verify downloaded files. Part one of this paper outlines how MD5 works. There have been quite a few recent discussions on whether or not MD5 is still a solid algorithm. Some less-informed commentators have stated that MD5 has been completely broken. Part two of this paper investigates the current state of MD5 and its attack vectors. In the final section, this paper outlines conclusions based on part two.

Summary of MD5

Definitions

A word is a 32 bit group.

A byte is an 8 bit group.

'b' represents the length of the input data in bits.

'|' is a bitwise OR. '&' is a bitwise AND. '^' is a bitwise XOR. '~' is a bitwise NOT.

X <<< s is a circular bit shift of X, by s bit positions.

Input data

The data input into MD5 can be any number of bits. It does not need to fall on an 8-bit boundary. It may also be 0 bits long.

Step 1: Data padding

The input data is first padded so that it is 64 bits short of a 512-bit boundary. This means that the number of bits must be congruent to 448, modulo 512. The data is padded first with a single binary one, then with as many zeros as are needed to reach that length. Padding is always performed, even if the input is already congruent to 448 modulo 512, so between 1 and 512 bits are added.

Step 2: Appending length

A 64-bit representation of 'b', the length of the message, is appended to the result of step 1. This is the number of input bits before padding. In the case that the length needs more than 64 bits to represent, only the low-order 64 bits are used. The data will now be an exact multiple of 512 bits long. Equivalently, it can be broken down into a whole number of blocks of 16 32-bit words. The words are denoted M[0 .. N-1], where N is a multiple of 16.
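Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the function name md5_pad is made up here, and it assumes the input is a whole number of bytes, although MD5 itself accepts any number of bits.

```python
def md5_pad(message: bytes) -> bytes:
    """Apply MD5 padding (step 1) and append the length (step 2).

    Assumption: the message is byte-aligned, so the single 1 bit and the
    seven 0 bits that follow it can be written as the byte 0x80.
    """
    bit_length = (len(message) * 8) % (1 << 64)      # keep only the low-order 64 bits of 'b'
    padded = message + b"\x80"                       # a binary one followed by seven zeros
    padded += b"\x00" * ((56 - len(padded)) % 64)    # zero-fill to 448 bits modulo 512
    padded += bit_length.to_bytes(8, "little")       # 64-bit length, low-order byte first
    return padded


if __name__ == "__main__":
    for sample in (b"", b"abc", b"a" * 56):
        print(len(sample), "->", len(md5_pad(sample)), "bytes")
```

Note that a 56-byte input grows to 128 bytes: it is already 448 bits long, but padding is still applied.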

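Step 3: Initialize the Message Digest buffer

Four one-word buffers, A, B, C and D, are used. They are shown here in hex form with the low-order byte first, so buffer A, for example, holds the 32-bit value 0x67452301. These values are loaded in as the default initialization vector. Note that the hex digits of C and D are just those of B and A reversed, respectively.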

A: 01 23 45 67

B: 89 ab cd ef

C: fe dc ba 98

D: 76 54 32 10
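As a small sketch of what "low order bytes first" means in practice, the 16 bytes listed above can be decoded as four little-endian 32-bit words with Python's struct module. The names INIT_BYTES, A0, B0, C0 and D0 are chosen here for illustration only.

```python
import struct

# The four buffers exactly as listed above, low-order byte first.
INIT_BYTES = bytes.fromhex("01234567" "89abcdef" "fedcba98" "76543210")

# "<4I" reads four unsigned 32-bit integers in little-endian byte order.
A0, B0, C0, D0 = struct.unpack("<4I", INIT_BYTES)

if __name__ == "__main__":
    print(f"A={A0:08x} B={B0:08x} C={C0:08x} D={D0:08x}")
    # prints: A=67452301 B=efcdab89 C=98badcfe D=10325476
```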

Step 4: Process input

To start with this phase, 4 functions must be defined. Each of these functions takes in three 32-bit words and outputs one 32-bit word.

F(A,B,C) = (A & B) | (~A & C)

G(A,B,C) = (A & C) | (B & ~C)

H(A,B,C) = A ^ B ^ C

I(A,B,C) = B ^ (A | ~C)

F acts as a bitwise conditional: if A, then B, else C. G is very similar: if C, then A, else B.

H is effectively just a parity function of the 3 inputs.

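All 4 of the functions operate bit by bit: each output bit depends only on the bits in the same position of the three inputs, and no other bit affects it. RFC 1321 notes that if the corresponding input bits are independent and unbiased, each output bit is independent and unbiased as well.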
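The four functions translate almost directly into Python. The sketch below assumes the inputs are already 32-bit values; because Python integers are unbounded, the result of a bitwise NOT is masked back to 32 bits (MASK is a helper name introduced here).

```python
MASK = 0xFFFFFFFF  # keeps results within one 32-bit word


def F(a, b, c):
    return ((a & b) | (~a & c)) & MASK   # if a then b else c


def G(a, b, c):
    return ((a & c) | (b & ~c)) & MASK   # if c then a else b


def H(a, b, c):
    return (a ^ b ^ c) & MASK            # parity of the three inputs


def I(a, b, c):
    return (b ^ (a | ~c)) & MASK


if __name__ == "__main__":
    x, y = 0x12345678, 0x9ABCDEF0
    print(hex(F(MASK, x, y)), hex(F(0, x, y)))  # selects x, then y
```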

Next, a table of 64 elements is constructed. Element i is the integer part of 4294967296 (2^32) times abs(sin(i)), where i is the number of the element, taken in radians. The table is numbered T[1 .. 64]. So, T[1] = 0xD76AA478. These values could be calculated on the fly, but since they never change, it is most efficient to simply code them statically, as was done in the reference implementation in the appendix of RFC 1321.
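For illustration, the table can be regenerated from the formula in a few lines of Python. This is only a sketch: it relies on double-precision math.sin, and the dummy entry at index 0 is an assumption made here so that the list can be indexed from 1 like T[1 .. 64] in the text.

```python
import math

# T[0] is a placeholder so that T[i] matches the 1-based numbering in the text.
T = [0] + [int(4294967296 * abs(math.sin(i))) for i in range(1, 65)]

if __name__ == "__main__":
    print(f"T[1]  = 0x{T[1]:08X}")   # the text gives T[1] = 0xD76AA478
    print(f"T[64] = 0x{T[64]:08X}")
```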

Each 16-word block is then processed in turn. First, block i is copied into X:

For i = 0 .. N/16-1

For j = 0 .. 15

X[j] = M[i*16+j]

Next, we need to create a copy of the data from A,B,C,D, since we need all of this data later.

AA = A, BB = B, CC = C, DD = D

Now we perform the calculations in 4 rounds of 16 operations each, changing the function used each time. So round 1 uses function F, round 2 uses function G, and so on. Unfortunately, the word order, shift amount and table index vary so much from one operation to the next that it is simply easier to code all 64 operations statically, instead of via looping mechanisms.

The following page is taken from RFC 1321:

/* Round 1. */

/* Let [abcd k s i] denote the operation

a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */

/* Do the following 16 operations. */

[ABCD 0 7 1] [DABC 1 12 2] [CDAB 2 17 3] [BCDA 3 22 4]

[ABCD 4 7 5] [DABC 5 12 6] [CDAB 6 17 7] [BCDA 7 22 8]

[ABCD 8 7 9] [DABC 9 12 10] [CDAB 10 17 11] [BCDA 11 22 12]

[ABCD 12 7 13] [DABC 13 12 14] [CDAB 14 17 15] [BCDA 15 22 16]

/* Round 2. */

/* Let [abcd k s i] denote the operation

a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */

/* Do the following 16 operations. */

[ABCD 1 5 17] [DABC 6 9 18] [CDAB 11 14 19] [BCDA 0 20 20]

[ABCD 5 5 21] [DABC 10 9 22] [CDAB 15 14 23] [BCDA 4 20 24]

[ABCD 9 5 25] [DABC 14 9 26] [CDAB 3 14 27] [BCDA 8 20 28]

[ABCD 13 5 29] [DABC 2 9 30] [CDAB 7 14 31] [BCDA 12 20 32]

/* Round 3. */

/* Let [abcd k s t] denote the operation

a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */

/* Do the following 16 operations. */

[ABCD 5 4 33] [DABC 8 11 34] [CDAB 11 16 35] [BCDA 14 23 36]

[ABCD 1 4 37] [DABC 4 11 38] [CDAB 7 16 39] [BCDA 10 23 40]

[ABCD 13 4 41] [DABC 0 11 42] [CDAB 3 16 43] [BCDA 6 23 44]

[ABCD 9 4 45] [DABC 12 11 46] [CDAB 15 16 47] [BCDA 2 23 48]

/* Round 4. */

/* Let [abcd k s t] denote the operation

a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */

/* Do the following 16 operations. */

[ABCD 0 6 49] [DABC 7 10 50] [CDAB 14 15 51] [BCDA 5 21 52]

[ABCD 12 6 53] [DABC 3 10 54] [CDAB 10 15 55] [BCDA 1 21 56]

[ABCD 8 6 57] [DABC 15 10 58] [CDAB 6 15 59] [BCDA 13 21 60]

[ABCD 4 6 61] [DABC 11 10 62] [CDAB 2 15 63] [BCDA 9 21 64]

Finally, we add the saved word values to the resulting word values. Each addition is performed modulo 2^32, so any carry out of the 32-bit word is discarded.

A = A + AA

B = B + BB

C = C + CC

D = D + DD
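Putting steps 1 through 4 together, the whole algorithm fits in a short Python sketch. It is not the RFC 1321 reference code: instead of writing out the 64 operations statically, it uses a loop with index formulas, (5i+1) mod 16, (3i+5) mod 16 and 7i mod 16, that reproduce the k values in the listing above, and a SHIFTS table holding the s values. The names md5_sketch, rotl and SHIFTS are made up for this example, and input is assumed to be whole bytes.

```python
import math
import struct

MASK = 0xFFFFFFFF
T = [int(4294967296 * abs(math.sin(i + 1))) for i in range(64)]  # T[1..64], stored 0-based
SHIFTS = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
          [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)


def rotl(x, s):
    """Circular left shift of a 32-bit word (the <<< operation)."""
    return ((x << s) | (x >> (32 - s))) & MASK


def md5_sketch(message: bytes) -> str:
    # Steps 1 and 2: pad to 448 bits mod 512, then append the 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += ((len(message) * 8) & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little")

    # Step 3: initialise the four buffers.
    a, b, c, d = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

    # Step 4: process each block of 16 words.
    for offset in range(0, len(padded), 64):
        x = struct.unpack("<16I", padded[offset:offset + 64])
        aa, bb, cc, dd = a, b, c, d
        for i in range(64):
            if i < 16:                                   # round 1, function F
                f, k = (b & c) | (~b & d), i
            elif i < 32:                                 # round 2, function G
                f, k = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:                                 # round 3, function H
                f, k = b ^ c ^ d, (3 * i + 5) % 16
            else:                                        # round 4, function I
                f, k = c ^ (b | ~d), (7 * i) % 16
            total = (a + (f & MASK) + x[k] + T[i]) & MASK
            new_b = (b + rotl(total, SHIFTS[i])) & MASK
            a, b, c, d = d, new_b, b, c                  # rotate the roles of the buffers
        a, b, c, d = ((a + aa) & MASK, (b + bb) & MASK,
                      (c + cc) & MASK, (d + dd) & MASK)

    # Step 5: low-order byte of A first, high-order byte of D last.
    return struct.pack("<4I", a, b, c, d).hex()


if __name__ == "__main__":
    print(md5_sketch(b"abc"))
```

The result for "abc" can be compared with the value in the RFC 1321 test suite, 900150983cd24fb0d6963f7d28e17f72.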

Step 5: Output

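We are finally left with an output of 4 words. A is the low order word, and D is the high order word. The digest is written out beginning with the low-order byte of A and ending with the high-order byte of D. From here, we can simply print the 16 bytes in hex.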
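The byte order matters when printing. As a brief sketch, the helper below (digest_hex is a name invented here) serialises the four words low-order byte first; the sample words are the final A, B, C and D for an empty input, which yield the familiar digest of the empty message.

```python
import struct


def digest_hex(a: int, b: int, c: int, d: int) -> str:
    """Serialise the four buffers as 16 bytes, low-order byte of A first."""
    return struct.pack("<4I", a, b, c, d).hex()


if __name__ == "__main__":
    # Final buffer values after hashing the empty message.
    print(digest_hex(0xD98C1DD4, 0x04B2008F, 0x980980E9, 0x7E42F8EC))
    # prints: d41d8cd98f00b204e9800998ecf8427e
```

Printing each word directly as an 8-digit hex number would give the bytes of every word in reverse order, which is a common mistake.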

T Value Table

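The following table lists the numbers that are to be used in step 4 as the values of T, read left to right and top to bottom from T[1] to T[64]. The formula for the values is: the integer part of 4294967296 times abs(sin(i)), where i is the number of the element, and is in radians.

D76AA478 E8C7B756 242070DB C1BDCEEE F57C0FAF 4787C62A A8304613 FD469501

698098D8 8B44F7AF FFFF5BB1 895CD7BE 6B901122 FD987193 A679438E 49B40821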

F61E2562 C040B340 265E5A51 E9B6C7AA D62F105D 02441453 D8A1E681 E7D3FBC8

21E1CDE6 C33707D6 F4D50D87 455A14ED A9E3E905 FCEFA3F8 676F02D9 8D2A4C8A

FFFA3942 8771F681 6D9D6122 FDE5380C A4BEEA44 4BDECFA9 F6BB4B60 BEBFBC70

289B7EC6 EAA127FA D4EF3085 04881D05 D9D4D039 E6DB99E5 1FA27CF8 C4AC5665

F4292244 432AFF97 AB9423A7 FC93A039 655B59C3 8F0CCC92 FFEFF47D 85845DD1

6FA87E4F FE2CE6E0 A3014314 4E0811A1 F7537E82 BD3AF235 2AD7D2BB EB86D391

Technical Vulnerabilities

Currently, there is only one simple vulnerability with MD5. It is based on the ability to change just a few bits of a set of seemingly random data without changing the resulting hash. Two sets of data that produce the same MD5 hash are called a 'collision'. These collisions can be used in a few different ways, which are noted later.

There are a number of ways to generate these collisions. The most recent of them is called “Tunneling”, by Vlastimil Klima. In his paper, he describes a method to find these collisions in under a minute. A 3.2 GHz Pentium 4 is capable of finding these collisions in 17 seconds on average. According to Klima, the method also applies to other hashing algorithms, including SHA-1.

To date, there have been no quick attacks against a specific hash. That is, one cannot efficiently construct an input that produces a given hash. However, collisions do allow for other attacks.

Signature attack with a hidden message

One specific attack is used against signed documents. To be carried out, the attack requires a document format that is itself a programming language, such as PostScript. The attack starts by finding a colliding pair of prefixes, A and B, for the file. Since MD5(A) == MD5(B) in a collision, and A and B are the same length and end on a 512-bit block boundary, MD5(A + M) == MD5(B + M). This means that we can append anything and the MD5 will be the same, so long as the appended text is the same. In the prefix, we would set a variable to one of the two colliding values. Later, in the appended text, we would check whether the variable holds the first of the two colliding values. If so, the document outputs one thing. Otherwise, it outputs the second thing.

For example, you would have 2 messages within the body of the document. One would state 'thank you for your contributions', and the second would state that you should be given full access to all resources. You would then have someone trusted, such as the security manager or another authority, sign the document while seeing the first of the two messages. Then, you would swap the colliding value to trigger the second of the two messages. The recipient would see the message stating you should be given full access, with a valid signature.
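The branching trick can be mimicked in Python. This is a toy model only: BLOB_A and BLOB_B are placeholders standing in for a real colliding pair (they do NOT collide under MD5), and render is a hypothetical stand-in for the interpreter that displays the document.

```python
# Placeholder 64-byte blobs. In the real attack these would be two
# different blocks with the same MD5; these values are NOT a collision.
BLOB_A = b"\x00" * 64
BLOB_B = b"\x01" + b"\x00" * 63

HARMLESS = "Thank you for your contributions."
MALICIOUS = "The bearer should be given full access to all resources."

# The appended text is identical in both documents and carries both messages.
SUFFIX = (HARMLESS + "|" + MALICIOUS).encode()


def render(document: bytes) -> str:
    """Show one of the two embedded messages, depending on the prefix blob."""
    blob, body = document[:64], document[64:].decode()
    first, second = body.split("|")
    return first if blob == BLOB_A else second


doc_shown_to_signer = BLOB_A + SUFFIX
doc_sent_to_recipient = BLOB_B + SUFFIX

if __name__ == "__main__":
    print(render(doc_shown_to_signer))
    print(render(doc_sent_to_recipient))
```

With a genuine colliding pair in place of the placeholders, both documents would have the same MD5, so a signature over one would also verify for the other.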

X.509 Certificate Signature attack

A second attack against MD5 involves X.509 certificates. The method was described in 2005. First, one creates all of the Certificate Signing Request except the public key. The data before the public key modulus must end on a 64-byte boundary. Adding some information after the Distinguished Name serves to pad out the data to that boundary. Also, the modulus and the public exponent must each have a fixed byte length. MD5 is run over this data, and its intermediate output becomes the IV for the next section. Since the data ends exactly on a 512-bit MD5 block boundary, no padding is applied at this point, which is what allows the output to be used as an IV for the next section. Remember, the certificate thus far must be X.509 compatible, otherwise it will be rejected by the certificate authority. Next, we can use either the method of Xiaoyun Wang et al. or the tunneling method to create 2 different but similar messages that produce the same MD5 from that initialization vector. In the demonstration, the public exponent was 65537. This number must be the same for both certificates. Next, primes p1, q1, p2 and q2 are found with the help of a single, common value that is appended to each of the colliding values. This yields 2 public keys with the same MD5, as well as 2 separate private keys. What this means is that the 2 certificates can carry the same signature.
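The alignment requirement is simple arithmetic. As a rough sketch, the helper below computes how many filler bytes would bring the data before the modulus up to a 64-byte boundary; filler_needed is a name invented here, and the real attack must also respect the certificate encoding, which this ignores.

```python
BLOCK_BYTES = 64  # one MD5 block is 512 bits


def filler_needed(prefix_length: int) -> int:
    """Bytes to add so that prefix_length becomes a multiple of 64."""
    return -prefix_length % BLOCK_BYTES


if __name__ == "__main__":
    for length in (64, 100, 191):
        print(length, "+", filler_needed(length), "=", length + filler_needed(length))
```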

Conclusions

Although there have been 2 attacks against MD5, it is my belief that MD5 is still a completely valid

algorithm. Both of these attacks are very case specific.

For instance, the attack against Alice's boss requires pre-existing malicious code to be inserted before the signature is taken. Signing other kinds of documents, such as a PDF, makes it impossible to carry out such an attack. In the case of signing code using the same method, the attacker would have to implement the malicious code upstream, which would require a code maintainer to sign off on the change. At that point, they may as well just add a separate command line switch, or check whether a file exists to trigger the attack, as it would be much simpler.

In the case of the X.509 certificates, the attack in fact has the opposite effect. First, the original certificate owner must perform the attack before the certificate is signed, very much like the Alice's boss type of attack. However, if a duplicate signature is ever found, one can use the same method described by Lenstra, A. et al. to reverse engineer the certificate and recover the original private key and modulus. This, in effect, defeats the security behind the certificate in the first place, and leaves the original attacker very open to an attack back on them.

To date, there has been no attack against MD5 that creates files with identical MD5s even though their contents vastly differ. Alice's boss types of attacks require an attacker to prepare the malicious code, and the signer to forgo due diligence in checking the contents. The X.509 certificate attack simply allows 2 different, but similar, certificates to be created. However, if it is not known that a second certificate exists, then it is impossible for a relying party to know if the receiving party is the actual recipient. Of course, the attacker would have needed to give the certificate to the other party, and if that is the case, then the attacker may as well just give the data to the other party freely as well.

In my opinion, both of these attacks require a breakdown in other systems to become feasible. The only MD5 attack that is feasible is against passwords, where brute forcing becomes a possibility. Without any current method to 'pick' a resultant MD5, whether based on an IV or not, MD5 is still a valid method for verifying data.

References

Rivest, R., “The MD5 Message-Digest Algorithm”, RFC 1321, MIT and RSA Data Security, Inc., April 1992

Klima, V., “Tunnels in Hash Functions: MD5 Collisions Within a Minute”, April 2006

Daum, M. & Lucks, S., “Attacking Hash Functions by Poisoned Messages: The Story of Alice and her Boss”, http://www.cits.rub.de/MD5Collisions/, June 2005

Lenstra, A., Wang, X., de Weger, B., “Colliding X.509 Certificates”, March 2005
