Hak5
Save 10% at GoDaddy.com with coupon code HAK

MD5

From Hak5

Jump to: navigation, search


MD5 or Message-Digest algorithm 5 is a widely-used cryptographic hash function with a 128-bit hash value. As an Internet standard, MD5 has been employed in a wide variety of security applications, and is also commonly used to check the integrity of files. MD5 is an exaple of one-way cryptography.

Contents

Cryptography

--spektormax MD5 processes a variable length message into a fixed-length output of 128 bits. First the Input is split up into 512bit chunks. The last chuck is padded up to 512 bits. First a single 1 bit is appended to the input, then 0's are appended until the messages length is 448bits next a 64-bit integer representing the length of the original message is appended to the message. If the Message's length is greater than 64-bit's, the 64 most significant bits are left.

The main MD5 algorithm operates on a 128-bit state, divided into four 32-bit words, denoted A, B, C and D. The main algorithm then operates on each 512-bit message block in turn, each block modifying the state. The processing of a message block consists of four similar stages, termed rounds; each round is composed of 16 similar operations based on non-linear functions F,G,H,I described below.

    F(X,Y,Z) = (X AND Y}) OR (NOT(X) AND Z)
    G(X,Y,Z) = (X AND Z}) OR (Y AND NOT(Z))
    H(X,Y,Z) = X XOR Y XOR Z
    I(X,Y,Z) = Y XOR (X OR NOT(Z))
    

First A,B,C,D are set to fixed constants as shown below in Hex (base-16).

A = 0x67452301
B= 0xEFCDAB89
C= 0x98BADCFE
D= 0x10325476

We also create a 4 by 16 array of the following numbers and call it r:

r[ 0..15] = {7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22} 
r[16..31] = {5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20}
r[32..47] = {4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23}
r[48..63] = {6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21}

Note that r is random so it has to be written out in full, there is a second array k which can be calulated as follows: k[i] := floor(abs(sin(i + 1)) × 2^32) i being between 0 and 63 (again its for the 64 operations rember computers count from 0 :)) Next each 512-bit chuck is further broken down into 16 32-bit chunks. Next 64 total operations are performed (4 rounds X 16 32-bit chunks). for the first round f = (b and c) or ((not b) and d) we then set g to number of the operation (1-64) for each of the 64 total operation we do temp storage = d d = c c = b b = a + f + k[operation nummber] + g we then rotate b r[operation number] bits to the left and add that to the original value of b and store it in b a = temp storage of d this is repeated for all 16 chuncks. Next round 2 is performed 16 times once on each chunck as follows: f = (d and b) or ((not d) and c) g = (5×operation_number + 1) modulus 16 again we perform the same operation as we did at the end of each of the 16 chunks in round 1 (temp storage = d,d = c...) Next for round 3 we do: f = b xor c xor d g = (3×operation_number + 5) modulus 16 and again after each operation we do the whole shift stage. Round 4: f = c xor (b or (not d)) g = (7×operation_number) modulus 16 yet again we do the shift. each time we do an operation we add its results to the sums of the previous operations (we modulus everything by 2^32) we then append a to b to c to d and there by get our MD5 hash value.

Sample code

(thanks to wikipedia)

//Note: All variables are unsigned 32 bits and wrap modulo 2^32 when calculating

//Define r as the following
var int[64] r, k
r[ 0..15] := {7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22} 
r[16..31] := {5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20}
r[32..47] := {4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23}
r[48..63] := {6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21}

//Use binary integer part of the sines of integers as constants:
for i from 0 to 63
    k[i] := floor(abs(sin(i + 1)) × 2^32)

//Initialize variables:
var int h0 := 0x67452301
var int h1 := 0xEFCDAB89
var int h2 := 0x98BADCFE
var int h3 := 0x10325476

//Pre-processing:
append "1" bit to message
append "0" bits until message length in bits ≡ 448 (mod 512)
append bit length of message as 64-bit little-endian integer to message

//Process the message in successive 512-bit chunks:
for each 512-bit chunk of message
    break chunk into sixteen 32-bit little-endian words w(i), 0 ≤ i ≤ 15

    //Initialize hash value for this chunk:
    var int a := h0
    var int b := h1
    var int c := h2
    var int d := h3

    //Main loop:
    for i from 0 to 63
        if 0 ≤ i ≤ 15 then
            f := (b and c) or ((not b) and d)
            g := i
        else if 16 ≤ i ≤ 31
            f := (d and b) or ((not d) and c)
            g := (5×i + 1) mod 16
        else if 32 ≤ i ≤ 47
            f := b xor c xor d
            g := (3×i + 5) mod 16
        else if 48 ≤ i ≤ 63
            f := c xor (b or (not d))
            g := (7×i) mod 16
 
        temp := d
        d := c
        c := b
        b := ((a + f + k[i] + w(g)) leftrotate r[i]) + b
        a := temp

    //Add this chunk's hash to result so far:
    h0 := h0 + a
    h1 := h1 + b 
    h2 := h2 + c
    h3 := h3 + d

var int digest := h0 append h1 append h2 append h3 //(expressed as little-endian)

Cracking MD5

Methods

The main weakness is the it's length. With only 2^128 possible combinations of hashes it becomes weak when it repersents several 100 megabyte files. This is because for every 1 hash there are file_size divided by 128 number of possible files that could repersent the same hash. It is thereby crackable based on the Birthday_Paradox. The idea is that if you were to keep trying to hash values, eventually you would get a collision where 2 seemingly different inputs give you the same hash. This is also true because no matter the size of the input, the algarithum treats it as 1 512 bit input. Example of collision: Suppose the attacker (Charlie) discovers that the message "I, Bob, agree to pay Charlie $ 5000.00 on 4/12/2005." has the same hash as "I, Bob, agree to pay Charlie $18542841.54 on 9/27/2012." Charlie could then try to get Bob (the victim) to digitally sign the first message (e.g., by purchasing $5000 of goods). Charlie would then claim that Bob actually signed the second message, and "prove" this assertion by showing that Bob's signature matches the second message.

From the Web

On Your Box