MD5
From Hak5
MD5 or Message-Digest algorithm 5 is a widely-used cryptographic hash function with a 128-bit hash value. As an Internet standard, MD5 has been employed in a wide variety of security applications, and is also commonly used to check the integrity of files. MD5 is an exaple of one-way cryptography.
Contents |
Cryptography
--spektormax MD5 processes a variable length message into a fixed-length output of 128 bits. First the Input is split up into 512bit chunks. The last chuck is padded up to 512 bits. First a single 1 bit is appended to the input, then 0's are appended until the messages length is 448bits next a 64-bit integer representing the length of the original message is appended to the message. If the Message's length is greater than 64-bit's, the 64 most significant bits are left.
The main MD5 algorithm operates on a 128-bit state, divided into four 32-bit words, denoted A, B, C and D. The main algorithm then operates on each 512-bit message block in turn, each block modifying the state. The processing of a message block consists of four similar stages, termed rounds; each round is composed of 16 similar operations based on non-linear functions F,G,H,I described below.
F(X,Y,Z) = (X AND Y}) OR (NOT(X) AND Z)
G(X,Y,Z) = (X AND Z}) OR (Y AND NOT(Z))
H(X,Y,Z) = X XOR Y XOR Z
I(X,Y,Z) = Y XOR (X OR NOT(Z))
First A,B,C,D are set to fixed constants as shown below in Hex (base-16).
A = 0x67452301 B= 0xEFCDAB89 C= 0x98BADCFE D= 0x10325476
We also create a 4 by 16 array of the following numbers and call it r:
r[ 0..15] = {7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22}
r[16..31] = {5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20}
r[32..47] = {4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23}
r[48..63] = {6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21}
Note that r is random so it has to be written out in full, there is a second array k which can be calulated as follows: k[i] := floor(abs(sin(i + 1)) × 2^32) i being between 0 and 63 (again its for the 64 operations rember computers count from 0 :)) Next each 512-bit chuck is further broken down into 16 32-bit chunks. Next 64 total operations are performed (4 rounds X 16 32-bit chunks). for the first round f = (b and c) or ((not b) and d) we then set g to number of the operation (1-64) for each of the 64 total operation we do temp storage = d d = c c = b b = a + f + k[operation nummber] + g we then rotate b r[operation number] bits to the left and add that to the original value of b and store it in b a = temp storage of d this is repeated for all 16 chuncks. Next round 2 is performed 16 times once on each chunck as follows: f = (d and b) or ((not d) and c) g = (5×operation_number + 1) modulus 16 again we perform the same operation as we did at the end of each of the 16 chunks in round 1 (temp storage = d,d = c...) Next for round 3 we do: f = b xor c xor d g = (3×operation_number + 5) modulus 16 and again after each operation we do the whole shift stage. Round 4: f = c xor (b or (not d)) g = (7×operation_number) modulus 16 yet again we do the shift. each time we do an operation we add its results to the sums of the previous operations (we modulus everything by 2^32) we then append a to b to c to d and there by get our MD5 hash value.
Sample code
(thanks to wikipedia)
//Note: All variables are unsigned 32 bits and wrap modulo 2^32 when calculating
//Define r as the following
var int[64] r, k
r[ 0..15] := {7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22}
r[16..31] := {5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20}
r[32..47] := {4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23}
r[48..63] := {6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21}
//Use binary integer part of the sines of integers as constants:
for i from 0 to 63
k[i] := floor(abs(sin(i + 1)) × 2^32)
//Initialize variables:
var int h0 := 0x67452301
var int h1 := 0xEFCDAB89
var int h2 := 0x98BADCFE
var int h3 := 0x10325476
//Pre-processing:
append "1" bit to message
append "0" bits until message length in bits ≡ 448 (mod 512)
append bit length of message as 64-bit little-endian integer to message
//Process the message in successive 512-bit chunks:
for each 512-bit chunk of message
break chunk into sixteen 32-bit little-endian words w(i), 0 ≤ i ≤ 15
//Initialize hash value for this chunk:
var int a := h0
var int b := h1
var int c := h2
var int d := h3
//Main loop:
for i from 0 to 63
if 0 ≤ i ≤ 15 then
f := (b and c) or ((not b) and d)
g := i
else if 16 ≤ i ≤ 31
f := (d and b) or ((not d) and c)
g := (5×i + 1) mod 16
else if 32 ≤ i ≤ 47
f := b xor c xor d
g := (3×i + 5) mod 16
else if 48 ≤ i ≤ 63
f := c xor (b or (not d))
g := (7×i) mod 16
temp := d
d := c
c := b
b := ((a + f + k[i] + w(g)) leftrotate r[i]) + b
a := temp
//Add this chunk's hash to result so far:
h0 := h0 + a
h1 := h1 + b
h2 := h2 + c
h3 := h3 + d
var int digest := h0 append h1 append h2 append h3 //(expressed as little-endian)
Cracking MD5
Methods
The main weakness is the it's length. With only 2^128 possible combinations of hashes it becomes weak when it repersents several 100 megabyte files. This is because for every 1 hash there are file_size divided by 128 number of possible files that could repersent the same hash. It is thereby crackable based on the Birthday_Paradox. The idea is that if you were to keep trying to hash values, eventually you would get a collision where 2 seemingly different inputs give you the same hash. This is also true because no matter the size of the input, the algarithum treats it as 1 512 bit input. Example of collision: Suppose the attacker (Charlie) discovers that the message "I, Bob, agree to pay Charlie $ 5000.00 on 4/12/2005." has the same hash as "I, Bob, agree to pay Charlie $18542841.54 on 9/27/2012." Charlie could then try to get Bob (the victim) to digitally sign the first message (e.g., by purchasing $5000 of goods). Charlie would then claim that Bob actually signed the second message, and "prove" this assertion by showing that Bob's signature matches the second message.
From the Web
- Milw0rn.com: Unknow database size.
- GData Online: Huge database and average hash lookup of 0.04 seconds.
- Fast MD5 Collision Generator


