# Lecture 5: Cryptographic Hashes

Outline
- definition
- properties
- uses
  - authentication
  - encryption (stream cipher)
  - integrity protection
  - passwords
- hash example: MD2
- other hash algorithms

Definition and Properties
- cryptographic hash (message digest): a function that maps an arbitrary-length input to a fixed-length output (called the hash or digest)
- hash properties
  - one-way: computationally infeasible to find an input that produces a particular hash value
  - pseudorandom: an intruder should not be able to deduce information about the input from the hash
  - collision resistant: computationally infeasible to find two inputs that generate the same hash
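
The "arbitrary-length input, fixed-length output" part of the definition can be observed directly. The sketch below uses SHA-256 from Python's `hashlib` as a stand-in hash (the slides do not name one here); `digest_hex` is a helper name made up for this example.

```python
import hashlib


def digest_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Inputs of very different lengths all map to a 32-octet (64 hex digit) digest.
inputs = [b"", b"a", b"a" * 1_000_000]
digests = [digest_hex(m) for m in inputs]
for message, digest in zip(inputs, digests):
    print(len(message), digest)
```

The digest length depends only on the algorithm, never on the input length.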

Pseudorandomness in Detail
- each hash value seen in practice should have about 1/2 of its bits on
- changing one bit of the input should change about 1/2 of the output bits (unpredictable which)
- two outputs should be uncorrelated, regardless of how closely related the inputs are
- any subset of the output bits should be a good hash
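
The "one input bit changes about half the output bits" property can be checked empirically. This is a minimal sketch, again assuming SHA-256 as the hash; `flip_bit` and `bit_difference` are helper names invented for the illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with one bit flipped (index 0 = top bit of octet 0)."""
    out = bytearray(data)
    out[index // 8] ^= 0x80 >> (index % 8)
    return bytes(out)


message = b"Lecture 5: Cryptographic Hashes"
h1 = hashlib.sha256(message).digest()
h2 = hashlib.sha256(flip_bit(message, 0)).digest()
changed = bit_difference(h1, h2)
print(f"{changed} of {len(h1) * 8} output bits changed")
```

For a well-behaved hash the printed count should be close to half of the 256 output bits, and which bits change should look unpredictable.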

Collision Resistance in Detail
- Birthday Problem (“paradox”): when about √N elements (more precisely, about 1.18·√N) are chosen randomly from a domain of size N, the probability of a collision is above 50%
- how many people do you need so that, with probability more than 50%, at least one pair shares a birthday?
  - 23 or more
  - the answer is computed by inverting the problem: what is the probability that no two people share a birthday?
  - total possibilities for n people: $365^n$; with no shared birthday, the first person can pick any of 365 days, the second 364, the third 363, etc.
  - so for n people the probability of no shared birthday is $\frac{365!}{365^n\,(365-n)!}$
  - for n ≥ 23 the complement (at least one shared birthday) is greater than 50%
- why is collision resistance necessary?
  - if the intruder is able to pick the texts to match, his task is simplified by the birthday paradox: for a k-bit hash, about $2^{k/2}$ tries are expected to produce a collision, rather than about $2^k$
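
The birthday calculation is easy to verify numerically. The sketch below multiplies out the probability that n people all have distinct birthdays and takes the complement; the function name and the 365-day default are choices made for this example.

```python
def p_shared_birthday(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday."""
    p_none = 1.0
    for i in range(n):
        # Person i must avoid the i birthdays already taken.
        p_none *= (days - i) / days
    return 1.0 - p_none


for n in (10, 22, 23, 30):
    print(n, round(p_shared_birthday(n), 4))
```

The probability crosses 50% between 22 and 23 people, which is close to 1.18·√365 ≈ 22.5.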

Hash Uses
- sign the hash (digest) instead of the message
- store digests of files to look for changes (e.g., viruses); Tripwire does this
  - why wouldn’t a CRC work?
- with a secret, can do anything a secret-key algorithm can do (authenticate, encrypt, integrity-protect)
- irreversible password hash database
  - why must it be irreversible?
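
The file-digest use can be sketched as follows. This is only an illustration of the idea, not how Tripwire is implemented: files are modelled as an in-memory mapping from name to contents, SHA-256 is an assumed hash, and `snapshot` and `changed_files` are hypothetical helper names.

```python
import hashlib


def snapshot(files: dict[str, bytes]) -> dict[str, str]:
    """Record a digest for every file (name -> hex digest)."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def changed_files(baseline: dict[str, str], files: dict[str, bytes]) -> list[str]:
    """Return names of files that are new or whose digest differs from the baseline."""
    current = snapshot(files)
    return sorted(name for name in current if baseline.get(name) != current[name])


files = {"login": b"original binary", "ls": b"another binary"}
baseline = snapshot(files)
files["login"] = b"original binary + virus"
print(changed_files(baseline, files))
```

The baseline itself has to be stored where an intruder cannot rewrite it, otherwise the digests can simply be recomputed after tampering.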

Authentication with Hash
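- how was authentication with secret-key cryptography done?
- with a hash: Alice and Bob both know the secret K
  - Alice → Bob: I’m Alice
  - Bob → Alice: R (a challenge)
  - Alice → Bob: hash(R || K)
  - Bob computes hash(R || K) himself and compares it with Alice’s reply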
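
The challenge-response exchange can be sketched in a few lines. The assumptions here are SHA-256 as the hash, a 16-octet random challenge, and the function names, none of which come from the slides.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Bob's side: pick a fresh random challenge R."""
    return secrets.token_bytes(16)


def respond(challenge: bytes, key: bytes) -> bytes:
    """Alice's side: compute hash(R || K)."""
    return hashlib.sha256(challenge + key).digest()


def verify(challenge: bytes, response: bytes, key: bytes) -> bool:
    """Bob's side: recompute hash(R || K) and compare in constant time."""
    return hmac.compare_digest(response, respond(challenge, key))


shared_key = b"secret shared by Alice and Bob"
r = make_challenge()
print(verify(r, respond(r, shared_key), shared_key))   # genuine Alice
print(verify(r, respond(r, b"wrong key"), shared_key))  # impostor
```

A fresh R for every run is what stops an eavesdropper from replaying an old response.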

Stream Cipher with Hash
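- create a pad; first send the IV in the clear
  - b1 = hash(K || IV)
  - b2 = hash(K || b1)
  - bi = hash(K || b(i-1))
  - each ciphertext block is the plaintext block XORed with the pad block: ci = pi ⊕ bi
- note: with an IV, Alice can precompute the pad, but Bob can’t (he has to receive the IV first)
- can mix the plaintext into pad generation: lose the pre-computation capability, gain (some) integrity protection
  - b1 = hash(K || IV), c1 = p1 ⊕ b1
  - b2 = hash(K || c1), c2 = p2 ⊕ b2
  - bi = hash(K || c(i-1)), ci = pi ⊕ bi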
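
The first pad construction, where each pad block is the hash of the key and the previous pad block, can be sketched as below. SHA-256 is an assumed hash, so each pad block is 32 octets; `pad_blocks` and `xor_with_pad` are names invented here, and the code is an illustration rather than a cipher to deploy.

```python
import hashlib

BLOCK = 32  # SHA-256 digest size in octets


def pad_blocks(key: bytes, iv: bytes, n_blocks: int):
    """Yield pad blocks: the first is hash(K || IV), each later one is hash(K || previous block)."""
    b = iv
    for _ in range(n_blocks):
        b = hashlib.sha256(key + b).digest()
        yield b


def xor_with_pad(key: bytes, iv: bytes, data: bytes) -> bytes:
    """XOR data with the pad; the same call encrypts and decrypts."""
    n_blocks = (len(data) + BLOCK - 1) // BLOCK
    pad = b"".join(pad_blocks(key, iv, n_blocks))
    return bytes(d ^ p for d, p in zip(data, pad))


key, iv = b"shared secret K", b"fresh IV for this message"
ciphertext = xor_with_pad(key, iv, b"attack at dawn, bring coffee for everyone on the hill")
print(xor_with_pad(key, iv, ciphertext))
```

Because the pad depends only on K and the IV, the sender can generate it before the plaintext exists; reusing an IV with the same key would reuse the pad and must be avoided.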

Integrity Protection with Hash
- MAC (again): message authentication code, used to protect the integrity of a message
- can we just hash the message (without using a key) to produce the MAC?
  - no: anyone who knows the hash algorithm can compute the hash of a modified message
- approaches to hash-based MAC
  - prefix: MAC_K(x) = H(K || x)
    - not secure: extension attack
    - hashes are usually computed by repeatedly hashing blocks and combining each with the previously computed value, so an intruder can append to the message and continue the computation without knowing the key
  - suffix: MAC_K(x) = H(x || K)
    - mostly ok; problematic if H is not collision resistant: two messages with the same hash will have the same MAC. Why?
    - because the key is just appended to the message: see the argument for the extension attack
  - envelope: MAC_K(x) = H(K1 || x || K2)
  - HMAC: MAC_K(x) = H(K1 || H(K2 || x)), with K1 and K2 derived from K
    - provably secure; slower; popular in Internet standards
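
To make the nested construction concrete, here is a sketch of HMAC as specified in RFC 2104, instantiated with SHA-256 (an assumption; the slides do not fix a hash). In RFC 2104 the two keys are the padded key XORed with the constants 0x5C (outer) and 0x36 (inner), and the inner hash takes its key before the message. The insecure prefix MAC is included for comparison; the function names are made up for the example.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 input block size in octets


def prefix_mac(key: bytes, msg: bytes) -> bytes:
    """H(K || x): shown for comparison only, open to the extension attack."""
    return hashlib.sha256(key + msg).digest()


def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    """HMAC per RFC 2104: H((K xor opad) || H((K xor ipad) || x))."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")    # then zero-padded to one block
    inner_key = bytes(k ^ 0x36 for k in key)
    outer_key = bytes(k ^ 0x5C for k in key)
    inner = hashlib.sha256(inner_key + msg).digest()
    return hashlib.sha256(outer_key + inner).digest()


key, msg = b"shared key", b"transfer 100 to Bob"
print(hmac_sha256(key, msg) == hmac.new(key, msg, hashlib.sha256).digest())
```

In practice the standard library's `hmac` module should be used directly; the hand-written version is only there to show the two nested hash calls.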

Unix Password Hash
- the hash is used only one way, for authentication
- DES-like; plain DES is not used, to prevent hardware-based DES encoders from being used in password guessing
- the password is converted to a DES key: the first 8 7-bit ASCII characters of the password are used to create a 56-bit key
- the key is used to encrypt the number 0
- problem: the same passwords hash to the same value (dictionary attack possible)
- solution: use a salt, an arbitrary 12-bit value
  - the salt controls which bits are duplicated in R at every DES round
  - the salt is stored in the clear together with the hash
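
Two of the ideas above can be sketched without implementing DES: packing the first 8 characters into a 56-bit key, and using a 12-bit salt so that equal passwords no longer produce equal stored values. This is not the real `crypt` algorithm. SHA-256 over salt plus password stands in for the salted DES variant, the exact bit order of the key packing is an assumption, and all function names are hypothetical.

```python
import hashlib
import secrets


def password_to_key56(password: str) -> int:
    """Pack the first 8 characters, 7 bits each, into a 56-bit integer."""
    chars = password.encode("ascii")[:8].ljust(8, b"\x00")
    key = 0
    for c in chars:
        key = (key << 7) | (c & 0x7F)
    return key


def new_salt() -> int:
    """Pick an arbitrary 12-bit salt."""
    return secrets.randbelow(1 << 12)


def salted_hash(password: str, salt: int) -> str:
    """Stand-in for the salted password hash (NOT the DES-based crypt)."""
    data = salt.to_bytes(2, "big") + password.encode("ascii")
    return hashlib.sha256(data).hexdigest()


# Same password, different salts: the stored entries differ.
entry_a = (1, salted_hash("hunter2", 1))
entry_b = (2, salted_hash("hunter2", 2))
print(entry_a[1] != entry_b[1])
```

With a 12-bit salt, a precomputed dictionary has to be built 4096 times over, once per possible salt value.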

How to deal with passwords longer than 8 characters
- could ignore all but the first 8 characters (done in old Unixes)
- typical: store crypt(1st 8 bytes), crypt(2nd 8 bytes)
  - what’s wrong with this?
  - if the second half is short, an attacker can break it and then try guessing the first half

MD2: Outline
- takes an arbitrary message, operates on octets, and produces a 128-bit (16-octet) digest
- steps
  - input the message, break it into octets, pad to a multiple of 16 octets
  - compute a 16-octet checksum and append it to the message
  - final pass: compute the digest
- these three steps can be done in one pass
- very limited memory requirements: can be done on a resource-constrained machine
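
The padding step can be sketched as follows. The slides only say "pad to a multiple of 16 octets"; the specific rule used here (append i octets, each of value i, with i between 1 and 16) is taken from RFC 1319, and `md2_pad` is a name chosen for this example.

```python
def md2_pad(message: bytes) -> bytes:
    """Append i octets of value i so the length becomes a multiple of 16 (1 <= i <= 16)."""
    pad_len = 16 - (len(message) % 16)
    return message + bytes([pad_len]) * pad_len


for m in (b"", b"abc", b"a" * 16):
    padded = md2_pad(m)
    print(len(m), len(padded), padded[-1])
```

Padding is always added, even when the message length is already a multiple of 16, so the padding can be removed unambiguously.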

MD2: Checksum Calculation
- the checksum is an intermediate 16-octet value appended to the message before the final digest calculation
- the checksum is computed one padded message octet at a time
- the current octet of the message is:
  - XORed with the previous octet of the checksum
  - the result is substituted according to a fixed octet substitution table (π-substitution)
  - the result is XORed with the current value of the checksum octet and stored
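
The checksum loop can be sketched as below, following the structure of the RFC 1319 reference code. One important assumption: the real MD2 substitution table (256 fixed values derived from the digits of π) is not reproduced here, so `PLACEHOLDER_SBOX` is an arbitrary permutation used only to make the sketch runnable. The output is therefore not a real MD2 checksum.

```python
# Placeholder permutation of 0..255, NOT the real MD2 table from RFC 1319.
PLACEHOLDER_SBOX = bytes((167 * i + 13) % 256 for i in range(256))


def md2_checksum(padded: bytes, sbox: bytes = PLACEHOLDER_SBOX) -> bytes:
    """Compute the 16-octet checksum over an already padded message."""
    if len(padded) % 16 != 0:
        raise ValueError("input must be padded to a multiple of 16 octets")
    checksum = bytearray(16)
    last = 0  # the most recently written checksum octet
    for start in range(0, len(padded), 16):
        for j in range(16):
            # XOR message octet with previous checksum octet, substitute,
            # then XOR the result into the current checksum octet.
            checksum[j] ^= sbox[padded[start + j] ^ last]
            last = checksum[j]
    return bytes(checksum)


print(md2_checksum(bytes(16)).hex())
```

Passing the table as a parameter means the genuine table can be supplied without changing the loop.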

MD2: Final Pass
- the padded message with checksum is processed one 16-octet block at a time
- each time a 48-octet value is computed as: message digest || current message block || XOR of the two
- 18 passes are made over this value
  - each pass: the current octet is XORed with a π-substitution of the previous octet
  - the “octet −1” that precedes octet 0 is 0 on the first pass; on later passes it is (octet 47 of the previous pass + that pass’s number, counting from 0) mod 256
- after 18 passes, the first 16 octets are used as the message digest for the next 16-octet block of the message; after the last block they are the final digest
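
The final pass can be sketched in the same spirit, following the loop structure of the RFC 1319 reference code. As a loudly stated assumption, `PLACEHOLDER_SBOX` is again an arbitrary permutation and not the real π-derived MD2 table, so the result is not a genuine MD2 digest. The function names are invented for the example.

```python
# Placeholder permutation of 0..255, NOT the real MD2 table from RFC 1319.
PLACEHOLDER_SBOX = bytes((167 * i + 13) % 256 for i in range(256))


def md2_compress(digest: bytes, block: bytes, sbox: bytes = PLACEHOLDER_SBOX) -> bytes:
    """Mix one 16-octet block into the 16-octet running digest."""
    if len(digest) != 16 or len(block) != 16:
        raise ValueError("digest and block must both be 16 octets")
    # 48-octet value: digest || block || (digest XOR block)
    x = bytearray(digest + block + bytes(d ^ b for d, b in zip(digest, block)))
    t = 0  # "octet -1" for the first pass
    for pass_no in range(18):
        for k in range(48):
            x[k] ^= sbox[t]  # XOR with substitution of the previous octet
            t = x[k]
        t = (t + pass_no) % 256  # octet 47 plus pass number feeds the next pass
    return bytes(x[:16])


def md2_final_pass(data: bytes, sbox: bytes = PLACEHOLDER_SBOX) -> bytes:
    """Run the compression over padded-message-plus-checksum, 16 octets at a time."""
    digest = bytes(16)  # the digest starts as 16 zero octets
    for start in range(0, len(data), 16):
        digest = md2_compress(digest, data[start:start + 16], sbox)
    return digest


print(md2_final_pass(bytes(32)).hex())
```

Only the 48-octet working value and the table are needed at any time, which is why the memory requirements are so small.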

History of Hash Algorithms
- MD: proprietary, never published, not widely used
- MD2: first public algorithm, oriented towards 8-bit processing, little memory, good for embedded devices
- MD3: never published, immediately superseded by MD4
- MD4: runs faster than MD2, uses 32-bit operations, became suspect
- MD5: slightly slower, more conservative
- SHA-1: NIST standard, similar to MD5, even more conservative
- eventually MD2 and MD4 were “broken”: two messages with the same hash were found
- the MDs produce 128-bit digests, SHA-1 a 160-bit digest