# CT30A8800 Secured communications

Hash and MAC

## Oneway hash functions

• Generates from undetermined size message M, a fixed size hash value h.
• $h=H(M)$
• Properties
• It is easy to calculate h from M
• It is practicly impossible to calculate M from h.
• Collision resistance
• It is hard to find M' so that $H(M)=H(M')$
• It is hard to find such M and M' that $H(M)=H(M')$

## Birthday attack

• How many persons is needed in the room in order to have over 50% chance that someone in room has same birthday than you do? A: 253
• How many persons has to be in the room so that it is over 50% chance that two persons have same birthday? 23
• 23 persons form 253 different kind of pairs (1+2+3+…+22 = 253)
• for m-bit hash $2^{m}$generated messages is required in order to find out message that has same hash than message M.
• Computer that can calculate 1 million hashes in second, will take average of 600 000 years to find a message which has same 64-bit hash than M.
• For m-nbit hash $2^{m/2}$ generated messages are required in order to find two different messages that produce same hash value.
• Computer that can calculate 1 million hashes in second, will take average of one hour in order to find two messages that has same 64-bit hash.
• When birthday attack is feared, hash size should be doubled. (usually is)

## Example attack

1. A creates two contract documents. One is beneficial to B and the other forces B to bankruptcy.
2. A makes small changes two the contracts e.g. adds spaces before line feed and then generates huge amount of different variations of these two documents.
3. A calculates hash values of these documents until she finds two documents that has same hash. A stores these two documents.
1. From these documents A sends the document beneficial to B, for signing. (the protocol where hash value is signed is used.)
2. Later A switches the documents and claims in court that B signed the document, which drives B to bankrupt.
• Attack can be prevented by making small change (e.g. adding space) in the document before calculating hash and signing it.

## General model for calculating hash

• Divide message in blocks
• Feed the block to hash function along with the hash value of previous block
$h_{i}=f(M_{i},h_{i-1})$
• The result from last block is the hash.

## Making longer hash values

• If hash function provides too short hash, it can be enlarged and thus made safer
1. Generate hash value with hash function.
2. Concatenate hash value in the begining of the message
3. Generate new hash value from the new messages
4. Concatenate the hash values
5. Run through until you have the required size hash value.

## MD5

• Developed by Ron Rivest 1992 and is based on MD4. (RFC 1321)
• Returns 128-bit hash value from message.
• The data has to be divided in 512 bit blocks which are run through the hash function one after another.
• The message is padded so that it's length n is $n\: mod\:512=448$
• padding is done by adding one and after that required amount of zeroes.
• At the end, 64-bit value is added that states the length of message. (448+64 = 512)
• data to be hashed is thus k*512 bits, where k is integer.
• Initialisation.
• Initialise four 32.bit chaining variables (CV) A,B,C,D

$A=0x01234567$

$B=0x89ABCDEF$

$C=0xFEDCBA98$

$D=0x76543210$
• MD5 operation
• Compression function (512 bits in, 128 bits out)
• A,B,C,D are copied in variables a,b,c,d
• The incoming 512-bit block is divided in 16 sub blocks that 32-bits long
• The main loop consists of four rounds(F,G,H,I).
• For every round following function is run 16 times.
$a\leftarrow b+((a+g(b,c,d)+M_{i}+T_{i})<<
• a,b,c,d are variables a,b,c,d which switch places in every round (4 values, 16 different combination, 16 rounds)
• i is current round
• $T_{i}$ is integer part from function $2^{32}*abs(\sin(i))$ , where i is in radian
• $M_{i}$ is the i:th 32-bit subblock of message
• s is the amount of shifts that is dependant from current round (1-16) and mainloop rounds (1-4)

$s[0..15]=[7,12,17,22,7,12,17,22,7,12,17,22,7,12,17,22]$

$s[16..31]=[5,9,14,20,5,9,14,20,5,9,14,20,5,9,14,20]$

$s[32..47]=[4,11,16,23,4,11,16,23,4,11,16,23,4,11,16,23]$

$s[48..63]=[6,10,15,21,6,10,15,21,6,10,15,21,6,10,15,21]$
• Function g is round dependant.
$F(X,Y,Z)=(X\wedge Y)\vee((\neg X)\wedge Z)$

$G(X,Y,Z)=(X\wedge Z)\vee(Y\wedge(\neg Z))$

$H(X,Y,Z)=X\oplus Y\oplus Z$

$I(X,Y,Z)=Y\oplus(X\vee(\neg Z))$
• Analysis of functions F,G and H:
• F is if function (if b, then c else d) (b?c:d)
• G is also if function (d?b:c)
• H provides parity bit
• After 4 rounds a,b,c,d are added to A,B,C and D individually using modulus $2^{32}$
• $A=(A+a)\: mod\:2^{32}$ , $B=(B+b)\: mod\:2^{32}$ , $C=(C+c)\: mod\:2^{32}$ , $D=(D+d)\: mod\:2^{32}$
• Take next 512-bit block for handling.
• The final hash is A,B,C,D concatenated i.e. 4*32bit = 128-bit hash

## MD5 properties

• Each step has their own constant
• Each step affects to next step in order to cause more efficient avalanche effect i.e. the effect of one bit is higher to the resulting hash.
• Change of one bit in the original text will change the whole hash, not just one bit.
• Due, going through the basic operations several times, it is highly unlikely that two random messages will produce same hash value.
• The use of multiple basic operations makes it unlikely that two random messages will provide same hash value.
• Some weakness have been found from MD5 compression algorithm which can cause different types of collisions.
• On crypto 2004 conference, a research was published which showed some weaknesses in MD5. (http://www.schneier.com/essay-074.html)
• Same problem applies to SHA-0 and RIPEMD160

## SHA-1

• SHA-1 is and algorithm developed by NSA for DSS (Digital Signature Standard) and is thus the algorithm of SHS (Secure Hash Standard)
• SHA-1 is based on MD4 algorithm (as was MD5) developed by Ron Rivest at 1990.
• Maximum message length for SHA-1 is $2^{64}$ bits from where SHA-1 provides 160-bit hash value
• SHA-1 vs MD5
• SHA-1 uses five vectors where MD5 used 4 (128+32 = 160)
• SHA-1 has 4*20= 80 rounds while MD5 had 4*16=64 rounds
• SHA-1 bit functions are slightly different

## Security of SHA-1

• Compared to MD5
• The hash size of SHA-1 is 32 bits longer and is thus more resistant for brute-force attacks
• SHA-1 is slightly slower and requires bit more memory(more rounds, longer vectors to get 160 bit hash)
• SHA-1 uses input bits more often in during the hash function than MD5
• On February 15th, 2005 Bruce Schneier stated on his blog:
“SHA-1 has been broken. Not a reduced-round version. Not a simplified version. The real thing. The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu (mostly from Shandong University in China) have been quietly circulating a paper announcing their results:
• collisions in the the full SHA-1 in $2^{69}$ hash operations, much less than the brute-force attack of $2^{80}$ operations based on the hash length.
• collisions in SHA-0 in $2^{39}$ operations.
• collisions in 58-round SHA-1 in $2^33$ operations.”
• Currently all MD5 style Hash functions are susceptible.
• Competition for generating new hash standard will be opened by NIST

## Whirlpool

• Block cipher based hash function (http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
• Uses block cipher W, which is based on AES.
• W is specifically designed to be used for hash function
• W is specifically designed for use with hash functions
• Follows The Miyaguchi-Preneel one-way compression function principle
• One way compression function is a function which takes two fixed length inputs to provide single output which size is same as one of the inputs. Process is irreversible like for hash functions.
• The next block is encrypted by using the result of encryption of previous block XORed with key of previous block and plain message
$H_{i}=E_{H_{i-1}}(m_{i})\oplus H_{i-1}\oplus m_{i}$
• Provides 512 bit hash value
• Block size of W is 512 bits (AES has 128 bit block size)
• Has been shown to be resistant on known vulnerabilities of block cipher based hash functions
• Block ciphers do not provide proper randomness as default (They are designed to be invertible), which may result in weaknesses
• Block ciphers are slow compared to hash functions
• W has been designed to be fast
• block cipher based hash function provide hash that is size of the block
• Normal block size 64 or 128 bits, W block size 512 bits
• Endorsed by NESSIE (New European Schemes for Signatures, Integrity and Encryption)

## MAC

• Message Authentication code, cryptographic checksum.
• Hash value that is dependant on secret key.
$MAC=C_{K}(M)$
• Use of MAC
• The integrity of the data can be confirmed with MAC
• Only those who know what the password is can count MAC. therefore MAC can be saved along data at the same public place.
• Communication packet integrity can be verified by calculating MAC from the packet and it's sequence number (IPSec)
• MAC is not signature.
• Anyone who knows the key can calculate MAC
• When used key in MAC is published then the MAC value is equivalent to hash

## MAC vs symmetric encryption

• Both can be used to confirm the integrity of data.
• With MAC the information can be read without decrypting it or without knowing password.
• Authenticity of data can be validated on need basis
• MAC can confirm the data authenticity longer than symmetric encryption
• In symmetric encryption the key has to be known by those who are allowed to see the data
• MAC can be saved even when the data is used
• MAC key is stronger against brute force, when key k > than the size of MAC n
• For each message there exists $2^{(k-n)}$ keys that provide same MAC value.

## CBC-MAC

• Easiest way to produce MAC is encrypting hash value with symmetric algorithm $E_{k}(H(M))$
• A message is encrypted by using block cipher in CBC mode.
• After the message is encrypted, the result coming out from the encryption of final block is the MAC value.
• The algorithm can be strengthened by encrypting the final value with another key

## HMAC

• Creating MAC by adding cryptographic feature in hash function
• Hash functions are faster than block ciphers
• design principles
• Uses existing hash functions
• Hash function should be easily replaceable to another one, when necessary
• hash function performance should be preserved
• simple use and handle of key
• Provable security, independent from hash function properties.

## HMAC Algorithm

1. Key K is expanded by adding zeroes on left to get $K^{+}$ , which size is b bits. b is the size of the hash block (e.g. the amount of bits the hash takes in.)
2. XOR $K^{+}$ with ipad (ipad is $00110110$ repeated $b/8$ times). The result is b bit long block $S_{i}$
3. Concatenate M with $S_{i}>$ ($S_{i}$ in the beginning)
4. Run the result from step three through hash function H.
5. XOR $K^{+}$ with opad (opad is $01011010$ repeated $b/8$ times). The result is b-bit long $S_{o}$
6. pad the result from step 4 to be b bits and concatenate with $S_{o}$ ($S_{o}$ in the beginning)
7. Run the result from step 6 through hash function to get MAC value
• use of ipad and opad generates two different keys from original key K.
• Due the hash function both keys affects randomly to the result
• ipad and opad both flip half of the bits of K, thus the generated keys are pseudorandomly generated.
• With long messages the algorithm is just slightly slower than the used hash function
• Three extra executions of hash function is needed (for $S_{i}$ on the step 4, $S_{o}$ on step 7 and for the hash itself on step 7)

## Faster version of HMAC

• $S_{i}$ and $S_{o}$ are precomputed and run through hash compression function.
• the compressed$S_{i}$ is used as an IV for the Hash function
• The result from hash is padded to b bits and run through the compression function of hash using compressed $S_{o}$ as an IV
• Only one more instance of compression function is executed compared to the normal hash.
• $S_{i}$ and $S_{o}$compressions have to only be calculated when key is changed.
• It can be proved that security is directly dependant on the security of hash function.