HMAC
Using hash algorithms, we can verify the validity of a piece of data by comparing its hash value. For example, to check if a user’s password is correct, we compare the password_md5 stored in the database with the computed md5(password). If they match, the entered password is correct.
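The comparison described above can be sketched as follows. This is only a minimal illustration: the names md5_hex, stored_password_md5 and check_password are hypothetical, and the "database" is a single variable.

```python
import hashlib

def md5_hex(text):
    # Hash the UTF-8 bytes of the text and return the hex digest.
    return hashlib.md5(text.encode('utf-8')).hexdigest()

# Hypothetical value that would normally be read from the database.
stored_password_md5 = md5_hex('123456')

def check_password(entered):
    # The entered password is accepted only if its hash matches the stored one.
    return md5_hex(entered) == stored_password_md5

print(check_password('123456'))   # True
print(check_password('1234567'))  # False
```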
To prevent hackers from using rainbow tables to recover the original password from the hash, we should not compute the hash from the original input alone but also mix in a salt. The same input then produces a different hash for each salt value, which makes the passwords significantly harder to crack.
When the salt is randomly generated, we typically calculate MD5 using md5(message + salt). However, we can also think of the salt as a "key": hashing the same message with different keys produces different hashes, and to verify a hash value, the correct key must be provided.
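A small sketch of the md5(message + salt) idea may help here. The helper salted_md5 and the use of os.urandom to generate salts are assumptions made for illustration, not part of the original text.

```python
import hashlib
import os

def salted_md5(message, salt):
    # Append the salt to the message before hashing (both are bytes).
    return hashlib.md5(message + salt).hexdigest()

message = b'123456'
salt_a = os.urandom(16)  # random 16-byte salt
salt_b = os.urandom(16)  # a second, independent salt

hash_a = salted_md5(message, salt_a)
hash_b = salted_md5(message, salt_b)

# Same message, different salts: the hashes differ.
print(hash_a != hash_b)
# The same message and the same salt reproduce the same hash.
print(salted_md5(message, salt_a) == hash_a)
```

Seen this way, the salt plays the role of a key: without the right salt, the stored hash cannot be reproduced from the message.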
This is essentially what the HMAC (Keyed-Hashing for Message Authentication) algorithm does: it incorporates a key into the hash calculation process.
Unlike our custom salt method, HMAC is generic: it works with any hash algorithm, whether MD5 or SHA-1. Using HMAC instead of our own salt method makes the hashing process both standardized and more secure.
Python’s built-in hmac module implements the standard HMAC algorithm. Here’s how to use HMAC to generate a keyed hash.
First, prepare the message to hash, the key, and the hash algorithm (MD5 in this case):
python
import hmac
message = b'Hello, world!'
key = b'secret'
h = hmac.new(key, message, digestmod='MD5')
# If the message is long, you can call h.update(msg) multiple times.
print(h.hexdigest())
The output will be:
fa4ee7d173f2d97ee79022d1a7355bcf
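The comment about calling update() several times can be checked with a short sketch. The way the message is split into two chunks here is arbitrary and chosen only for illustration.

```python
import hmac

key = b'secret'

# Hash the whole message in one call.
one_shot = hmac.new(key, b'Hello, world!', digestmod='MD5')

# Feed the same message in two chunks.
chunked = hmac.new(key, digestmod='MD5')
chunked.update(b'Hello, ')
chunked.update(b'world!')

# Both approaches give the same digest.
print(one_shot.hexdigest() == chunked.hexdigest())  # True
```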
Using HMAC is very similar to using a regular hash algorithm, and the length of the HMAC output matches that of the underlying hash algorithm. Note that both the key and the message must be of bytes type; a str must be encoded to bytes first.
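To illustrate both points, the sketch below encodes str values to bytes and compares the digest size of a plain hash with that of the corresponding HMAC. The choice of the three algorithms and the sizes dictionary are assumptions made for the example.

```python
import hashlib
import hmac

# str values must be encoded to bytes before they are passed to hmac.new().
key = 'secret'.encode('utf-8')
message = 'Hello, world!'.encode('utf-8')

sizes = {}
for name in ('md5', 'sha1', 'sha256'):
    plain = hashlib.new(name, message)
    keyed = hmac.new(key, message, digestmod=name)
    # digest_size is the output length in bytes.
    sizes[name] = (plain.digest_size, keyed.digest_size)
    print(name, plain.digest_size, keyed.digest_size)

# Passing a str key instead of bytes raises TypeError.
try:
    hmac.new('secret', message, digestmod='md5')
except TypeError:
    print('key must be bytes')
```

For each algorithm, the two sizes printed on a line are equal: 16 bytes for MD5, 20 for SHA-1 and 32 for SHA-256.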
Exercise
Replace the previous salt method with the standard HMAC algorithm to validate user passwords:
python
import hmac, random

def hmac_md5(key, s):
    # key and s are str here, so encode both to bytes first.
    return hmac.new(key.encode('utf-8'), s.encode('utf-8'), 'MD5').hexdigest()

class User(object):
    def __init__(self, username, password):
        self.username = username
        # Per-user random key: 20 characters with code points 48 to 122.
        self.key = ''.join([chr(random.randint(48, 122)) for i in range(20)])
        # Store only the keyed hash, not the plain password.
        self.password = hmac_md5(self.key, password)

db = {
    'michael': User('michael', '123456'),
    'bob': User('bob', 'abc999'),
    'alice': User('alice', 'alice2008')
}

def login(username, password):
    user = db[username]
    # Recompute the HMAC with the user's key and compare it with the stored value.
    return user.password == hmac_md5(user.key, password)

# Tests:
assert login('michael', '123456')
assert login('bob', 'abc999')
assert login('alice', 'alice2008')
assert not login('michael', '1234567')
assert not login('bob', '123456')
assert not login('alice', 'Alice2008')
print('ok')
Summary
Python’s built-in hmac module implements the standard HMAC algorithm, which calculates a hash of the message combined with a key. Using HMAC is more secure than using a plain hash algorithm because the result depends on the key: for the same message, different keys produce different hashes.
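As a closing illustration, the sketch below shows that the same message yields different hashes under different keys, and that verification succeeds only with the correct key. The keys, the choice of SHA-256 and the helper verify are hypothetical; hmac.compare_digest is used for the comparison as one reasonable option, since it is the comparison function the hmac module provides.

```python
import hmac

message = b'Hello, world!'

# Same message, two different keys.
tag_a = hmac.new(b'key-a', message, digestmod='sha256').hexdigest()
tag_b = hmac.new(b'key-b', message, digestmod='sha256').hexdigest()
print(tag_a != tag_b)  # True

def verify(key, message, expected_hex):
    # Recompute the HMAC with the given key and compare it with the expected value.
    actual_hex = hmac.new(key, message, digestmod='sha256').hexdigest()
    return hmac.compare_digest(actual_hex, expected_hex)

print(verify(b'key-a', message, tag_a))  # True: correct key
print(verify(b'key-b', message, tag_a))  # False: wrong key
```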