Producing salted Password Hashes using the MD5 Algorithm

As an exercise in password storage i implemented a PHP function that takes a UTF-8 encoded Unicode cleartext, generates a random salt and produces a password hash using the MD5 algorithm.

Please note that this is just an exercise and should not be used in production.

  • This code has not been properly tested and has never been reviewed. I think it’s OK for educational purposes (otherwise i would not have published it), but it is very likely to have bugs.
  • There are computational attacks on MD5 that break the algorithms fundamental promises, they can be performed without extraordinary effort. That makes MD5 unusable for security-related purposes in the general case.
  • For PHP version 5.5 and later, you can use the password hashing API instead.

At any rate, it can be helpful to understand what the use case is and what „password hashing“ actually can accomplish. Read on for further observations.

Demo and Source Code

Try out the generator here and read the function source code here and here.

Basics of Operation

For a given cleartext, an MD5 salted hash provides the following data:

  • a checksum, a sequence of 128 Bits and
  • a „salt“, a sequence of 48 Bits.

Checksum and salt are stored as a single record of information.

Use Case

Checksum and salt can be used to determine if a specific input is identical to an original cleartext. A notable example of such a task is the check if a password is correct.

In case an unauthorized party establishes access to the storage of salts and hashes, the original cleartext is not immediately revealed to that party.

An unauthorized party accessing checksum and salt can not use a „rainbow table“, a database where lots of possible cleartexts have been precalculated to their checksums, to determine which cleartext is specified by the checksum; instead, the result of the checksum is obfuscated to one of 2 to the power of 48 Bits possible „salted“ results. This makes the operation of inverse lookup of the original cleartext more difficult for the unauthorized party.

Program Code Organization

My implementation breaks the problem into these parts:

  • cryptMD5(): Acquiring a random salt and applying it to the given cleartext using the crypt() library call. Here, an important decision is made: Where to get suitably random bytes from, so that the salt is virtually unpredictable. In my implementation i use the PHP mcrypt extension call mcyrpt_create_iv() and let it read from the /dev/urandom PRNG.
  • encodeSalt(): Transforming the salt’s random data to an alphabet suitable as an argument to the crypt() library call. This is essentially the Radix encoding called „B64“.