Because MD5 makes only one pass over the data, if two prefixes with the same hash can be constructed, a common suffix can be added to both to make the collision more reasonable.
Because the current collision-finding techniques allow the preceding hash state to be specified arbitrarily, a collision can be found for any desired prefix; that is, for any given string of characters X, two colliding files can be determined which both begin with X.
All that is required to generate two colliding files is a template file, with a 128-byte block of data aligned on a 64-byte boundary, that can be changed freely by the collision-finding algorithm.
Recently, a number of projects have created MD5 "
rainbow tables" which are easily accessible online, and can be used to reverse many MD5 hashes into strings that collide with the original input, usually for the purposes of
password cracking. However, if passwords are combined with a
salt before the MD5 digest is generated, rainbow tables become much less useful.
The use of MD5 in some websites'
URLs means that
Google can also sometimes function as a limited tool for reverse lookup of MD5 hashes.
[12] This technique is rendered ineffective by the use of a
salt.
<snip>
In
cryptography, a
salt comprises random
bits that are used as one of the inputs to a
key derivation function. The other input is usually a
password or
passphrase. The output of the key derivation function is stored as the encrypted version of the password. A salt can also be used as a
key in a
cipher or other
cryptographic algorithm. The key derivation function typically uses a
hash function. Sometimes the
initialization vector, a previously-generated value, is used as a salt.
Salt data complicates
dictionary attacks that use pre-encryption of dictionary entries: Each bit of salt used doubles the amount of storage and computation required.
(src: wikiepdia, yea i know..)