A thief who steals the (hashed) password table cannot merely enter the user's (hashed) database entry to gain access (using the hash as a password would of course fail since the authentication system would hash that a second time, producing a result which does not match the stored value, which was hashed only once). In order to learn a user's password, the thief must try to find a password which produces the same hashed value.
Rainbow tables are one type of tool that has been developed to derive a password by looking only at a hashed value.
Rainbow tables are not always needed, as simpler methods of hash reversal are available. Brute-force attacks and dictionary attacks are the simplest; however, they are not adequate for systems that use large passwords, because of the difficulty of storing all the options available and searching through such a large database to perform a reverse lookup of a hash.
To address this issue of scale, reverse lookup tables were generated that stored only a smaller selection of hashes that when reversed could generate long chains of passwords. Although the reverse lookup of a hash in a chained table takes more computational time, the lookup table itself can be much smaller, so hashes of longer passwords can be stored. Rainbow tables are a refinement of this chaining technique and provide a solution to a problem called chain collisions.
Precomputed hash chains
Note: The hash chains described in this article are a different kind of chain than those described in the hash chains article.
Suppose we have a password hash function H and a finite set of passwords P. The goal is to precompute a data structure that, given any output h of the hash function, can either locate an element p in P such that H(p) = h, or determine that there is no such p in P. The simplest way to do this is to compute H(p) for all p in P, but then storing the table requires Θ(|P|n) bits of space, where n is the size of an output of H, which is prohibitive for large |P|.
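The exhaustive approach can be sketched as follows. This is a minimal illustration, not a practical tool: the 32-bit hash H below (a truncated SHA-256), the three-letter password set, and the names build_full_table and invert are all assumptions made for the example.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def H(password: str) -> str:
    """Toy 32-bit hash (assumption): the first 8 hex digits of SHA-256."""
    return hashlib.sha256(password.encode()).hexdigest()[:8].upper()


def build_full_table(alphabet: str, length: int) -> dict:
    """Store H(p) -> p for every p in P, here all strings of one length."""
    table = {}
    for chars in product(alphabet, repeat=length):
        p = "".join(chars)
        table.setdefault(H(p), p)  # on a hash collision keep the first p
    return table


def invert(h: str, table: dict):
    """Return some p in P with H(p) == h, or None if there is none."""
    return table.get(h)


# |P| = 26**3 = 17576 entries: one stored hash per password.
full_table = build_full_table(ascii_lowercase, 3)
```

Every password costs one stored entry, which is why the table becomes impractical as |P| grows.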
Hash chains are a technique for decreasing this space requirement. The idea is to define a reduction function R that maps hash values back into values in P. Note, however, that the reduction function is not actually an inverse of the hash function. By alternating the hash function with the reduction function, chains of alternating passwords and hash values are formed. For example, if P were the set of lowercase alphabetic 6-character passwords, and hash values were 32 bits long, a chain might look like this:
An example of a reduction function:
Given a 32-bit hash, take the last 4 characters of the hash.
The only requirement for the reduction function is to be able to return a "plain text" value of a specific size.
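As a rough sketch of one possible reduction function for the 6-character lowercase example, the hash can be read as an integer and rewritten in base 26. The name reduce_hash and the base-26 scheme are assumptions chosen for illustration; any deterministic mapping from hash values into P would satisfy the requirement.

```python
from string import ascii_lowercase


def reduce_hash(hash_hex: str, length: int = 6,
                alphabet: str = ascii_lowercase) -> str:
    """Map a hex hash value to a password of the given length.

    The hash is read as an integer and its lowest base-26 digits are
    turned into letters. This is not an inverse of the hash function:
    it only guarantees that the result lies in P.
    """
    value = int(hash_hex, 16)
    chars = []
    for _ in range(length):
        value, digit = divmod(value, len(alphabet))
        chars.append(alphabet[digit])
    return "".join(chars)


candidate = reduce_hash("920ECF10")  # some 6-letter lowercase string
```

Because there are fewer 6-letter passwords (26^6) than 32-bit hash values (2^32), several hashes necessarily reduce to the same password.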
To generate the table, we choose a random set of initial passwords from P, compute chains of some fixed length k for each one, and store only the first and last password in each chain. The first password is called the starting point and the last one is called the endpoint. In the example chain above, "aaaaaa" would be the starting point and "kiebgt" would be the endpoint, and none of the other passwords (or the hash values) would be stored.[citation needed]
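Table generation might look like the sketch below. The toy hash H, the reduction R, the 4-letter password set, and the helper names chain_end and build_table are assumptions for illustration; here k counts the number of hash-then-reduce steps in a chain.

```python
import hashlib
import random
from string import ascii_lowercase

ALPHABET = ascii_lowercase
PW_LEN = 4  # small password set so the demo runs quickly (assumption)


def H(password: str) -> str:
    """Toy 32-bit hash (assumption): first 8 hex digits of SHA-256."""
    return hashlib.sha256(password.encode()).hexdigest()[:8].upper()


def R(hash_hex: str) -> str:
    """Reduction: base-26 digits of the hash, mapped to letters."""
    value = int(hash_hex, 16)
    chars = []
    for _ in range(PW_LEN):
        value, digit = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(chars)


def chain_end(start: str, k: int) -> str:
    """Apply H then R, k times, and return the final password."""
    p = start
    for _ in range(k):
        p = R(H(p))
    return p


def build_table(num_chains: int, k: int, seed: int = 0) -> dict:
    """Return {endpoint: starting point}; intermediate values are discarded."""
    rng = random.Random(seed)
    table = {}
    for _ in range(num_chains):
        start = "".join(rng.choice(ALPHABET) for _ in range(PW_LEN))
        table[chain_end(start, k)] = start
    return table


K = 20
table = build_table(num_chains=50, k=K)
```

Keying the table by endpoint is a design choice that makes the later lookup step a single dictionary access; chains that happen to share an endpoint overwrite one another in this sketch.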
Now, given a hash value h that we want to invert (find the corresponding password for), compute a chain starting with h by applying R, then H, then R, and so on. If at any point we observe a value matching one of the endpoints in the table, we get the corresponding starting point and use it to recreate the chain. There's a good chance that this chain will contain the value h, and if so, the immediately preceding value in the chain is the password p that we seek.[citation needed]
For example, if we're given the hash 920ECF10, we would compute its chain by first applying R:
Since "kiebgt" is one of the endpoints in our table, we then take the corresponding starting password "aaaaaa" and follow its chain until 920ECF10 is reached:
Thus, the password is "sgfnyd".
Note however that this chain does not always contain the hash value h; it may so happen that the chain starting at h merges with the chain starting at the starting point at some point after h. For example, we may be given a hash value FB107E70, and when we follow its chain, we get kiebgt:
But FB107E70 is not in the chain starting at "aaaaaa". This is called a false alarm. In this case, we ignore the match and continue to extend the chain of h looking for another match. If the chain of h gets extended to length k with no good matches, then the password was never produced in any of the chains.
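The lookup procedure, including the handling of false alarms, can be sketched as follows. The hash H, the reduction R, and the names build_table, walk_chain and lookup are assumptions made for this example; the table maps each endpoint to its starting point, and k is the number of hash-then-reduce steps per chain.

```python
import hashlib
import random
from string import ascii_lowercase

ALPHABET = ascii_lowercase
PW_LEN = 4  # small password set for the demo (assumption)


def H(password: str) -> str:
    """Toy 32-bit hash (assumption): first 8 hex digits of SHA-256."""
    return hashlib.sha256(password.encode()).hexdigest()[:8].upper()


def R(hash_hex: str) -> str:
    """Reduction: base-26 digits of the hash, mapped to letters."""
    value = int(hash_hex, 16)
    chars = []
    for _ in range(PW_LEN):
        value, digit = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(chars)


def build_table(num_chains: int, k: int, seed: int = 0) -> dict:
    """Return {endpoint: starting point} for randomly chosen starts."""
    rng = random.Random(seed)
    table = {}
    for _ in range(num_chains):
        start = p = "".join(rng.choice(ALPHABET) for _ in range(PW_LEN))
        for _ in range(k):
            p = R(H(p))
        table[p] = start
    return table


def walk_chain(start: str, h: str, k: int):
    """Recreate a chain; return the password preceding h, if h occurs."""
    p = start
    for _ in range(k):
        if H(p) == h:
            return p
        p = R(H(p))
    return None


def lookup(h: str, table: dict, k: int):
    """Extend the chain of h until an endpoint matches, then verify."""
    p = R(h)
    for _ in range(k):
        if p in table:
            candidate = walk_chain(table[p], h, k)
            if candidate is not None:
                return candidate
            # False alarm: the chains merged after h, so keep extending.
        p = R(H(p))
    return None  # h was not produced in any stored chain


K = 20
table = build_table(num_chains=50, k=K)
```

A result of None only means that no stored chain produced the hash; it does not prove that no password in P hashes to it.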
The table content does not depend on the hash value to be inverted. It is created once and then repeatedly used for the lookups unmodified. Increasing the length of the chain decreases the size of the table. It also increases the time required to perform lookups, and this is the time-memory trade-off of the rainbow table. In a simple case of one-item chains, the lookup is very fast, but the table is very big. Once chains get longer, the lookup slows down, but the table size goes down.
Simple hash chains have several flaws. The most serious is that if at any point two chains collide (produce the same value), they will merge, and consequently the table will not cover as many passwords despite having paid the same computational cost to generate. Because previous chains are not stored in their entirety, this is impossible to detect efficiently. For example, if the third value in chain 3 matches the second value in chain 7, the two chains will cover almost the same sequence of values, but their final values will not be the same. The hash function H is unlikely to produce collisions, as it is usually considered an important security feature not to do so, but the reduction function R, because of its need to correctly cover the likely plaintexts, cannot be collision resistant.
Other difficulties result from the importance of choosing the correct function for R. Picking R to be the identity is little better than a brute-force approach. Only when the attacker has a good idea of what the likely plaintexts will be can he or she choose a function R that makes sure time and space are only used for likely plaintexts, not the entire space of possible passwords. In effect, R shepherds the results of prior hash calculations back to likely plaintexts, but this benefit comes with the drawback that R likely won't produce every possible plaintext in the class the attacker wishes to check, denying the attacker certainty that no passwords came from the chosen class. Also, it can be difficult to design the function R to match the expected distribution of plaintexts.
Rainbow tables
Rainbow tables effectively solve the problem of collisions with ordinary hash chains by replacing the single reduction function R with a sequence of related reduction functions R1 through Rk. In this way, for two chains to collide and merge, they must hit the same value on the same iteration. Consequently, the final values of two merged chains will be identical. A final postprocessing pass can sort the chains in the table and remove any "duplicate" chains that have the same final value as other chains. New chains are then generated to fill out the table. These chains are not collision-free (they may overlap briefly) but they will not merge, drastically reducing the overall number of collisions.[citation needed]
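A minimal sketch of rainbow chain generation follows. The family of reductions used here, which adds the step index to the hash before reducing, is one common way to obtain related functions and is an assumption of this example, as are the toy hash H and the helper names. Duplicate chains are dropped as they are found rather than in a separate sorting pass, which has the same effect for this illustration.

```python
import hashlib
import random
from string import ascii_lowercase

ALPHABET = ascii_lowercase
PW_LEN = 4  # small password set for the demo (assumption)


def H(password: str) -> str:
    """Toy 32-bit hash (assumption): first 8 hex digits of SHA-256."""
    return hashlib.sha256(password.encode()).hexdigest()[:8].upper()


def R(hash_hex: str, i: int) -> str:
    """Reduction for step i: offset the hash by i, then take base-26 digits."""
    value = (int(hash_hex, 16) + i) % 2 ** 32
    chars = []
    for _ in range(PW_LEN):
        value, digit = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(chars)


def rainbow_chain_end(start: str, k: int) -> str:
    """Apply H followed by the step-specific reduction for steps 1..k."""
    p = start
    for i in range(1, k + 1):
        p = R(H(p), i)
    return p


def build_rainbow_table(num_chains: int, k: int, seed: int = 0) -> dict:
    """Return {endpoint: start}, discarding chains with a duplicate endpoint."""
    rng = random.Random(seed)
    table = {}
    attempts = 0
    # Keep generating new chains to fill out the table, with a safety cap.
    while len(table) < num_chains and attempts < 10 * num_chains:
        attempts += 1
        start = "".join(rng.choice(ALPHABET) for _ in range(PW_LEN))
        table.setdefault(rainbow_chain_end(start, k), start)
    return table


K = 20
rainbow = build_rainbow_table(num_chains=50, k=K)
```

Since merged chains always share an endpoint, checking the endpoint alone is enough to detect them.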
Using sequences of reduction functions changes how lookup is done: because the hash value of interest may be found at any location in the chain, it's necessary to generate k different chains. The first chain assumes the hash value is in the last hash position and just applies Rk; the next chain assumes the hash value is in the second-to-last hash position and applies Rk−1, then H, then Rk; and so on until the last chain, which applies all the reduction functions, alternating with H. This creates a new way of producing a false alarm: if we "guess" the position of the hash value wrong, we may needlessly evaluate a chain.
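The position-guessing lookup can be sketched as below. As before, the toy hash H, the index-offset reductions R, and the function names are assumptions for illustration; hash positions are numbered 1 to k, and position j is followed by reduction number j.

```python
import hashlib
import random
from string import ascii_lowercase

ALPHABET = ascii_lowercase
PW_LEN = 4  # small password set for the demo (assumption)


def H(password: str) -> str:
    """Toy 32-bit hash (assumption): first 8 hex digits of SHA-256."""
    return hashlib.sha256(password.encode()).hexdigest()[:8].upper()


def R(hash_hex: str, i: int) -> str:
    """Reduction for step i: offset the hash by i, then take base-26 digits."""
    value = (int(hash_hex, 16) + i) % 2 ** 32
    chars = []
    for _ in range(PW_LEN):
        value, digit = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(chars)


def build_rainbow_table(num_chains: int, k: int, seed: int = 0) -> dict:
    """Return {endpoint: start}; chains with a duplicate endpoint are dropped."""
    rng = random.Random(seed)
    table = {}
    for _ in range(num_chains):
        start = p = "".join(rng.choice(ALPHABET) for _ in range(PW_LEN))
        for i in range(1, k + 1):
            p = R(H(p), i)
        table.setdefault(p, start)
    return table


def rebuild(start: str, h: str, k: int):
    """Recreate a stored chain; return the password that hashes to h, if any."""
    p = start
    for i in range(1, k + 1):
        if H(p) == h:
            return p
        p = R(H(p), i)
    return None


def lookup(h: str, table: dict, k: int):
    """Try each possible position of h, from the last hash position to the first."""
    for pos in range(k, 0, -1):
        p = R(h, pos)                    # assume h sits at hash position pos
        for i in range(pos + 1, k + 1):  # finish the chain from there
            p = R(H(p), i)
        if p in table:
            candidate = rebuild(table[p], h, k)
            if candidate is not None:
                return candidate
            # Otherwise the position guess was wrong: a false alarm.
    return None


K = 20
rainbow = build_rainbow_table(num_chains=50, k=K)
```

Each guess recomputes its candidate chain from scratch, so a full lookup costs on the order of k²/2 hash evaluations in this sketch.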
Although rainbow tables have to follow more chains, they make up for this by having fewer tables: simple hash chain tables cannot grow beyond a certain size without rapidly becoming inefficient due to merging chains; to deal with this, they maintain multiple tables, and each lookup must search through each table. Rainbow tables can achieve similar performance with tables that are k times larger, allowing them to perform a factor of k fewer lookups.
Example
- Starting from the hash ("re3xes") in the image below, one computes the last reduction used in the table and checks whether the password appears in the last column of the table (step 1).
- If the test fails (rambo doesn't appear in the table), one computes a chain with the two last reductions (these two reductions are represented at step 2). Note: If this new test fails again, one continues with 3 reductions, 4 reductions, etc. until the password is found. If no chain contains the password, then the attack has failed.
- If this test is positive (step 3, linux23 appears at the end of the chain and in the table), the password is retrieved at the beginning of the chain that produces linux23. Here we find passwd at the beginning of the corresponding chain stored in the table.
- At this point (step 4), one generates a chain and compares at each iteration the hash with the target hash. The test is valid and we find the hash re3xes in the chain. The current password (culture) is the one that produced the whole chain: the attack is successful.
Rainbow tables use a refined algorithm with a different reduction function for each "link" in a chain, so that when there is a hash collision in two or more chains the chains will not merge as long as the collision doesn't occur at the same position in each chain. As well as increasing the probability of a correct crack for a given table size, this use of multiple reduction functions approximately doubles the speed of lookups.[2]
Rainbow tables are specific to the hash function they were created for; e.g., MD5 tables can crack only MD5 hashes. The theory of this technique was pioneered by Philippe Oechslin[3] as a fast form of time/memory tradeoff,[2] which he implemented in the Windows password cracker Ophcrack. The more powerful RainbowCrack program was later developed; it can generate and use rainbow tables for a variety of character sets and hashing algorithms, including LM hash, MD5, and SHA-1.
In the simple case where the reduction function and the hash function have no collisions, given a complete rainbow table (one that is guaranteed to contain the corresponding password for any hash), the size of the password set |P|, the time T needed to compute the table, the length of the table L, and the average time t needed to find a password matching a given hash are directly related:[citation needed]
Thus the 8-character alphanumeric passwords case (|P| ≃ 3 × 10¹²) would be easily tractable with a personal computer, while the 16-character alphanumeric passwords case (|P| ≃ 10²⁵) would be completely intractable.
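The two password-set sizes can be checked with a short calculation. The 36-symbol alphabet (digits plus letters of a single case) is an assumption: it is the choice that reproduces the figures quoted above, whereas a mixed-case alphabet of 62 symbols would give larger sets.

```python
from string import ascii_lowercase, digits

# Assumed alphabet: 26 single-case letters plus 10 digits = 36 symbols.
ALPHABET_SIZE = len(ascii_lowercase + digits)


def password_space(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct passwords of exactly the given length."""
    return alphabet_size ** length


p8 = password_space(8)
p16 = password_space(16)
print(f"{p8:.2e}", f"{p16:.2e}")  # prints 2.82e+12 7.96e+24
```

Doubling the password length squares the size of the set, which is why the second case is out of reach for precomputation.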