MD5 Hash Not as Secure as Previously Thought

After taking an advanced, graduate-level class on the theory of operating systems this past semester, I have developed a new interest in computer security. As I get time, I try to research and find more information about the technologies currently used to help secure computers. One of the most startling things I found is that one of the most widely used security algorithms isn’t as secure as believed.

MD5 Hashing
With the Internet being used by the majority of Americans and much of the rest of the world, computer security on the Internet is becoming more important. MD5 [http://en.wikipedia.org/wiki/MD5], a hashing algorithm that takes an arbitrarily long string and calculates a 128-bit representation of that string (usually written as 32 hexadecimal characters), is one of the standards for checking that the data you get from web sites is legitimate and has not been altered.
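To give a feel for what an MD5 hash looks like, here is a minimal sketch using Python's standard hashlib module. The helper name md5_hex and the sample inputs are my own choices for illustration, not part of any standard.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

short_digest = md5_hex("abc")
long_digest = md5_hex("abc" * 100000)

# Both digests have the same length no matter how long the input is:
# 32 hexadecimal characters, which is 128 bits.
print(short_digest, len(short_digest) * 4)
print(long_digest, len(long_digest) * 4)
```

Whatever the size of the input, the output is always the same length, so there must be many different inputs that share a hash. The security of the algorithm rests on those inputs being impractical to find.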

That statement was correct up until this past August, when researchers Wang, Feng, Lai, and Yu at three of China’s top engineering schools found a way to fool the MD5 hash [http://eprint.iacr.org/2004/199.pdf]. They were able to construct two different strings that have the same MD5 hash. The whole process of finding such a pair takes about an hour for their computers to calculate. For more detail on how this is done, see the research of Hawkes, Paddon, and Rose, three engineers from Qualcomm in Australia who break the attack down mathematically and examine how pairs of strings with the same hash can be constructed [http://eprint.iacr.org/2004/264.pdf].

What does this mean?
Many of the standards for data security on the Internet can now be fooled with little challenge to a motivated party. In other words, a hacker who is able to craft two programs with the same size and the same hash could swap one for the other on a web site. Using just MD5 to check the program, people couldn’t tell the difference between the two programs until they ran the new program and found that their computers now had a virus. This also means that secured transactions could be intercepted and replaced with new data that isn’t correct; many transaction protocols use MD5 to check that the data being sent has not been tampered with.
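The kind of check that can be fooled looks roughly like the sketch below, again using Python's hashlib. The function name matches_published_md5 and the placeholder byte strings are hypothetical; a real download page would publish the hash next to the file.

```python
import hashlib
import hmac

def matches_published_md5(data: bytes, published_hex: str) -> bool:
    """Check downloaded bytes against an MD5 hash published by the site."""
    actual = hashlib.md5(data).hexdigest()
    # compare_digest compares the two hex strings in constant time.
    return hmac.compare_digest(actual, published_hex.strip().lower())

# Placeholder data standing in for a real download.
original = b"contents of the real installer"
published = hashlib.md5(original).hexdigest()

print(matches_published_md5(original, published))          # True
print(matches_published_md5(b"something else", published)) # False
```

The check only compares hashes. A file built to collide with the original would pass just as the original does, and nothing in this code could tell them apart.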

Proposed Solution
One possible solution to this problem is to use multiple hashing algorithms on the same file and compare all of the results. Though this is costly in processor time, it makes it far more likely that bad data is found before it causes problems, because an attacker would have to fool every algorithm at the same time. Currently, many open-source distributors publish MD5 as well as SHA-1 or SHA-2 hash results to show that their programs are legitimate and unaltered.
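Checking several hashes at once could be sketched as follows. This assumes the distributor publishes one hexadecimal hash per algorithm; the function names digests and verify_all are my own, and SHA-256 stands in here for the SHA-2 family.

```python
import hashlib

def digests(data: bytes, algorithms=("md5", "sha1", "sha256")) -> dict:
    """Compute a hex digest of data for each named algorithm."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}

def verify_all(data: bytes, published: dict) -> bool:
    """Return True only if every published hash matches the data."""
    if not published:
        return False  # nothing to check against, so do not trust the data
    actual = digests(data, tuple(published))
    return all(actual[name] == value.strip().lower()
               for name, value in published.items())

# Placeholder data standing in for a real download.
original = b"contents of the real installer"
published = digests(original)

print(verify_all(original, published))          # True
print(verify_all(b"something else", published)) # False
```

A single mismatch in any algorithm rejects the file, so a replacement that collides only under MD5 would still be caught by the SHA-1 and SHA-256 comparisons.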

