Wednesday 1 August 2007

Computing MD5 digest (checksum) in Java

MD5 digests are useful for tracking modifications to a file (or, in fact, to any data). It is very unlikely that two different files will happen to have the same MD5 digest, and extremely unlikely that a minor modification to a file will leave its digest unchanged.
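To make the second claim concrete, here is a minimal sketch that digests two strings differing in a single character. The class name Md5Sensitivity, the helper md5Hex and the sample equations are made up for illustration; only java.security.MessageDigest comes from the JRE. Formatting through BigInteger is just one compact way to get the hex string.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, not part of any library.
final class Md5Sensitivity {
    // MD5 of the text's UTF-8 bytes as "0x" plus 32 uppercase hex digits.
    static String md5Hex(String text) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(text.getBytes(StandardCharsets.UTF_8));
            // BigInteger(1, ...) reads the 16 bytes as an unsigned number;
            // %032X pads with leading zeros to 32 hex digits.
            return "0x" + String.format("%032X", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The two inputs differ in a single character.
        System.out.println(md5Hex("dx/dt = -2x"));
        System.out.println(md5Hex("dx/dt = -3x"));
    }
}
```

Running it should print two digests that look unrelated to each other, even though the inputs are almost identical.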

I use MD5 digests to decide whether a precomputed result is still valid. For example, suppose you simulate a system of differential equations defined in the file "system.xml" and save the result to the file "result.xml". What if the user then modifies the original system of equations? The computed result is no longer correct. The solution is to store the MD5 digest of the original system of equations together with your result. When a result is selected for display, you check whether the digest of the current system of equations matches the one stored with the result. If they don't match, the result is not valid.
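The check described above might be sketched like this. StoredResult, isValidFor and md5Hex are hypothetical names chosen for the example, and the byte arrays stand in for the contents of "system.xml"; how the digest is actually written into "result.xml" is left out.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical holder for a computed result and the digest of its input.
final class StoredResult {
    final String systemDigest; // digest of the equations the result was computed from
    final String resultData;   // the saved result itself

    StoredResult(byte[] systemBytes, String resultData) {
        this.systemDigest = md5Hex(systemBytes);
        this.resultData = resultData;
    }

    // The result is valid only while the current system still has the stored digest.
    boolean isValidFor(byte[] currentSystemBytes) {
        return systemDigest.equals(md5Hex(currentSystemBytes));
    }

    // MD5 of the data as "0x" plus 32 uppercase hex digits.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder("0x");
            for (byte b : digest) {
                hex.append(String.format("%02X", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }
}
```

Before displaying a result you would call isValidFor with the current bytes of the system definition and recompute the result if it returns false.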

There are a number of ways to compute MD5 digests, and you can find hundreds of implementations on the web. I stick to the one that comes with the standard Java Runtime Environment distribution from Sun Microsystems (java.security.MessageDigest). Below is my function, which returns a String representation of the MD5 digest of an arbitrary file:

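import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.MessageDigest;

// Returns the MD5 digest of the file as "0x" followed by 32 uppercase
// hex digits, or null if the digest cannot be computed.
public String checksum(File file) {
    // try-with-resources closes the stream even if reading fails
    try (InputStream fin = new FileInputStream(file)) {
        MessageDigest md5er = MessageDigest.getInstance("MD5");

        byte[] buffer = new byte[1024];
        int read;
        // read() returns -1 at the end of the stream
        while ((read = fin.read(buffer)) != -1) {
            md5er.update(buffer, 0, read);
        }
        byte[] digest = md5er.digest();
        StringBuilder strDigest = new StringBuilder("0x");
        for (byte b : digest) {
            // mask to 0..255 and append as two uppercase hex digits
            strDigest.append(String.format("%02X", b & 0xff));
        }
        return strDigest.toString();
    } catch (Exception e) {
        // covers IOException and NoSuchAlgorithmException
        return null;
    }
}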

And if you are using Eclipse RCP and want to compute the MD5 digest of an IFile object, here you go:

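import java.io.InputStream;
import java.security.MessageDigest;
import org.eclipse.core.resources.IFile;

// Returns the MD5 digest of the workspace file as "0x" followed by
// 32 uppercase hex digits, or null if the digest cannot be computed.
public String checksum(IFile file) {
    // getContents(true) reads the file even if it is out of sync with
    // the workspace; try-with-resources closes the stream on any exit
    try (InputStream fin = file.getContents(true)) {
        MessageDigest md5er = MessageDigest.getInstance("MD5");

        byte[] buffer = new byte[1024];
        int read;
        // read() returns -1 at the end of the stream
        while ((read = fin.read(buffer)) != -1) {
            md5er.update(buffer, 0, read);
        }
        byte[] digest = md5er.digest();
        StringBuilder strDigest = new StringBuilder("0x");
        for (byte b : digest) {
            // mask to 0..255 and append as two uppercase hex digits
            strDigest.append(String.format("%02X", b & 0xff));
        }
        return strDigest.toString();
    } catch (Exception e) {
        // covers CoreException, IOException and NoSuchAlgorithmException
        return null;
    }
}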
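The two functions above differ only in how they open the stream, so one possible refactoring is a single helper that takes any InputStream. This is a sketch rather than the code I actually use; the class name StreamChecksum is made up, and the helper keeps the same conventions (a "0x" prefix, uppercase hex, null on failure).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.security.MessageDigest;

// Hypothetical helper class, not part of any library.
final class StreamChecksum {
    // Reads the stream to its end, closes it, and returns the MD5 digest as
    // "0x" plus 32 uppercase hex digits; returns null on any failure.
    static String checksum(InputStream in) {
        try (InputStream fin = in) {
            MessageDigest md5er = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[1024];
            int read;
            // read() returns -1 at the end of the stream
            while ((read = fin.read(buffer)) != -1) {
                md5er.update(buffer, 0, read);
            }
            StringBuilder strDigest = new StringBuilder("0x");
            for (byte b : md5er.digest()) {
                strDigest.append(String.format("%02X", b & 0xff));
            }
            return strDigest.toString();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for a file here.
        System.out.println(checksum(new ByteArrayInputStream(new byte[0])));
    }
}
```

The File version would then pass new FileInputStream(file) and the IFile version would pass file.getContents(true). Both of those calls can throw, so each caller still needs its own try/catch around opening the stream.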

7 comments:

Anonymous said...

This is the best example I found to checksum a file using Java. There are tons of examples out there but they are all overly complicated. This one is elegant and works the first time without requiring any modification.

Nice work.

Unknown said...

Great and simple example. Just what I was looking for. Thanks

Anonymous said...

Excellent, a simple way of doing it.

Unknown said...

Thank you so much.

Unknown said...

Thanks a lot ... worked as it is...

Anonymous said...

Brilliant, thanks.

Anonymous said...

Yep, great example.
Thanx a lot