Tools for Data-Mining

The Sunlight Foundation has been tracking how fast the federal government's releasing data sets to the public. They say it's a record pace now, and they outline some of the problems I've raised here before:

As of today, about halfway through the first quarter, government is already on pace to beat its Q3 2009 record of 308 datasets. Since June of last year, Government has been releasing data at a pace of 4 datasets per day....A new problem is starting to arise-- classifying and organizing this information. Much of this data, for instance, may very well be unuseful to most people. And the rate of esoteric data that agencies push out will make it more difficult to find the proverbial diamonds in the rough.

Fortunately, they're testing out a tool to make sorting through all those sets easier. If you give it a try, let me know how it works. I've long thought open-source development is going to be the key to making data dumps useful. Tools like this, which make it easier for curious folks to get that work started, are extremely important.