The U.S. government is beginning to post its vast collection of data sets online. At the moment, there are only 47 data sets posted at data.gov, and most of these are geological or weather related. However, it won’t be long (I’m told) before data of greater interest to historians begin to appear. I, for one, can’t wait for census data to begin showing up on this site, so that I no longer have to rely on other, more cumbersome points of access to the census. My hope is that data.gov will eventually include not only the 2000 census, but all of the census data collected by the federal government. Talk about a treasure trove for historians!
Around the world, census authorities are posting more and more raw data, both in full and in various summary forms. At present most of these data are recent, but soon we can expect to see historical data sets as well. One thing I like about the data.gov data sets is that many are published in a variety of formats, including, for instance, Google Earth overlays. So, for example, if you want to know how many earthquakes there have been in any part of the world in the past seven days, you can download the file and take a peek. Here’s a look at Alaska.
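For readers who would rather count than look, here is a minimal Python sketch of the same question: how many earthquakes fall inside a given part of the world. It assumes the downloaded file is a CSV with columns named `latitude` and `longitude`; those column names, and the function name `count_quakes_in_box`, are my own assumptions for illustration and not something data.gov promises.

```python
import csv
import io


def count_quakes_in_box(csv_text, south, north, west, east):
    """Count rows whose epicenter falls inside a latitude/longitude box.

    Assumes (hypothetically) that the CSV has 'latitude' and 'longitude'
    columns in decimal degrees. Rows that are missing those values, or
    that hold non-numeric values, are skipped rather than counted.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    count = 0
    for row in reader:
        try:
            lat = float(row["latitude"])
            lon = float(row["longitude"])
        except (KeyError, ValueError, TypeError):
            continue  # malformed or incomplete row
        if south <= lat <= north and west <= lon <= east:
            count += 1
    return count
```

To approximate Alaska you would read the downloaded file into a string and pass a rough bounding box of your own choosing. A simple box like this cannot handle a region that crosses the 180° meridian, which matters for the western Aleutians.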
But what if your interest were in changing patterns of infant mortality in Europe compared to levels of industrialization (say, steel production) over time? Once these data are available, enterprising historians, geographers, sociologists, and economists will start to play with them, and instead of earthquakes we’ll be able to see graphical representations of the relationship between things like mortality and industrialization. Of course, this will require some rather unprecedented cooperation between social scientists who aren’t so used to talking to one another, but I suspect that a passion for data is something many of us share, and that it will become a way to bridge our disciplinary divides.
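To make the mortality and industrialization idea a little more concrete, here is a rough Python sketch of the first step such a comparison would need: lining up two yearly series on the years they share and measuring how closely they move together. The function names `align_by_year` and `pearson` are hypothetical, and I am assuming each series arrives as a plain mapping from year to value, which real historical data sets almost certainly will not do without some cleaning first.

```python
from math import sqrt


def align_by_year(series_a, series_b):
    """Keep only the years present in both series, in chronological order.

    Each series is assumed to be a dict mapping year -> numeric value.
    """
    years = sorted(set(series_a) & set(series_b))
    return years, [series_a[y] for y in years], [series_b[y] for y in years]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

Given one mapping for infant mortality and another for steel production, you would call `align_by_year` on the pair and pass the two resulting lists to `pearson`. A single coefficient says nothing about causation, of course, but it is the kind of number a chart of the two series would be built around.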
Heartily agree – check out NHGIS – data.gov needs to become a directory for all public datasets and a portal for semantically linked, machine-readable data.