Putting government data online
Following the appointment of Tim Berners-Lee as advisor to the government on how to make its data more widely available, it is interesting to see what has been happening in the US, and also to read some of Tim’s early thoughts on what we should be doing.
This is clearly not a lonely journey: the UK is following the US data.gov initiative, and Australia is heading down the same path with an initiative similar to the Power of Information Task Force. It was a recommendation in the Task Force's final report that led to the focus on making data available.
For National Statistics, this raises some interesting questions. Should National Statistics be part of such an initiative, or might that damage the independence of the UK Statistics Authority? How can we take advantage of the heightened focus on using data while preserving the distinction between data that has been published under the code of practice and other data that should be treated more cautiously? With all the excitement surrounding data mashups and the power of mixing data from multiple sources, it would be a pity if the quality dimension got left behind. Does data need a watermark so that sources can be traced, and conclusions drawn from mixed sources can be given the caution they deserve?
Common standards will be needed, but as ever, the problem with standards is that there is too much choice.
The international statistics community has settled clearly on SDMX, which is backed by the UN, OECD and Eurostat and is being followed by almost every leading National Statistics Institute (including ONS). This could be an opportunity for SDMX to become the predominant standard for published aggregate data across government, but it is not yet widely known beyond the statistical community. A prototype repository at ONS serving up SDMX datasets is showing encouraging results, and we are already engaging with a wider community in the development of a suitable API.
Broader adoption of SDMX would encourage a larger community of developers, and it would be great if ONS and other NSIs could seed this with some further open source assets.