Linked Data and Statistics
The semantic web seems to be dominating my thinking at work at the moment. Paul Richards and I attended the SDMX Expert Group in Geneva last week, where we presented the outcome of the Sunningdale Workshop we organised on Publishing Statistics in SDMX and the Semantic Web to the SDMX community. We received some very positive comments, and there is real recognition that the community should take advantage of semantic web developments (and that it has valuable experience and expertise to contribute).
I used Prezi again, which is open and shareable. Is this better than PowerPoint, or more distracting?
Part of the purpose of the presentation was to boost awareness of the community that has grown up around, and continues to develop, the ideas formed in Sunningdale. We’re already up to 60 members and still growing. We seem to have made real progress, but we have yet to prove that there is a good reason to break apart the observations that make up a dataset into countless RDF triples. I see good reasons to expose dimensions and other dataset metadata as linked data, to make them findable and linkable.
But why disassemble the actual dataset?
This seems a bit like separating all the words of a document so that they can be reassembled by a client application. The effort to make sure that the resulting document is the same as the original is substantial, and of doubtful value. I think we are close to showing that it can be done with datasets, though perhaps not very safely. I’m not sure we’ve yet shown why we should do it, but I’m happy to go along with a number of experiments that might shed some light on the matter.
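To make the round-trip worry concrete, here is a minimal sketch in plain Python, with no RDF library involved. The observations, the property names (`area`, `period`, `value`) and the figures are all made up for illustration, and the "triples" are simple (subject, predicate, object) tuples rather than real RDF. It breaks a tiny dataset into triples, shuffles their order as a triple store might, reassembles them, and checks that the result matches the original.

```python
from collections import defaultdict

# Hypothetical observations with made-up figures: each dict is one
# cell of a statistical dataset, identified by an invented "id".
observations = [
    {"id": "obs1", "area": "UK", "period": "2009", "value": 100},
    {"id": "obs2", "area": "UK", "period": "2010", "value": 105},
]

def to_triples(rows):
    """Disassemble each observation into (subject, predicate, object) tuples."""
    triples = []
    for row in rows:
        subject = row["id"]
        for key, val in row.items():
            if key != "id":
                triples.append((subject, key, val))
    return triples

def from_triples(triples):
    """Reassemble observations by grouping triples on their subject."""
    grouped = defaultdict(dict)
    for subject, predicate, obj in triples:
        grouped[subject][predicate] = obj
    # Sort by subject so the output order does not depend on triple order.
    return [{"id": s, **props} for s, props in sorted(grouped.items())]

triples = to_triples(observations)

# A store gives no ordering guarantee, so reverse the triples to mimic that.
rebuilt = from_triples(reversed(triples))

# The client has to verify that nothing was lost or altered on the way.
round_trip_ok = rebuilt == observations
```

Even in this toy case the client needs a grouping step and an equality check, and it only works because every observation carries a stable identifier. A real dataset would add attributes, missing values and units, which is where the effort and the safety concerns come in.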
