Untangling the semantic web

Southampton is pushing to be the go-to place for expertise on linked data in the UK, and researchers at its main university launched a site earlier this month containing no less than 21 "non-confidential" datasets that underline that semantic web desire.
The University of Southampton (UoS) is one of the first academic institutions in Blighty to follow in the footsteps of its neighbour – map-making agency the Ordnance Survey, which released some open datasets in April 2010. Indeed, some of the city's boffins are dead keen to put a linked data strategy for government, academic and public sector organisations on the map, if you'll pardon the pun. However, finding wider enthusiasm expressed in Southampton for Sir Tim Berners-Lee's relatively newborn and somewhat niche software modeling system remains a big challenge.
After all, there's a lot of baggage in the way of all that juicy data. Genuine enthusiasts for linked data, which was a term coined by Berners-Lee back in 2006, need to first brush aside web standards' arguments and political grandstanding among MPs apparently desperate to push a "transparency" agenda. On top of that, they also have to work around the big social content farms and closed data silos, like Facebook and chums, whose vision of an "open web" has massively undermined what is now considered an almost redundant term.
Mark Zuckerberg has famously declared that he "is trying to make the world a more open place" with his social network site Facebook. Recently, Berners-Lee publicly questioned Zuck's supposed claim.
"The sites [Facebook, LinkedIn, Friendster and others] assemble these bits of data into brilliant databases and reuse the information to provide value-added service – but only within their sites. Once you enter your data into one of these services, you cannot easily use them on another site. Each site is a silo, walled off from the others," he said in the December 2010 issue of Scientific American.
"Yes, your site's pages are on the web, but your data are not. You can access a web page about a list of people you have created in one site, but you cannot send that list, or items from it, to another site."
The likes of the UoS linked data team are fighting that silo effect by freeing up useful datasets to help the university better cope with its non-confidential student bureaucracy. Some would argue that their efforts should be applauded, even if questions remain about how such a model for making data more accessible online might eventually be used to link big, unwieldy government datasets together in a truly meaningful way. Others might complain that trust and privacy could take an almighty blow online if such supposedly non-sensitive data, even if considered entirely vanilla and "out there in HTML form anyway", were so easily opened up on a grand scale to all comers.
"What we need is an information shaman," explains Christopher Gutteridge, a member of the technical staff at the UoS. Gutteridge has worked closely with big data researchers for a long time and launched the institution's linked data site with a small team earlier this month.
"Bring the data back for everyone and then it's useful, and you don't just bring it back in silos," he says.
But he acknowledges that the linked data model isn't for everyone.