This article is more than 1 year old

No hack needed: Anonymisation beaten with a dash of SQL

Melbourne researchers warn government: don't publish data down to the individual, ever

Governments should not release anonymised data that refers to individuals, because re-identification is inevitable.

That's the conclusion from Melbourne University's Dr Chris Culnane, Dr Benjamin Rubinstein and Dr Vanessa Teague, who have shown that the Medicare data the Australian government briefly published last year can be re-identified – trivially.

The researchers demonstrated last year that the (hopefully deprecated) formula the government used to create "anonymous" identifiers for personal data was easily reversible.

The paper, here [PDF], examines the same data set that brought the wrath of Australia's sysadmin-in-chief, er, attorney general George Brandis, who proposed legislation (not yet passed) to criminalise unauthorised research into re-identification.

The researchers explained that there are simply too many facts in a data release to properly protect individuals' data.

Speaking to El Reg today, Dr Teague emphasised that from an academic point of view, nothing the trio was doing was either new or sophisticated.

“What this shows: de-identification of detailed individual records about people doesn't work,” she said.

As Dr Culnane said in the University of Melbourne's media release, “We found that patients can be re-identified, without decryption, through a process of linking the unencrypted parts of the record with known information about the individual such as medical procedures and year of birth.”

“Without decryption” is also an important point: there's no “hacking” involved here, and as Dr Teague told us, there's not even much by way of analysis.

Year of birth is important (and for most people easily found), “because the database index is tagged with your year of birth.”

With “one or two surgeries on particular dates, or knowing one or two unusual prescriptions,” Dr Teague said, “I can write a very simple database query to identify you”.
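To give a sense of how little is involved, here is a minimal sketch of that kind of linkage query, run against an in-memory SQLite table. Everything in it is an assumption made for illustration: the `claims` table, its columns, the tokens and the rows are all synthetic and hypothetical, and do not reflect the schema or contents of the actual Medicare release or the researchers' own method.

```python
import sqlite3

# Hypothetical schema and synthetic rows; NOT the real Medicare release.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (patient_token TEXT, year_of_birth INTEGER, "
    "service_date TEXT, item TEXT)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?)",
    [
        ("a91f", 1975, "2012-03-14", "knee surgery"),
        ("a91f", 1975, "2013-08-02", "appendectomy"),
        ("a91f", 1975, "2014-01-20", "GP visit"),
        ("c07d", 1975, "2012-03-14", "knee surgery"),
        ("c07d", 1975, "2013-11-09", "GP visit"),
        ("e553", 1982, "2013-08-02", "appendectomy"),
    ],
)


def candidates(year_of_birth, known_events):
    """Return the tokens whose records contain every known (date, item) event."""
    query = (
        "SELECT patient_token FROM claims "
        "WHERE year_of_birth = ? AND service_date = ? AND item = ?"
    )
    matches = None
    for date, item in known_events:
        rows = conn.execute(query, (year_of_birth, date, item)).fetchall()
        tokens = {row[0] for row in rows}
        # Each extra known fact can only shrink the candidate set.
        matches = tokens if matches is None else matches & tokens
    return sorted(matches or [])


# One known surgery plus year of birth still leaves two candidates.
one_fact = candidates(1975, [("2012-03-14", "knee surgery")])
# A second known surgery narrows it to a single token.
two_facts = candidates(
    1975,
    [("2012-03-14", "knee surgery"), ("2013-08-02", "appendectomy")],
)
print(one_fact)
print(two_facts)
```

No identifier is decrypted at any point. Once a single token remains, every other row carrying that token, including ones the searcher knew nothing about, belongs to the same person.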

Open government boundaries

Dr Teague said the simplicity of re-identification is a wake-up call for a debate about the limits on what governments release as open data, such as health, tax, welfare, or census records.

In short: while publishing aggregate data (“14,000 births in Victoria”, for example) is safe, individual records should be protected.
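The difference between the two kinds of release can be sketched with a few lines of Python. The records below are synthetic and the field choice (year of birth, state, procedure date) is an assumption for illustration only; the point is simply to compare a per-state count with the number of rows that are unique on their combination of fields.

```python
from collections import Counter

# Synthetic, hypothetical records: (year_of_birth, state, procedure_date).
records = [
    (1975, "VIC", "2012-03-14"),
    (1975, "VIC", "2012-03-14"),
    (1975, "VIC", "2013-08-02"),
    (1982, "NSW", "2013-08-02"),
    (1982, "NSW", "2014-01-20"),
]

# Aggregate release: one count per state, with no row-level detail.
per_state = Counter(state for _, state, _ in records)

# Individual-level release: count how many rows share each full combination.
group_sizes = Counter(records)
unique_rows = [row for row, n in group_sizes.items() if n == 1]

print(dict(per_state))
print(len(unique_rows), "of", len(records), "rows are unique on these fields")
```

In this toy data the per-state totals say nothing about any one person, while three of the five individual rows are already unique on just three fields. Real releases carry far more fields per row, which is why uniqueness tends to be the norm rather than the exception.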

An individual record is “not something that can be put back in the box after it's been on the Internet … What this shows: de-identification of detailed individual records about people doesn't work.”

Researchers, she said, should only have access to that level of research data in a secure environment, and those researchers need to have it drummed into them that the data is re-identifiable.

“The idea that the government can make open all the data about people is just wrong.”

She added that the government's attempt to prohibit re-identification research (the legislation has not yet passed) was “a misguided effort” that “prohibited the public demonstration that there is a problem, but didn't address the problem.

“That's not good for improving the science of privacy, and it's not good for public debate.” ®

 
