Triple-headed NHS privacy scare after hospital data reach marketers, Google
'Pseudonymised' data mapped, stored in Google BigQuery, sparking offshore data panic
The UK's National Health Service (NHS) and the NHS Information Centre are riding out a three-pronged privacy storm.
The first privacy incident starts with this PA Consulting document titled “Placing the patient at the centre of healthcare: PA report on the future of healthcare.”
On page eight, in a section titled “The cloud can transform the way the NHS connects and uses data”, the discussion turns to “an archive called Hospital Episode Statistics (HES)” that contains “a huge amount of detailed data” about the activity of “every Hospital in England.” The data set occupied a one-terabyte disk drive, and as PA Consulting tried to ready it for analysis it found it “took several hours” just to load it into “a traditional Microsoft SQL database”.
In an attempt to hasten analysis of the data, here's what happened next:
“The alternative was to upload it to the cloud using tools such as Google Storage and use BigQuery to extract data from it. As PA has an existing relationship with Google, we pursued this route (with appropriate approval). This showed that it is possible to get even sensitive data in the cloud and apply proper safeguards.”
The results of this approach were good, from a technical point of view at least. PA's people report that “queries that took all night on our servers were returned in under 30 seconds using BigQuery” and “Within two weeks of starting to use the Google tools we were able to produce interactive maps directly from HES queries in seconds.”
Early this week, the PA Consulting document was brought to the attention of Sarah Wollaston, MP for Totnes, Brixham and the South Hams. Here's her reaction to the news.
A privacy panic has ensued, with PA Consulting accused of sending personal data offshore and imperilling privacy.
The NHS Information Centre's response to Wollaston came in this statement, which offers this explanation:
“The NHS Information Centre (NHS IC) signed an agreement to share pseudonymised Hospital Episodes Statistics data with PA Consulting in November 2011.
This included Hospital Episode Statistics on Admitted Patient Care (1999/00 to Provisional 2011/12), Outpatient (2003/4 to Provisional 2011/12) and A&E (2007/8 to Provisional 2011/12). This agreement lasted to November 2012 and was amended in December 2012 to extend to November 2015.
The agreement obliged PA Consulting to abide by conditions to protect the confidentiality of the data, including restricting the data to a named list of individuals, a prohibition on sharing any information with risk of identifying individuals and a requirement to destroy the data after the agreement end date.
PA Consulting used a product called Google BigQuery to manipulate the datasets provided and the NHS IC was aware of this. The NHS IC had written confirmation from PA Consulting prior to the agreement being signed that no Google staff would be able to access the data; access continued to be restricted to the individuals named in the data sharing agreement.”
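For readers wondering what “pseudonymised” can mean in practice, here is a minimal sketch of one common approach: replacing a direct identifier with a keyed hash, so that records for the same person still link together without exposing the original value. This is an illustration only. Neither statement says how HES data is pseudonymised, and the key, field names and records below are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key, assumed to be held only by the data controller.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed hash (HMAC-SHA256) standing in for an identifier."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Made-up records; the field names are illustrative, not the HES schema.
records = [
    {"patient_id": "A123", "episode": "outpatient"},
    {"patient_id": "A123", "episode": "a_and_e"},
    {"patient_id": "B456", "episode": "admitted"},
]

# Drop the direct identifier and keep only the pseudonym.
pseudonymised = [
    {"pseudo_id": pseudonymise(r["patient_id"]), "episode": r["episode"]}
    for r in records
]
```

The same input always yields the same pseudonym, which is what lets analysts follow one patient across episodes. It is also why pseudonymised data is not the same thing as anonymous data: whoever holds the key, or enough other detail about a person, may be able to work backwards.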
The second incident has been highlighted by privacy group Medconfidential, which in this post leads users to claims made by geographical information systems and marketing outfit Beacon Dodsworth to the effect that it can use HES data to “better understand the health needs of local communities and populations and identify trends and patterns in order to target health improvement more effectively.”
Beacon Dodsworth has updated the page we've linked to above with the following text:
“Following recent concerns regarding our access to HES data and its use in P2People & Places, we would like to clarify that we have never had direct access to the raw HES data.”
The third privacy incident concerns an outfit called Earthware, which writes here about “a demo online map we had created to demonstrate how HES data might be displayed in a mapping environment.”
That map quickly earned the ire of the HSCIC, which asked for it to be taken down. Earthware immediately withdrew the map from its website and popped out this statement:
“Earthware would like to clarify the following:
- The map displayed mock data held by a third party who provided this data to Earthware via a web API
- We do not hold nor have we ever held HES data on our servers
- No patient identifiable data was ever displayed on the map
Earthware are confident that we have not breached any legal or regulatory rules regarding the licencing or publication of HES data.
We will continue to co-operate fully with the HSCIC if required.”
The HSCIC has issued its own statement about the incident, writing “We are investigating urgently the source of the data used by Earthware UK and whether controls demanded of any organisation using data have been maintained. After this investigation we will take any necessary action.”
A busy time, then, for HSCIC's spin doctors and yet more evidence that health data is travelling rather further, and faster, than anyone imagined. ®
PA Consulting has contacted The Reg and told us the data it holds "does not contain patient name, address, NHS number or Date of Birth" and that it has "followed all the conditions specified by the NHS such as the small numbers rule and giving access to the underlying data to others."
"We applied for access for up to 12 people," the missive continues, "but in practice only four people have regularly accessed the information."
The company also explained the project by saying "Over the past two years we have run a project to show the NHS how insight can be quickly and cost-effectively generated from large volumes of health data, enabling better care for patients. PA signed a data sharing agreement to gain access to the Hospital Episode Statistics dataset from the Health and Social Care Information Centre. The dataset does not contain information that can be linked to specific individuals and is held securely in the cloud in accordance with conditions specified and approved by HSCIC. Access to the dataset is tightly controlled and restricted to the small PA project team."
This kind of work, it added, has potential to trim health care costs.
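The “small numbers rule” PA mentions refers to suppressing low counts before publication, so that a handful of patients in one area cannot be singled out. A rough sketch of the idea follows. The threshold of five, the function name and the area counts are assumptions made for illustration, not PA's or the HSCIC's actual implementation.

```python
from typing import Dict, Optional


def suppress_small_counts(
    counts: Dict[str, int], threshold: int = 5
) -> Dict[str, Optional[int]]:
    """Replace counts from 1 to threshold (inclusive) with None.

    Zeros and counts above the threshold are passed through unchanged.
    """
    return {
        area: (None if 1 <= n <= threshold else n)
        for area, n in counts.items()
    }


# Made-up counts of episodes per area.
area_counts = {"Area A": 0, "Area B": 3, "Area C": 5, "Area D": 42}

# Small non-zero counts are masked before anything is published.
published = suppress_small_counts(area_counts)
```

Masking the figure as `None` rather than rounding it keeps a suppressed cell visibly distinct from a genuine zero.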