Feature Following last month’s announcement of a £1m nationwide spam drop, what now for care.data, the NHS's latest multi-million pound big data project?
Is it, as the carefully managed news release implied, merely taking its time – in fact, delaying a key project by almost a year – so as to nail issues of patient confidentiality? Or is it already in deeper mire, and using data protection issues as a fig leaf to cover up more significant problems with system delivery?
And if it does go bad, will we ever find out why and how (and how much it cost)? Or is the new arms-length NHS wholly immune from parliamentary scrutiny?
Let’s start with the mire.
The theory behind care.data is straightforward enough. Data from all (non-dissenting) UK patients is to be lodged in a central database, from where it may be used for admin purposes, analysed statistically by the NHS, or sold on to select research companies.
It is managed under the auspices of the Health and Social Care Information Centre (HSCIC), part of the new devolved NHS England, and intended to be “a modern information service on behalf of the NHS”, “using information from a patient’s medical record to improve the way that healthcare is delivered for all”.
Moreover: “The service will only use the minimum amount of information needed to help improve patient care and the health services provided to the local community”.
So far, (sounds) so good.
care.data now forms a significant part of the Secondary Uses Service (SUS), initially set up as part of the ill-fated National Programme for IT in the NHS. The ambition of using one major supplier – BT – as national application service provider has now been replaced by an open data platform (ODP) approach. This is in line with the principles of the Government ICT Strategy, which separates technical components into those, such as data storage and processing, that need to be delivered centrally, and apps for turning data into information, which may more sensibly be developed to meet specific needs.
According to the HSCIC, the key components of the platform's architecture look like this:
You want to extract what?
According to a GP toolkit, published earlier this year by NHS England, the system build process starts with the General Practice Extraction Service (GPES), a centrally managed extraction service divided into two parts: ATOS provides the extract query tool, while the extractions will be carried out by the GP practice system suppliers, including EMIS, TPP, Microtest and INPS.
In addition to data set out in the main spec, the GPES will also hoover up personal identifiers: NHS number, gender, date of birth, postcode and ethnicity are among the eight criteria required.
All data will first be uploaded to a Data Management Environment (DME) within the HSCIC. Initial upload may be to HSCIC direct or to one of a number of regional Data Management Integration Centres (DMICs). From there, it is matched to an index file (the HES index file); the initial data upload is then deleted, and further secondary data may be matched in, also using the HES index. Unfortunately, that appears to be most of what is known publicly.
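As a rough illustration of the upload, match and delete flow just described, here is a minimal Python sketch. Everything in it is hypothetical: the `Extract` fields, the `link_and_strip` function and the idea of matching on NHS number plus date of birth are assumptions made for the example, since nothing has been published on how the HES index match actually works.

```python
from dataclasses import dataclass


@dataclass
class Extract:
    """One uploaded GP record (hypothetical shape, not the real GPES format)."""
    nhs_number: str
    date_of_birth: str
    postcode: str
    gender: str
    clinical_codes: list


def link_and_strip(extracts, index):
    """Match each extract to an index key, keep only that key plus clinical
    data, then empty the uploaded batch.

    `index` maps (nhs_number, date_of_birth) to an index key. The choice of
    matching fields is an assumption made for illustration.
    """
    linked = []
    unmatched = 0
    for e in extracts:
        key = index.get((e.nhs_number, e.date_of_birth))
        if key is None:
            unmatched += 1
            continue
        # Identifiers are dropped here; only the index key travels onward.
        linked.append({"index_key": key, "clinical_codes": list(e.clinical_codes)})
    extracts.clear()  # stands in for deleting the initial data upload
    return linked, unmatched


# Made-up records; these are not real NHS numbers or postcodes.
index = {("9990000001", "1970-01-01"): "IDX-1"}
batch = [
    Extract("9990000001", "1970-01-01", "AB1 2CD", "F", ["C10"]),
    Extract("9990000002", "1980-02-02", "EF3 4GH", "M", ["H33"]),
]
linked, unmatched = link_and_strip(batch, index)
```

The point of the sketch is only that, once matching is done, the linked record need carry nothing but an index key and clinical content. Whether the real system behaves this way is exactly what remains unclear.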
This leaves a host of questions, such as:
- Who are the lead providers?
- Have any of them been ejected?
- Are we getting multiple platforms that may or may not be able to interact?
- And the daddy of the lot: How much is care.data costing?
The answer is that, despite our putting all these questions and more to the relevant parties over a period of time, few answers are forthcoming. Nor are we likely to get more information any time soon, for the Health and Social Care Act 2012 (HSCA) that established NHS England also effectively removed that body from parliamentary accountability.
The Department of Health passes such questions directly over to NHS England.
Repeated attempts to elicit comment from the office of opposition Health spokesman, Andy Burnham, MP, have also drawn a blank.
However, according to sources close to the project, technical issues are already surfacing, and not in a good way. The use of local IT providers to build the various DMICs means that while care.data itself is up and running, data loads are not: different local builders mean a range of different systems architectures, and a system that is currently not interoperable.
A second source claims that the GPES itself is not working as it should, and that even if it wished to, HSCIC could not presently commence uploading data. We have asked NHS England for comment on both these claims – but so far no answer. Costs, with or without the impact of any technical glitches, remain a mystery.
How much? Oh, you can't tell us...
Officially, according to NHS England: “We are not yet in a position to provide the full costs of the programme.” They are working with HSCIC to do so and “anticipate” that further information may be forthcoming in the New Year.
However, a good starting point seems to be a briefing paper put out by the Informatics Services Commissioning Group that states that “care.data programme costs will be built on the current costs of the proposed Open Data Platform”, and that the ODP cost “at outline business case is estimated at £33m over three years”.
They add: “This figure, excludes any additional accelerator project costs, which remain to be determined” – though one such cost may be an extra £11.8m of funding that Councils will receive to support the move to a new social care data collection system, which appears to be part of the care.data ecosystem. Add to this the £1m to £2m minimum for the door drop now needed to meet the demands of the Information Commissioner.
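For what it is worth, the figures quoted above can be tallied in a few lines. This is a back-of-the-envelope sketch only: it assumes the three numbers can simply be added together, and that the £11.8m of council funding really does count as a care.data cost, neither of which has been confirmed.

```python
# All figures in £m, as quoted in the briefing paper and above.
# Treating them as additive is an assumption, not an official costing.
ODP_OUTLINE_BUSINESS_CASE = 33.0  # open data platform, over three years
SOCIAL_CARE_FUNDING = 11.8        # council funding; may or may not belong here
DOOR_DROP_LOW = 1.0               # lower estimate for the leaflet drop
DOOR_DROP_HIGH = 2.0              # upper estimate for the leaflet drop


def tally(*items):
    """Sum cost items and round to one decimal place (£m)."""
    return round(sum(items), 1)


low = tally(ODP_OUTLINE_BUSINESS_CASE, SOCIAL_CARE_FUNDING, DOOR_DROP_LOW)
high = tally(ODP_OUTLINE_BUSINESS_CASE, SOCIAL_CARE_FUNDING, DOOR_DROP_HIGH)
print(f"Known so far: £{low}m to £{high}m")  # prints "Known so far: £45.8m to £46.8m"
```

On those assumptions the visible total is somewhere around £46m, before any of the undetermined accelerator project costs are counted.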
Meanwhile, care.data is starting to encounter opposition both for the enormity of what it intends to put in place, and for the somewhat ham-fisted way in which it has proceeded to date. For the vision is clear: under the HSCIC, what was once a simple data warehouse for producing statistical information on patient care is to be transformed into a whole-life system of universal health surveillance.
According to the GP toolkit, the amount of personal, privacy-busting information to be uploaded is massive. Categories of information include diagnoses (anything from diabetes to schizophrenia), health group (including whether a patient smokes or has high cholesterol), interventions and prescriptions.
Those concerned about the scale of information being released might be relieved to learn that “sensitive information” – that’s information relating to subjects such as termination of pregnancy, convictions and domestic violence – is to be omitted. For now.
However, NHS England is keen to “listen to” calls by patient groups to open such data up, since its current omission might be considered “stigmatising”. Or in other words, you ain’t seen nothing yet – and information currently considered too sensitive for inclusion may yet be added.
The first iteration of care.data is also relatively limited, compared to what is already in the pipeline: hospital data is due to be added a year after GP data; and social care data a year after that.
Concerns over confidentiality fall under two heads. First, despite assurances that security is “the most important priority of the HSCIC”, and that care.data “will conform to the same strict standards of data security and confidentiality that have governed the use of HES for many years”, experience suggests otherwise. Government and data security, many would argue, are mutually exclusive things, and history seems largely to prove that where security can be breached, it will be. Given the literally career-changing nature of some of the data soon to be passed around, the risk, critics argue, is not worth it.
Accidents apart, the uses that the HSCIC intends for the data have raised eyebrows. Outputs for release 1 distinguish between a range of aggregate statistics, much as before, and pseudonymous statistics: that’s data anonymised, according to the HSCIC, sufficiently to resist a “jigsaw attack”.
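A jigsaw attack works by combining the fields left in a pseudonymised record with information available elsewhere, until only one person fits. One common way to reason about that risk is to count how many records share each combination of "quasi-identifiers". The Python sketch below does just that. It is an illustration of the general idea, often called k-anonymity, and not a description of whatever test the HSCIC applies; the field names and sample records are invented.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def unique_combinations(records, quasi_identifiers):
    """Return combinations held by exactly one record: the easiest targets
    for anyone able to match them against outside information."""
    sizes = group_sizes(records, quasi_identifiers)
    return [combo for combo, n in sizes.items() if n == 1]


# Invented sample data: no names or NHS numbers, yet one record stands alone.
records = [
    {"birth_year": 1970, "postcode_district": "AB1", "gender": "F", "code": "C10"},
    {"birth_year": 1970, "postcode_district": "AB1", "gender": "F", "code": "H33"},
    {"birth_year": 1985, "postcode_district": "EF3", "gender": "M", "code": "E11"},
]
fields = ["birth_year", "postcode_district", "gender"]

k = min(group_sizes(records, fields).values())  # smallest group size
exposed = unique_combinations(records, fields)
```

Here the smallest group has a single member, so anyone who knows a man born in 1985 living in that postcode district could pick out his record despite the absence of a name. Coarsening or dropping fields until every group holds several people is one standard response.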