Updated The Global Alliance for Genomics & Health has downplayed vulnerabilities found in its genome-sharing network by two Stanford researchers.
Carlos Bustamante and Suyash Shringarpure, postdoctoral scholars in genetics at Stanford, had raised concerns about the security of the Beacon Project in a paper showing how trivial it is to re-identify individuals whose data is held on it.
However, the Global Alliance for Genomics & Health (GA4GH), which is responsible for running the network, has said that re-identification of individuals is only possible in the "exceptional scenario" where an attacker already has access to the victim's genome – or that of a close relative – and as such is not a vector for further malicious action.
An attacker could obtain a victim's genome sequence "directly from your saliva or other tissues, or from a popular genomic information service", according to Stanford's initial statement. The vulnerabilities disclosed would allow a hacker possessing such information "to see if [their victims] appear in a database of people with certain medical conditions, such as heart disease, lung cancer, or autism."
"The human genetics community needs robust protocols that enable secure sharing of genomic data from participants in genetic research", stated the researchers, whose paper, titled "Privacy Risks from Genomic Data-Sharing Beacons", was published in The American Journal of Human Genetics last Thursday.
"Through simulations, we showed that in a beacon with 1,000 individuals, re-identification is possible with just 5,000 queries. Relatives can also be identified in the beacon.
"Re-identification is possible even in the presence of sequencing errors and variant-calling differences. In a beacon constructed with 65 European individuals from the 1000 Genomes Project, we demonstrated that it is possible to detect membership in the beacon with just 250 SNPs.
"With just 1,000 SNP queries, we were able to detect the presence of an individual genome from the Personal Genome Project in an existing beacon.
"Our results show that beacons can disclose membership and implied phenotypic information about participants and do not protect privacy a priori."
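To illustrate why yes/no answers alone can leak membership, here is a minimal Python sketch of the idea. It is a toy simulation, not the statistical test used in the paper: the function names, the population size, the variant frequencies and the simple "fraction of yes answers" score are all assumptions made for illustration, and sequencing errors are ignored.

```python
import random

def random_genome(rng, carrier_freqs):
    """Return the set of variant indices one simulated person carries."""
    return {i for i, f in enumerate(carrier_freqs) if rng.random() < f}

def make_beacon(genomes):
    """Return a yes/no oracle: does at least one member carry this variant?"""
    present = set().union(*genomes)
    return lambda variant: variant in present

def yes_fraction(beacon, target, n_queries, rng):
    """Query up to n_queries variants the target carries; return the share of 'yes'."""
    if not target:
        raise ValueError("target genome carries no variants to query")
    queries = rng.sample(sorted(target), min(n_queries, len(target)))
    return sum(beacon(v) for v in queries) / len(queries)

# Hypothetical toy population: 10,000 rare variants, 200 beacon members.
rng = random.Random(0)
carrier_freqs = [rng.uniform(0.0005, 0.005) for _ in range(10_000)]
members = [random_genome(rng, carrier_freqs) for _ in range(200)]
outsider = random_genome(rng, carrier_freqs)  # same population, not in the beacon
beacon = make_beacon(members)

member_score = yes_fraction(beacon, members[0], 250, rng)
outsider_score = yes_fraction(beacon, outsider, 250, rng)
print(f"member:   {member_score:.2f}")
print(f"outsider: {outsider_score:.2f}")
```

In this error-free toy model a member's own variants are always present, so a member scores 1.0 by construction, while an outsider's rare variants are often missing from a small beacon. A real attack has to tolerate sequencing errors and variant-calling differences, so it cannot rely on a perfect score.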
The paper additionally discussed risk mitigation through potential policies and standards. However, as GA4GH acknowledged in its response to the paper, anonymous pings of genetic beacons are still possible, which widens the attack surface available to malicious adversaries querying the network.
Additionally, minimum beacon sizes, which would exponentially increase the resources an attacker needs to identify particular persons, have yet to be implemented. GA4GH stated:
"In most contemporary circumstances, if someone has already obtained a person's genetic sequence elsewhere, there is not much more information to be gained by learning that this sequence also appears in a Beacon database."
However, the organisation recognises that "it is possible to obtain, perhaps illegally, a person’s genome without other information".
"In those scenarios, learning that the genome is among those present in a specific institutional Beacon database could reveal sensitive information.
"For example, if a database is almost exclusively associated with a known phenotype, discovering that an individual’s genome is in the database may allow inferences about the individual’s phenotype."
The organisation has stated that its mitigation efforts "adhere to the best practices outlined in the GA4GH Privacy & Security Policy", a good-faith policy which allows organisations to implement their own risk-management programmes.