Biometric ID - are they counting on your fingers?
Readers give it the eyeball...
Letters Three ID card-related stories in the past week drew attention to the issues of accuracy and security in biometrics. Slightly unexpectedly, Jerry Fishenden added Microsoft to the ranks of those critical of the UK ID card scheme, worrying that it could actually increase ID fraud. Jerry, whose full article can be read here, is concerned that the existence of a centralised biometric database could have a honeypot effect, while the widespread use of biometrics could spawn "new hi-tech ways of perpetrating massive identity fraud".
Nigel Sedgwick of Cambridge Algorithmica (who's been in touch a lot this week - more shortly) writes:
It's a struggle to understand the relevance of the comments by the MS man.
Just about all of any individual's biometric samples are (as envisaged for the NIDS) on continuous display: face for anyone, iris for IR video capture, fingerprints left all over the place.
The additional risk from the NIR is, as far as I can understand, "just" that a (stolen) copy of the whole biometric database could be used to find people with biometrics sufficiently like those of any particular bad guy/gal (though most likely only of a single biometric modality or instance), that those people could be particularly good targets for identity fraud. However, if the whole NIR database of biometrics is a closely guarded secret, it is not particularly likely to be compromised. Protection approaching that of the Trident launch codes might well be appropriate, or that of our military and diplomatic cipher keys (the old ones still being very useful even following change).
Next, why would anyone be allowed access to more than 1 bit of information on each access to the NIR? That bit being whether the whole of any provided enquiry was actually an adequate match with the information held on the NIR for the person in question (with or without biometrics). With suitable protection against repeated enquiries, it would be difficult (from very to extremely) to build up knowledge on a registered person.
Which are perfectly valid points - you leak your biometrics all over the shop, so they're not exactly secret, and the NIR is intended to be such a critical piece of infrastructure that it at least ought to be highly secure. But we're not sure this really deals with Fishenden's worries. Critical systems secured by biometrics have not yet become a sufficiently big and attractive target for us to know how much of a threat the spoofing of biometrics might present. Nor do we know about the viability of biometrics as strong ID that could be applied on a widespread basis - for example, use of fingerprints in credit card transactions. Note that in this example fingerprint and PIN are used. In the world as it is now, PIN, personal details and personal plastic can all be obtained; so, will biometrics present a difficult extra barrier to surmount, or might it all go horribly wrong, and make the situation far worse? At this stage, Fishenden's worries seem justified to us.
The security of the NIR isn't something there's a lot of point in arguing about before it's built. Critics point to the Government's poor record on IT projects, the Government insists it knows what it's doing and it will be secure, and then you move into the 'oh yes... oh...' panto argument. Nigel's argument, however, we submit, is based on what it would be sensible for the Government to do, rather than what it actually will do. Documentation published by the Home Office for potential contractors suggests to us that what it does will not be wholly sensible. It appears (see Procurement Strategy Market Sounding) to envisage the NIR being hosted by the private sector in two secure locations. It envisages (see Spyblog for some commentary) 256 Government departments and 44,000 private sector organisations being accredited to use the NIR and estimates (on what basis we know not) an initial volume of verifications at 165 million a year.
And it also intends to build a secure Internet login system that will allow all ID card holders to check all of the data held on them in the NIR, simply and easily. Which will mean no biometric check, probably just PIN security for accesses from consumer clients of dubious (we flatter Joe Public here) security. The political need to assuage public fears about Big Brother could conceivably end up building a handy ID theft tool here, and the verification system could turn out to be the most attractive target in the network. Jury still out here too, we feel, with new reasons to worry appearing regularly.
The accuracy of biometric checks has a clear relevance here. Home Office Minister Tony McNulty opened the bidding with confused and confusing claims about the system using 13 biometrics, and therefore being really secure. A helpful soul pointed us at some work done by John Daugman showing that this needn't always be the case. We stress the 'needn't always', because as our mathematician detectors told us before we even published the piece, this one has the potential to become very confusing.
So first, a quick gloss. The Daugman piece we referred to describes how, if you're applying decision rules to two tests independently, then you may find yourself getting a worse aggregate result than if you'd just used the stronger of the two tests. There are however other mechanisms you can use in order to be able to achieve greater levels of accuracy by using multiple tests, and this is discussed by Nigel Sedgwick here.
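To make the gloss concrete, here is a minimal Python sketch of decision-level combination for two tests. It assumes the two tests err independently, and the error rates are illustrative figures of ours, not numbers taken from Daugman's piece. "AND" means a user must pass both tests; "OR" means passing either is enough.

```python
def combine_and(far1, frr1, far2, frr2):
    """Accept only if BOTH independent tests accept."""
    far = far1 * far2                    # an impostor must fool both tests
    frr = 1 - (1 - frr1) * (1 - frr2)    # a genuine user fails if either test rejects
    return far, frr


def combine_or(far1, frr1, far2, frr2):
    """Accept if EITHER independent test accepts."""
    far = 1 - (1 - far1) * (1 - far2)    # an impostor needs to fool only one test
    frr = frr1 * frr2                    # a genuine user fails only if both reject
    return far, frr


# Illustrative (assumed) rates: (false accept rate, false reject rate)
strong = (0.001, 0.001)
weak = (0.01, 0.01)

and_far, and_frr = combine_and(*strong, *weak)
or_far, or_frr = combine_or(*strong, *weak)

print("strong alone, total error:", sum(strong))
print("AND rule, total error:    ", and_far + and_frr)
print("OR rule, total error:     ", or_far + or_frr)
```

With these assumed figures, both rules improve one error rate sharply but leave the total of false accepts plus false rejects higher than for the stronger test used alone, which is the effect described above.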
We'll get back to Nigel (again), but quite a lot of other people had something to say about the matter:
Hello, the article is not really accurate. It is true that combining _2_ tests will result in the average, and so boost the failure rates. The conclusion that using _multiple_ sensors with majority decisions will be even worse (or worse than a single test) is NOT true. (Ask Airbus, NASA ...) Of course these systems could also fail because of the majority failing (see Minority Report ;) ), but the likelihood would be lower than for a single or double system. Of course this is because of the majority decision. Combining multiple tests with AND will lead to higher error rates. Sven Deichmann
Mr Daugman is trying to force water on his watermill.
An obvious solution for this problem is to apply the weaker test after the stronger one, just in order to distinguish between possible multiple identifications. The result of the weaker test is then no longer an independent event and, even better, as the group of possible individuals is much smaller, the probability of correct recognition will improve. Sava Zxivanovich
Olin Sibert raises the issue of forged biometrics (also recently raised by Viisage announcing a 'liveness detector' to stop people using pictures instead of real faces), and the dangers of unattended readers:
John Daugman's math is sound, but the article perpetuates the myth that biometric systems can be characterized in terms of just two numbers: False Accept Rate and False Reject Rate. This is true in an academic setting with no adversaries, but in the real world, one also has to worry about the use of forged biometrics, which is often a much bigger exposure than random false acceptance. A lost or stolen ID card, which has presumably been handled by the very fingers that it will allegedly be verifying, provides ample raw material for simple and inexpensive forgery attacks. Faces and irises are more resistant to this attack (you have to observe the person, not just find his card), but forgery is still quite practical, particularly in situations where the biometric reader is unattended. It's a good thing we don't leave faceprints on everything we glance at. Olin Sibert
We don't? You clearly haven't been on the London Underground recently, Olin.
Your final paragraph hits the spot: what really matters is what you consider "a test" and how you combine tests. The basic principle of "more info is better" always holds as long as the info is combined properly--weighted with confidence limits, mixed in properly, and so on--rather than having each piece of information arbitrarily assigned "pass" or "fail."
For example, imagine each whorl of a fingerprint as a separate chunk of information. It would be a nightmare to decide if each whorl "passes" or "fails" independently; it's clearly much better to put them all together and form a single fingerprint-level pass/fail from the ensemble.
So I'm guessing that if the two biometric measures you speak of were combined properly BEFORE deciding pass/fail, the final answer would be better than either one alone. But giving a separate, equal veto to the weaker measure is clearly (ahem) "sub-optimal."
It's a pleasure to see such important and subtle issues aired in public. Good job!
Eddie Edwards launches a rather more complex explication of the application of three or more biometric tests: In other words, if you use two biometrics, and users must pass on both, then you're going to fail more people (that's the whole point), so more people will be failed wrongly.
It's not so much counter-intuitive as non-obvious. The idea that tightening up one aspect of a process means loosening up a complementary aspect is nothing new - and that this applies to false positive and false negative rates has been suspected by a few of us for some time. For instance, it is widely believed that if you make courts less likely to let guilty people go free, then they will be more likely to jail innocent people. (Note that courts operate on the AND policy - all jurors must agree.)
What is also well understood is the power of the majority vote. With only two biometrics the only sensible combining functions are AND and OR. AND gives the best false positive rate with the worst false negative rate, and OR does the opposite. But with three biometrics we can combine them using the "majority vote" rule, which has something of the best of both worlds. In the majority vote case with three biometrics (see end of mail for derivation), the false negative rate is about as good as the product of the two worst tests. The false positive rate is derived similarly. This means that combining a test with a FP rate of 1/1,000 with another two that have FP rates of 1/100 gives a combined rate of about 1/10,000. (But it also means that combining a 1/100,000 test with two 1/100 tests still gives an overall rate of 1/10,000, so adding biometrics can still make things worse.)
As the number of biometrics goes up, the number of false positive rates that should be multiplied only goes up half as fast, so if you check 9 fingers, you have the worst false positive rate to the power 5 compared to checking only 1 finger, and the reliability becomes very good indeed (e.g. 1/100 goes to 1 in ten billion). (Replace "finger" with "small region of pixels from the iris/fingerprint" and this is more or less how a single biometric scan works in the first place.)
So naively combining a pair of biometrics doesn't work. Naively comparing three biometrics works better. And ten fingers does the trick.
(It also seems that a jury of 25 people each with an equal vote would be as effective at keeping the innocent at liberty as the current system, but would acquit wrongdoers far less often.)
Quite where this leaves the practicality of a national ID card scheme, I don't know!
* A false (positive/negative) occurs for three voters A, B, C for the following four configurations under a majority system:
p(Afalse)*p(Bfalse)*p(Cok) + p(Afalse)*p(Cfalse)*p(Bok) + p(Bfalse)*p(Cfalse)*p(Aok) + p(Afalse)*p(Bfalse)*p(Cfalse)
the total probability then being
(p(Aok)/p(Afalse) + p(Bok)/p(Bfalse) + p(Cok)/p(Cfalse) + 1) * p(Afalse)*p(Bfalse)*p(Cfalse)
The sum in brackets on the left is dominated by (the reciprocal of) the *smallest* error probability. This then cancels with the product on the right, so the overall probability is approximately the product of the two *largest* error probabilities.
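The arithmetic in the letter above can be checked numerically. What follows is a minimal Python sketch, assuming every test errs independently of the others; the function names are ours, not Eddie's. It compares the exact three-test majority figure with the "product of the two largest" approximation, and then computes the exact majority figure for nine tests that each err at 1/100.

```python
from math import comb


def majority_of_three_error(pa, pb, pc):
    """Exact chance that at least two of three independent tests err."""
    return (pa * pb * (1 - pc) + pa * pc * (1 - pb)
            + pb * pc * (1 - pa) + pa * pb * pc)


def two_largest_product(pa, pb, pc):
    """The letter's approximation: product of the two largest error rates."""
    _, mid, high = sorted((pa, pb, pc))
    return mid * high


def majority_of_n_error(n, p):
    """Exact chance that more than half of n independent tests err,
    when each test errs with the same probability p."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(need, n + 1))


# The two three-test cases quoted in the letter
for rates in [(1 / 1000, 1 / 100, 1 / 100), (1 / 100000, 1 / 100, 1 / 100)]:
    print(rates, majority_of_three_error(*rates), two_largest_product(*rates))

# Nine tests at 1/100 each: exact majority error next to the bare fifth power
print(majority_of_n_error(9, 1 / 100), (1 / 100) ** 5)
```

On these assumptions both three-test cases come out at a little over 1 in 10,000, in line with the letter. For nine tests the exponent of five is right, but the exact figure is roughly a hundred times larger than the bare fifth power, because there are 126 different ways of choosing which five tests go wrong.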
Which seems an appropriate point to get back to Nigel Sedgwick. Nigel points out that Daugman's paper itself "acknowledges that score level fusion is not subject to the limitations he describes for decision level fusion", and points to his web site (referenced above) for a detailed explanation. The bottom line of this seems to us to be that "from a fundamental theoretical viewpoint... Any number of suitably characterised biometric devices can have their decision scores combined in such a way that the multi-modal combination is guaranteed (on average) to be no worse than the best of the individual biometric devices. In practice, there will always be an improvement from multi-modal combination." Nigel accepts that there are practical difficulties, but that they should not be viewed as overwhelming, and this is where we possibly begin to part company with him.
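For readers who want to see why score-level fusion behaves differently from pass/fail voting, here is a toy Python sketch. It is a textbook-style model of our own choosing, not Nigel's method: each device is assumed to produce a score that is normally distributed with unit variance, centred on 0 for impostors and on a separation d for genuine users, and the devices are assumed independent. Under those assumptions the best combined score is a sum weighted by each device's separation, and the combined separation is the square root of the sum of the squares. The device separations below are hypothetical.

```python
import math


def equal_error_rate(d_prime):
    """Error rate (false accept = false reject) with the threshold placed
    midway between two unit-variance Gaussian score distributions
    whose means are d_prime apart."""
    return 0.5 * math.erfc(d_prime / (2 * math.sqrt(2)))


def fused_d_prime(d_primes):
    """Separation of the weighted-sum score for independent devices
    under the Gaussian model described above."""
    return math.sqrt(sum(d * d for d in d_primes))


# Hypothetical separations for a stronger and a weaker device (assumed values)
devices = {"strong": 6.0, "weak": 4.0}

individual = {name: equal_error_rate(d) for name, d in devices.items()}
fused = equal_error_rate(fused_d_prime(devices.values()))

for name, rate in individual.items():
    print(name, rate)
print("fused", fused)
```

In this toy model the fused error rate can never exceed that of the best single device, since adding a device can only increase the combined separation. Real devices will not match these assumptions exactly, which is where the practical difficulties come in.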
Nigel points out that the Government has experts on board who know perfectly well what needs to be done in order to deploy multiple biometric tests effectively, and we accept this. But we have problems convincing ourselves that what ought to be done will be done, or that the theoretical ideal can be deployed practically in the field. A standard test using all three biometrics is clearly achievable if it's a low-volume 'special purposes' test, but it can't rationally be deployed (or survive) in a system intended for everyday ID verification in everybody's life. For this, maybe you end up just using a PIN (i.e. no biometrics, and don't we have PIN security already?) and/or a single biometric, possibly fingerprint set to fairly relaxed tolerances. In other circumstances you might use more than one, but then you're starting to get into circumstances in which you're having to vary the way you apply decision rules depending on location, required security level and equipment available. If one component breaks, do you just go ahead with what's left (in which case you're maybe doing weird things to your rules), do you apply a different set of rules, or do you just count everything as broken? How do you cope with the impact of environmental variations (e.g. bad lighting or sweaty fingers) on accept/reject rates?
Factors such as these probably can be planned for, but it seems to us that they will add complexity that might well make the practical difficulties overwhelming. And the Home Office's Market Sounding doesn't inspire confidence here. The limited testing of the technology the Home Office has done so far didn't cover the combining of scores across the three biometric tests, and the Market Sounding roadmap anticipates a smallish test of biometrics lasting three months, commencing once the ID Cards Bill is passed by Parliament. One would hope these tests would include some evaluation of decision making processes in multiple test scenarios, but that's not something we'd put money on. On the upside, it doesn't seem likely to us that the Government would do something as daft as running one general 'front line' test, and then falling back on a second test in the event of failure by the first. Not deliberately, anyway.
We'll leave you with a quote from last week's third reading of the ID Cards Bill, where Home Office Minister Andy Burnham responds to questions about accuracy in multiple tests, and demonstrates how fully on top of the facts he is: "We will proceed to a full technology trial if the Bill receives Royal Assent. In the interim, I refer the hon. Gentleman to the report by the National Physical Laboratory, which examined the matter in detail and concluded that biometric systems could be used in the way in which we propose [it didn't - Ed]. I also refer him to experiences in the United States, where such systems are already in widespread use [they're not - not on this scale, in this way - Ed]. The technology is not new and coming to us only now, but established and used today throughout the world to prove people's identities." ®