Microsoft researchers have rubbished figures from cyber-crime surveys, deeming them subject to the types of distortions that have long bedevilled sex surveys.
It's well enough established that men claim to have more female sexual partners in sex surveys than women claim male partners, a discrepancy that can't be explained by sampling error alone. All it takes is for a few self-styled Don Juans to hopelessly distort the figures.
Similarly, refusal rates and small sample sizes mean that cybercrime surveys tend to be dominated by a minority of respondents, normally those who have lost, or think they have lost, a great deal as a result of hacking or malware attacks, and are vocal about it.
Unverified, self-reported numbers from such people become the basis for loss calculations that are, at best, guesstimates.
"Far from being broadly-based estimates of losses across the population, the cyber-crime estimates that we have appear to be largely the answers of a handful of people extrapolated to the whole population," Microsoft researchers Dinei Florencio and Cormac Herley argue in the paper Sex, Lies and Cybercrime Surveys.
"Cyber-crime, like sexual behavior, defies large-scale direct observation and the estimates we have of it are derived almost exclusively from surveys," they add.
It's only human nature to want to get a handle on the size of any economic or social problem, hence the understandable hunger for figures on cybercrime losses. While you can go some way towards gauging the spread of a virus or the volume of fraudulent phishing emails, trying to map such raw figures to actual losses is fraught with difficulty.
This point is seldom challenged but often overlooked by security industry firms that commission surveys of questionable methodology to talk up cybercrime losses in the hope of selling more kit, essentially by putting the frighteners on potential customers. Some customers, at least, are wise to such ploys and disregard them as simply part of the rough and tumble of the security marketing game.
The problem with cybercrime figures comes when they are presented as methodically researched before being used to lobby either senior corporate executives or politicians for extra security spending and the like. Using these estimates to drive policy decisions, when they are unreliable in the first place, is almost certainly going to steer decisions towards a misguided outcome.
Florencio and Herley note that the sketchy practices that persist in security surveys are among those security-conscious coders are taught to avoid. "'You should never trust user input' says one standard text on writing secure code," they write.
"It is ironic then that our cyber-crime survey estimates rely almost exclusively on unverified user input. A practice that is regarded as unacceptable in writing code is ubiquitous in forming the estimates that drive policy."
The paper (pdf), prepared for the Workshop on the Economics of Information Security, alleges that cyber-crime loss estimates are more often than not largely derived from the unverified self-reported answers of one or two people.
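A rough sketch of how one or two answers can swamp an extrapolated estimate might look like the following. Every figure here is invented for illustration: the 1,000 responses, the single $50,000 claim and the population of 200 million are all assumptions, not numbers from the paper, and the helper names `extrapolate` and `top_share` are hypothetical.

```python
from statistics import mean, median

# Hypothetical survey answers (reported loss in dollars). NOT real data:
# 990 people report no loss, 9 report $100, and one reports $50,000.
responses = [0] * 990 + [100] * 9 + [50_000]
POPULATION = 200_000_000  # hypothetical population size


def extrapolate(sample, population):
    """Scale the sample mean up to a population-wide total."""
    return mean(sample) * population


def top_share(sample, k=1):
    """Fraction of the sample total contributed by the k largest answers."""
    total = sum(sample)
    if total == 0:
        return 0.0
    return sum(sorted(sample, reverse=True)[:k]) / total


estimate_all = extrapolate(responses, POPULATION)
# Drop the single largest answer and extrapolate again.
estimate_without_top = extrapolate(sorted(responses)[:-1], POPULATION)

print(f"Estimate from all answers:       ${estimate_all:,.0f}")
print(f"Estimate without the top answer: ${estimate_without_top:,.0f}")
print(f"Share of total from one answer:  {top_share(responses):.1%}")
print(f"Median reported loss:            ${median(responses):,.0f}")
```

With these made-up inputs, a single unverified answer supplies about 98 per cent of the reported total, and removing it shrinks the extrapolated figure from roughly $10bn to under $200m. The median, which ignores how large the biggest claim is, stays at zero.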
The researchers (correctly) state that others have reached the same dim view of cybercrime guesstimates before starkly concluding: "Cyber-crime surveys… are so compromised and biased that no faith whatever can be placed in their findings". ®