Count what, exactly?
The ability to accurately count phish and compare results with previous months depends on a basic definition: what should you count? For example, on 29 December 2006, NANAS recorded 17 phishing email sightings. Some of these NANAS posts were for phish received by the recipient up to three days prior (not everyone posts to NANAS immediately). The 17 postings represented six companies: Bank of America (nine sightings), Fifth Third Bank (three sightings), Halifax (two sightings), Nationwide, Western Union, and PayPal (one sighting each). Yet many of these sightings likely belong to the same mass mailing. For example, both Halifax sightings used the same email content and the same phishing server. This is one mass mailing counted twice. The 17 sightings likely represent eight distinct mass mailings (three for Bank of America).
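The difference between counting sightings and counting mass mailings can be sketched in a few lines of Python. This is only an illustration, assuming that two sightings belong to the same mailing when they target the same company, carry the same whitespace-normalized body, and point at the same phishing server. The sample records, the `mailing_key` and `group_mailings` helpers, and the `.example` host names are all hypothetical and are not taken from NANAS data.

```python
import hashlib
from collections import defaultdict

# Hypothetical sightings: (targeted company, email body, phishing server).
sightings = [
    ("Bank A", "Dear customer, please confirm your details.", "phish1.example"),
    ("Bank A", "Dear customer,  please confirm your details.", "phish1.example"),
    ("Bank A", "Your online access has been suspended.", "phish2.example"),
    ("Bank B", "Your account has been limited.", "phish3.example"),
]


def mailing_key(company, body, server):
    """Fingerprint a sighting (assumed rule: same company, content, server)."""
    # Lowercase and collapse whitespace so trivial reformatting does not
    # split one mailing into several.
    normalized = " ".join(body.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return (company, digest, server)


def group_mailings(records):
    """Group sightings that share a fingerprint into one mass mailing."""
    groups = defaultdict(list)
    for record in records:
        groups[mailing_key(*record)].append(record)
    return groups


groups = group_mailings(sightings)
raw_count = len(sightings)    # every posted sighting
mailing_count = len(groups)   # distinct mass mailings under the assumed rule

print(f"{raw_count} sightings, {mailing_count} distinct mass mailings")
```

A real classifier would need a fuzzier comparison than an exact hash, since phishers often vary the content slightly within one mailing. The point is only that the two totals answer different questions.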
Like spammers, phishers do not send out one email; they send hundreds of thousands of emails. When groups like the APWG, Websense (PDF), Ironport (PDF), and even the Federal Trade Commission (PDF) release numbers about phishing and spam, you need to ask yourself: are they counting the raw number of emails, or the number of mass mailings?
As an aside, note that in the APWG Sept-Oct 2006 report, they state that they measure individual phishing campaigns based on unique subject lines. This does not take into account mass mailing tools that randomly modify the subjects in each email. This method also incorrectly assumes that subjects commonly used in different mass mailings (e.g., "Security Measures") are actually the same mass mailing.
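Both failure modes of subject-line counting can be shown with a small Python sketch. The records below are invented for illustration: each pairs a subject line with a hypothetical ground-truth mailing label (`"A"`, `"B"`, `"C"`) that a real analyst would not have. Mailings A and B share a common subject, while mailing C randomizes its subject in each email.

```python
# Hypothetical emails: (subject line, true mailing label).
emails = [
    ("Security Measures", "A"),
    ("Security Measures", "B"),      # different mailing, same common subject
    ("Account notice #4821", "C"),
    ("Account notice #9377", "C"),   # same mailing, randomized subject
    ("Account notice #1150", "C"),
]

# Campaign count as measured by unique subject lines.
campaigns_by_subject = len({subject for subject, _ in emails})

# Campaign count according to the (assumed) ground truth.
actual_campaigns = len({mailing for _, mailing in emails})

# Subjects that hide more than one real mailing behind a single subject line.
mailings_per_subject = {}
for subject, mailing in emails:
    mailings_per_subject.setdefault(subject, set()).add(mailing)
merged_subjects = [s for s, m in mailings_per_subject.items() if len(m) > 1]

print(campaigns_by_subject, actual_campaigns, merged_subjects)
```

Here the subject-based count is four while the assumed true count is three. The randomized subjects inflate the total and the shared subject deflates it, so the two errors do not reliably cancel.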
Consider this alternate example: In 2006, two companies lost laptops that contained personal information. One company lost ten laptops, while the other lost six. Which is worse? At face value, ten is worse than six.
However, consider some additional information. The ten laptops were all stolen at once, while the six were stolen on three separate occasions. Based on this information alone, which is worse? Six, because it shows an ongoing pattern rather than one big mistake.
Note that I am intentionally ignoring the data loss in this example - HCA Inc compromised 7,000 people when 10 laptops were stolen, while Ernst & Young compromised hundreds of thousands of people across three separate incidents. In this case, yes - losing six laptops was worse than losing ten.
This example is analogous to the reporting of phishing and spam trends. Are 800 phish sightings bad? How many mass mailings does that represent, and how many victims are estimated? What percentage of the total do the 800 sightings represent? Just as raw values give a sense of scope, the size of each incident, the number of incidents, and the estimated effectiveness of each mailing campaign also provide valuable information needed to assess risk.
With the explosive growth in identity theft, increase in botnets for spam and network attacks, and the rise in zero-day exploits (PDF), now more than ever, we need to be able to quickly and effectively evaluate risks. Unfortunately, we are only beginning to see metrics and they are not consistent. Rather than being shown threat levels, we have floating numbers without any context, respected experts citing vastly different values, and no means to compare threats. Apples and oranges make for a good fruit salad, but they do not help risk assessment.
This article originally appeared in SecurityFocus.
Copyright © 2007, SecurityFocus