Phishing for numbers
The percentage of lost items is not the only number regularly taken out of context. For example, consider the question: how much email is spam? In 2005, values from respected experts ranged from 70 per cent to 95 per cent. There was no consensus among experts, but all of the numbers sounded "bad."
Today, some companies no longer report the "percentage of spam" - they only report raw values (PDF). The only thing we really know is that it is a big number. But, we don't know what the number is (such as, 86 per cent) or the accuracy range (a five per cent margin of error?). We actually have better numbers and statistics for American Idol voting than spam volume.
The same issue arises when we ask where the spam comes from. The general consensus is that today's botnets generate a majority of spam. However, we do not actually know how big the majority is.
This counting problem also shows up in reports on phishing. Every few months the Anti-Phishing Working Group (APWG) releases their Phishing Trends Report. For example, the APWG Sept-Oct 2006 report (PDF) shows an increase in phishing emails. In fact, their reports over the last few years have shown a nearly steady increase intermixed with a few sharp increases in volume.
The problem with the APWG numbers is that they don't match other sightings. For example, Usenet's "news.admin.net-abuse.sightings" (<NANAS) is a high-volume newsgroup where people post their spam messages. NANAS receives thousands of postings per day - approximately 40,000 spam postings just for December 2006. The postings are sample spam emails submitted by people all over the world, and the samples appear to match the distribution of world-wide spam. If you don't have access to hundreds of honeypot accounts for collecting spam and want to do spam research, then NANAS is the next best thing.
Back in 2004, NANAS had literally hundreds of phishing emails posted every day. Phishing was big. In 2005, the volume dropped. By December 2006, there were 10 to 20 phishing emails posted per day. This is a significant drop compared to previous years, and it is a measurable contradiction to the APWG findings.
So what is going on? In 2004, the APWG was growing their membership and bringing in partners. This means that they were increasing their ability to capture and measure phishing emails. The growth at APWG seems to correspond with sharp increases in phishing volume. How should you interpret this? The numbers show an increase in phish sightings by the APWG, but do not necessarily indicate an increase in phishing. The numbers only mean that the APWG is getting better at seeing phishing, not that there is more of it.
In late 2004, the APWG repeatedly modified their definition of phishing, corresponding with additional increases (PDF) in volume (PDF). Was the increase because there was more of it? Or was it because they expanded their definition to include more? In any case, they do not appear to have revised all of their old numbers to match their new definitions. Thus, new months cannot be directly compared against old months since they measure different things.
What the APWG does not mention is that 2005 heralded a profound change in how most phishing operations work. Rather than sending blast-o-gram phishing emails to everyone and "hoping" that the recipient might have an account at eBay (or Citibank or Amazon or ...), phishers began spear-phishing. In spear-phishing, they use market research (and stolen email lists) to better target potential victims. For example, if you are likely to have a Bank of America account, then you will receive a BofA phish. However, if you are unlikely to have a BofA account, then today you are unlikely to receive a BofA phish (maybe one a month or less, not the one-a-week like you'd see a year ago).
This trend of directed phishing actually started in 2004, when phishers began to target based on countries. For example, Wells Fargo does not exist in the United Kingdom, so they stopped sending Wells Fargo phish to Blueyonder accounts (a UK ISP). Then they started narrowing by state. For example, if you are likely in Arizona then you are more likely to receive an Arizona Credit Union phish. They can guess where you are based on the forums you use. If you post in a Tucson forum or write about Flagstaff and Phoenix, then you might be in Arizona.
Today, there are very few blast-o-gram phishing e-mails. I'm measuring one to two per month per honeypot account. That's down from eight per account per month in November 2005 and 15 in October 2005. Other people measuring phishing volume may have different raw numbers, but should have similar ratios for blast-o-gram phishing. Today, nearly all phishing emails are targeted.