Comment How frightened would you be if you were secretly planning to get pregnant, without telling your husband, and discovered that someone had written to him telling him about it? Or, put the other way, how would you feel if you discovered your wife was pregnant only when someone dropped you a letter?
And who would that person be? It would have to be someone like Robbie "Cracker" Coltrane, right? A deep profiler with the power of a hypnotist...? Or it would have to be a Government spook. They could do it, surely, the way the world's spooks monitor all of us: easy. They tap our phones, right?
Well, the thing to remember is that it really did happen. And no, if you thought it was phone tapping, you're wrong; it's a misunderstanding, and I won't make any friends by clearing it up. But you need to understand the basic principles of data mining to understand why the world of spooks and the world of search engines are about to overlap, and why you should be nervous about this.
The lesson here is one I call "The Sainsbury's Lesson" when doing presentations for technical audiences, because I was taught this by a data miner who worked for the giant British supermarket of that name.
The story, summarised, is that Sainsbury's was spending an absurd amount of money sending people promotional coupons, money-off special offers, and other junk mail to encourage them to swing by the Sainsbury's supermarket next time, rather than Waitrose or Safeway or Asda - and it was pretty hard to be sure it was actually doing any good.
The trouble was simple: they were sending girly shampoo promotions to households with six rugby-playing male students, or home improvement promotions to households with one elderly pensioner with osteoporosis, or bulk beer deals to households where they were all strictly teetotal. Not profitable stuff. And their IT staff heard about this and said: "But you don't have to do that!"
This goes back a bit before the days of Nectar, when Sainsbury's had its own loyalty card, plus it sold fuel out of its own petrol stations and ran a bank and a credit card. And the IT people said: "If you know what sort of things someone buys, you can make a pretty good guess about what they may want to buy next."
The beauty of the system was that data mining requires no intellectual engagement. The Sainsbury's Lesson could be called the Amazon Lesson, or the TiVo Lesson for that matter. All you do is look for patterns. The more frequently you find a pattern, the better a guide it is.
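To make "all you do is look for patterns" concrete, here is a minimal sketch in Python. It is not Sainsbury's actual system, which I never saw; the baskets, the item names, and the helper names `pair_counts` and `suggest` are all invented for illustration. It simply counts how often two items turn up in the same basket and treats the frequent pairings as the guide.

```python
from collections import Counter
from itertools import combinations

# Invented shopping baskets, purely for illustration.
baskets = [
    ["nappies", "baby wipes", "beer"],
    ["nappies", "baby wipes"],
    ["beer", "crisps"],
    ["nappies", "baby wipes", "crisps"],
    ["beer", "crisps", "shampoo"],
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def suggest(counts, item, min_count=2):
    """Return items seen alongside `item` at least `min_count` times, most frequent first."""
    scores = Counter()
    for (a, b), n in counts.items():
        if n < min_count:
            continue  # too rare to be a useful guide
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common()]

counts = pair_counts(baskets)
nappy_offers = suggest(counts, "nappies")
beer_offers = suggest(counts, "beer")
```

Nothing in there knows what a nappy is or why anyone buys one. It only knows that the two items keep appearing together, which is the whole point: no intellectual engagement required.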
Take a commuter travelling from London to Reading. They can get into a car and drive. They can get into a taxi and be driven. They can take a bus. But most will get onto a train, with a choice of catching the Great Western service from Paddington, or the slower South West Trains service from Waterloo. If you want to catch them before they get to Reading, you aren't going to do it by chance.
But suppose you get information that says that they have passed through Ascot, Sunningdale, Virginia Water, and Egham? If you know the route, you know that the next station along the line will be Staines. Now, it's perfectly possible that they aren't going to be at Staines next. They might get off the train at Egham, take a cab to the station beyond Staines, and rejoin the train there... but seriously, what are the chances of that?
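The same reasoning can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions rather than anyone's real tracking system: the journey log is made up (nine ordinary trips and one cab-dodger), "Station beyond Staines" is a placeholder name, and `build_next_stop_counts` and `predict_next` are hypothetical helpers. The idea is just to count what has followed the last couple of sightings in the past and bet on the most frequent outcome.

```python
from collections import Counter, defaultdict

# Invented journey log: nine ordinary trips, plus one traveller who left the
# train at Egham and rejoined further along. The last name is a placeholder.
ordinary = ["Ascot", "Sunningdale", "Virginia Water", "Egham", "Staines"]
dodger = ["Ascot", "Sunningdale", "Virginia Water", "Egham", "Station beyond Staines"]
journeys = [ordinary] * 9 + [dodger]

def build_next_stop_counts(journeys, history_len=2):
    """For each run of `history_len` consecutive stops, count what came next."""
    counts = defaultdict(Counter)
    for stops in journeys:
        for i in range(history_len, len(stops)):
            context = tuple(stops[i - history_len:i])
            counts[context][stops[i]] += 1
    return counts

def predict_next(counts, recent_stops, history_len=2):
    """Return (most likely next stop, share of past journeys that went there)."""
    context = tuple(recent_stops[-history_len:])
    seen = counts.get(context)
    if not seen:
        return None  # this pattern has never been observed
    stop, n = seen.most_common(1)[0]
    return stop, n / sum(seen.values())

counts = build_next_stop_counts(journeys)
prediction = predict_next(counts, ["Ascot", "Sunningdale", "Virginia Water", "Egham"])
```

With this made-up log, `prediction` names Staines with nine past journeys out of ten behind it. The cab-dodger is in the data, but he doesn't change where you would wait.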