The race is on to apply machine learning to biology. The starting gun was fired in 2002, when research company Correlogic stunned the medical world by announcing a vastly improved test for detecting ovarian cancer. The new test was simple - a few drops of blood are all that's required - yet reliable. What made it truly remarkable was that the test was discovered by machine. Machine-made discovery of this kind was a key theme at this month's International Joint Conference on Artificial Intelligence (IJCAI) in Edinburgh.
The computer program BLAST, which searches genetics databases for similar gene sequences, is now ubiquitous in genetics research. It suggests possible relations between genes, and is used as a tool for focusing research. In Seattle, a computer system recently deduced a complicated sequence of links between a dozen genes and the dangerous skin cancer melanoma. This particular link was already known. The remarkable thing is that the machine discovered it independently, with minimal human help. Researchers Zhang, Baral and Kim of Arizona State University programmed the learning algorithm, pointed it at a biology database, and - with a few clues - let it loose.
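To give a flavour of what "searching for similar sequences" involves, here is a minimal Python sketch of the seed-and-extend idea: look up short exact words shared by the query and a database sequence, then grow each match outwards. It is a toy under stated assumptions, not BLAST itself (which adds scoring matrices, gapped alignment and significance statistics), and the names `build_index`, `extend` and `search`, the word length `k=4` and the example sequences are all invented for illustration.

```python
from collections import defaultdict


def build_index(database, k):
    """Map every k-letter word to the (sequence name, offset) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def extend(query, subject, q, s, k):
    """Grow an exact seed match left and right while the letters keep agreeing."""
    left = 0
    while q - left > 0 and s - left > 0 and query[q - left - 1] == subject[s - left - 1]:
        left += 1
    right = k
    while (q + right < len(query) and s + right < len(subject)
           and query[q + right] == subject[s + right]):
        right += 1
    # (start in query, start in subject, length of the exact match)
    return q - left, s - left, left + right


def search(query, database, k=4):
    """Return database sequences sharing a seed with the query, longest match first."""
    index = build_index(database, k)
    best = {}
    for q in range(len(query) - k + 1):
        for name, s in index.get(query[q:q + k], []):
            hit = extend(query, database[name], q, s, k)
            if name not in best or hit[2] > best[name][2]:
                best[name] = hit
    return sorted(best.items(), key=lambda item: -item[1][2])


# Hypothetical toy data: three made-up "genes" and a query fragment.
database = {
    "geneA": "TTGACCGTAAGCTTAG",
    "geneB": "CCCCCCCCCCCC",
    "geneC": "AAGACCGTAAGGG",
}
results = search("GACCGTAAGC", database)
for name, (q_start, s_start, length) in results:
    print(name, q_start, s_start, length)
```

Sequences with no shared word never get examined at all, which is what makes this style of search fast enough for very large databases.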
This is the new mechanised biology, created by a combination of developments. Modern biology - especially genetics, molecular biology and medicine - throws up vast amounts of data, which are now available in a range of large international databases. Put this together with advances in statistical artificial intelligence (AI), and the conditions are ripe for the creation of a new subject. Known as bio-informatics (the word has become ubiquitous in AI project proposals), it is the application of computers to biology.
Correlogic's test is slowly working its way through the approval process. Meanwhile, there have been few other clear-cut successes. This hasn't stopped researchers flocking to the field, as Edinburgh's recent International Joint Conference on Artificial Intelligence showed.
Medicine attracts the most attention. There is interest from practically every area of AI. One striking project is the robot Penelope, who in June this year became the first autonomous robot to take part in an operation. Penelope manages the surgical instruments in an operating room, responding to voice requests such as "scalpel". Other work includes a system under development by Professor Jim Hunter's Scottish team that will automatically deliver oxygen to babies in intensive care.
There are several hurdles to be overcome in the new science. One is that machine learning techniques depend upon large amounts of high-quality electronic data. Without this lifeblood - still in relatively short supply - they are useless. This is especially the case in medicine, where data must often be collected by doctors and nurses as part of their day-to-day work.
Like all sane human beings, the medical profession do not really want to have anything to do with computers. In particular, they do not want to consult computers or input data, preferring clipboards and paper. They use conflicting clinical systems, they use words differently, and what data they do enter is entered reluctantly and unreliably. In short, they are not machines and have no desire to become machines. It possibly doesn't help that many bio-informatics projects aim to make them redundant.
Slowly, the researchers have learned to listen to their would-be users. Modern medical AI is often called by the less-threatening name of 'decision support', and at least tries to take into account the feelings and desires of the people who might have to use it.
Psychology by numbers
Computers are also being used to unlock that warped and weird construct, the human brain. Professor Daniel Wolpert gets schizophrenics to hit each other. He then models their behaviour on a computer. He's testing a theory of how our perception is filtered by what we expect. Known events are largely ignored, whilst unexpected things grab our attention. One consequence of his model is that in quarrels, we consistently underestimate our own force (which we know about and expect). This can easily cause arguments to escalate, with both parties convinced that it is the other person who is to blame for shouting louder, or shoving harder. Wolpert demonstrated this with an experiment where volunteers exchanged taps, trying - and failing - to match each other's force. The computer model predicts this - and also the surprising result that schizophrenics do better at this task. According to Wolpert, this is because schizophrenic patients are somewhat dissociated from their actions, and can be more objective.
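The escalation can be illustrated with a few lines of Python. This is a back-of-envelope sketch, not Wolpert's actual model: it assumes each person feels the other's tap at full strength but perceives only a fixed fraction of their own (the `attenuation` parameter, a hypothetical value chosen purely for illustration), and so presses harder until their own tap feels equal to the one they received.

```python
def exchange_taps(first_force, turns, attenuation):
    """Simulate two people who each try to return exactly the force they just felt.

    attenuation is the assumed fraction of a self-generated force that the
    sender actually perceives: 1.0 means perfectly objective, smaller values
    mean more of one's own force is discounted.
    """
    forces = [first_force]
    for _ in range(turns - 1):
        felt = forces[-1]
        # The sender presses until the tap *feels* as strong as the one
        # received, which requires an actual force of felt / attenuation.
        forces.append(felt / attenuation)
    return forces


# Hypothetical numbers: a player who perceives only half of their own force...
discounting = exchange_taps(1.0, 6, attenuation=0.5)
# ...versus a perfectly objective player.
objective = exchange_taps(1.0, 6, attenuation=1.0)

print(discounting)  # the force doubles on every turn
print(objective)    # the force stays level
```

Under these assumptions, any attenuation below 1.0 makes the taps grow on every exchange, while a value closer to 1.0 (the more "objective" player) keeps them closer to level.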
Old-school neuroscience had to rely on head-injury victims and torturing kittens. In the 70s and 80s, Colin Blakemore performed ground-breaking work showing that vision must be learnt. He did this through experiments such as raising kittens in vertically striped boxes (they never learn to see horizontal lines, and later fall off tables as they cannot see the edge). The man is either brilliant or twisted, depending on your view of the relative importance of science versus kittens.
However, the new wave of neuroscientists no longer needs to abuse animals. Increasingly, scanners and computer simulations are used instead. With fMRI scanners able to pin down nerve-cell activity to within 3 millimetres, it is possible to take snapshots of roughly what the brain is doing. The reams of data produced by fMRI and EEG scans then provide the raw material for computational neuroscience.
Brave New World?
There is a sense of urgency to bio-informatics. Researchers and investors are scurrying to the field like old-time gold-rush prospectors. This may be motivated as much by the prospect of profitable patents as by scientific or medical aims. Machine-generated discoveries can still be used to patent genes - potentially privatising our biological heritage. If bio-informatics does deliver the hoped-for bio-riches, these may not be shared out.
Yesterday, for example, Correlogic was awarded a US patent on using machine learning to detect biological states. In accordance with US patent office policy, the new patent is obscenely broad and ignores a good deal of prior art. The company is attempting to patent not a specific treatment, but a general process ("let's use computers and lots of clinical data") for finding treatments. This raises fears that - with greed outweighing scientific principles - bio-informatics may get bogged down in endless legal squabbles. Aggressive patenting is a real menace.
The rush of modern science is exhilarating. But whilst it offers great potential, there are also grave dangers - dangers that we in the technology sector are perhaps too quick to dismiss. It is not clear what effects the new biology will have on society. The side-effects may come fast and they may not be altogether rosy. Of course, it may all turn out to be fool's gold. ®