Britain's map maker is demonstrating that machine learning isn't all hot air – but has discovered just how much donkey work is involved.
The Ordnance Survey set up an ML experiment to identify roof types from its remote sensing data (satellite and aerial imagery), and found the trial reached 87 per cent accuracy within a week, from a standing start. That's still less accurate than humans (at 95 per cent), but that's not the point – it could process thousands of roofs in the time it takes a human to do one.
"Your mission, should you choose to accept it, is to classify this roof..." [Source: Ordnance Survey]
"It does have the potential to revolutionise our operations," an OS spokesman enthused. The week-long prototype isn't ready for deployment because it isn't yet accurate enough, but it promises to accelerate visual data analysis.
Roofs fall into three main categories – gabled, hipped, or flat – although some, such as the saddle roof, are fantastically rare. You may wonder why anyone would care. Insurance people do: they use OS data to determine insurance levels. The laborious process of labelling 20,000 roofs was crowdsourced to OS employees. In all, five separate data sources were thrown into the mix.
The challenge was one of classification rather than identification, OS research scientist Charis Doidge told us.
"We already have the polygons from our data," she said. That meant the model wasn't struggling to identify roofs all over the map. "Segmentation is an issue," Doidge agrees. One discovery she found was that removing the extraneous pixels around the polygon sharply improved recognition.
It isn't the first time OS has used ML to augment its pattern recognition. The agency adopted a similar, practical approach when giving ML a spin before.
Suck it and see
One OS dataset is used by the Rural Payments Agency to determine subsidies. One particular subsidy rewards farmers for planting and maintaining hedgerows – and this data is mapped by the OS. Analysts look for changes in the hedgerows, as these can mean changes in payments. So automatically flagging up changes can be a boon.
Potentially, at least. The OS and RPA ran three experiments with mixed results: some promising, some not.
"The results prove that automatically identifying changes to certain types of Land Parcel Boundaries – especially, drains/ditches/dykes, walls, or fences (which are smaller than the spatial resolution of the input imagery data) – is very difficult even using specialist edge detection tools such as those employed within this work," the RPA's postmortem concluded.
"These types of features have the lowest change detection accuracy and have a detrimental impact on the overall change detection accuracy."
The analysis found that a lot of pre-processing was required to make the data suitable for the ML algorithms. Even then, the algorithms were confused by lighting (the angle of the Sun) and by the time of year the source imagery was captured. This illustrates a dilemma for anyone deploying state-of-the-art AI today, when the paint isn't dry on the art: you don't know whether it's going to work until you've tried it.
To illustrate how early days it is, the "Father of Deep Learning", Geoffrey Hinton, recently revealed a radical approach to pattern recognition using neural networks. In the 1980s, Hinton had pioneered the theoretical application of backpropagation techniques to neural nets. This is an approach which, when allied to large data sets, has finally borne some practical fruit in recent years.
Hinton declared that backpropagation was no longer sufficient to take AI forward, and revealed what he says is a superior technique – capsules. The approach has been welcomed as taking AI researchers "out of a rut". Hinton's new approach assumes the pattern being sought is already in the image, "[using] neural activities that vary as viewpoint varies rather than trying to eliminate viewpoint variation from activities".
Right now Hinton's focus is on 3D objects, so it isn't directly useful to the OS.
It's early days in pattern recognition, and what's emerging is the need for cleanly labelled data. But for mappers, whose raw data is visual, the prospect of speeding up the work seems promising. ®