Obvious candidates for ML are questions of credit repayment likelihood, insurance risk, and propensity to buy, as well as, on social media, the stories most likely to be of interest.
The problem? Machines learn whatever there is to learn, and sometimes they get it very wrong. This is because, whatever the underlying process, ML typically involves two stages. The first – training – presents cases from which the learning machine seeks to extrapolate a pattern of independent factors predicting the key dependent variable. The second – testing – checks that pattern against cases the machine has not yet seen.
Some two decades ago a story, possibly apocryphal, doing the rounds of the neural net community told of a US military attempt to train software to distinguish friendly from enemy tanks in photographs. On the training data it performed flawlessly. Success. Or so they thought, until they moved from training set to test set and found the software "opening fire" equally on friend and enemy. It turned out that all their machine had learnt was the difference between pictures of tanks taken on sunny versus overcast days.
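Whatever the tank story's provenance, the failure mode it describes is easy to reproduce. Here is a minimal sketch, using entirely synthetic data and an invented "brightness" feature: a confound present in the training set makes a naive classifier look perfect, and only the test set reveals it learnt nothing about tanks at all.

```python
import random

random.seed(0)

def make_photos(n, confounded):
    """Generate (brightness, is_enemy) pairs. In the confounded
    (training) set, enemy tanks were photographed on overcast days,
    so brightness alone predicts the label; in the test set the two
    are unrelated."""
    data = []
    for _ in range(n):
        is_enemy = random.random() < 0.5
        if confounded:
            brightness = random.uniform(0.0, 0.4) if is_enemy else random.uniform(0.6, 1.0)
        else:
            brightness = random.uniform(0.0, 1.0)  # no correlation with the label
        data.append((brightness, is_enemy))
    return data

train = make_photos(1000, confounded=True)
test = make_photos(1000, confounded=False)

# "Learn" the only pattern available: a brightness threshold.
def classify(brightness):
    return brightness < 0.5  # dark photo => call it enemy

def accuracy(data):
    return sum(classify(b) == y for b, y in data) / len(data)

print(f"train accuracy: {accuracy(train):.2f}")  # looks perfect
print(f"test accuracy:  {accuracy(test):.2f}")   # no better than chance
```

The training-set score is flattering precisely because the confound is baked into it; only held-out data collected under different conditions exposes the shortcut.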
More serious, at least commercially, was a Google facial recognition fail that identified black people as gorillas. The problem arose, according to Anu Tewary, chief data officer for Mint – a web-based personal finance site bought by Intuit – and founder of the Technovation challenge for young women, because of underrepresentation of African American faces in the training set.
Similar issues, she suggests, could arise with voice recognition, where underrepresentation of women in a particular training set could result in software less able to interact with women – a vicious circle, reinforcing historic discrimination.
Insulting – or ignoring – a significant proportion of your customers is bad enough. But difficulties only multiply when it comes to using ML to determine who gets specific offers or services. The law is clear: even where a protected characteristic identifies an actual difference in risk or propensity, to use it as such is discrimination. Hence the equalisation of insurance rates, and of the age at which men and women may collect their pensions.
Clearly, therefore, you should avoid inputting protected characteristics into your ML process. But how do you prevent your ML homing in on some secondary characteristic which happens to be closely correlated with a protected characteristic, thereby triggering a suit for indirect discrimination?
It may be an unintended consequence, but as the House of Lords ruled in 1990, discrimination is discrimination, no matter the intention, motive, or purpose behind a discriminatory act. The temptation – the tendency – for IT professionals to separate human and machine responsibility may be strong, but the law is unlikely to approve.
Or is it? As Andrew Joint, lawyer and managing partner at technology law specialists Kemp Little, notes: "Whilst the law is not yet demanding that IT developers are accountable for all levels of their development, it is clear that legislators are looking to find ways to make sure IT developers are coding ethical responsibilities into their developments."
That means there is a growing need for IT departments to check what their business's ML is doing – especially as the law seems to demand higher standards of ML than it asks of human-originated systems. One approach involves monitoring outputs and putting in place robust systems for detecting bias. A somewhat technical treatment of this issue is to be found in a paper (PDF) published last October.
This uses the Receiver Operating Characteristic (ROC) curve – a plot of true positive rate (TPR) versus false positive rate (FPR) at various threshold settings – to explore whether a particular distribution is biased according to any given (protected) characteristic. Direct marketing has long used this technique in the form of the Gains Chart.
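As a rough sketch of the idea – the group names, score distributions, and thresholds below are all invented for illustration – a per-group ROC curve, and the area under it (AUC), can be computed directly from a model's scores and the observed outcomes, without touching the model itself:

```python
import random

random.seed(2)

def roc_points(scores, labels, thresholds):
    """One (FPR, TPR) point per score threshold, sorted by FPR."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        points.append((fp / neg, tp / pos))
    return sorted(points)

def simulate(n, separation):
    """Hypothetical risk scores: positives score higher on average,
    by 'separation' standard deviations. Here the model separates
    outcomes much better for group A than for group B."""
    labels = [random.random() < 0.5 for _ in range(n)]
    scores = [random.gauss(separation if y else 0.0, 1.0) for y in labels]
    return scores, labels

thresholds = [i / 10 for i in range(-30, 51)]
aucs = {}
for group, sep in (("group A", 2.0), ("group B", 0.8)):
    pts = roc_points(*simulate(5000, sep), thresholds)
    # Area under the ROC curve by the trapezoidal rule:
    # 0.5 = chance, 1.0 = perfect separation.
    aucs[group] = sum((x2 - x1) * (y1 + y2) / 2
                      for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
    print(f"{group}: AUC = {aucs[group]:.2f}")
```

Materially different curves between groups signal that the model serves one group worse than another – even when no protected characteristic was ever an input.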
A demonstration and slightly less technical treatment of this method is also available.
Central to this approach is the fact that it is "oblivious". That is, it considers inputs and outcomes, without digging into the underlying algorithm. That may be a pragmatic way of avoiding issues of discrimination, but it may not satisfy the EU's General Data Protection Regulation, which comes into force next year. Under GDPR, where algorithms are involved in decision-making, "fair and transparent processing" requires the provision of "meaningful information about the logic involved".
That may sound straightforward in theory but it is much harder in practice: while some algorithms – for instance, a simple scoring system built using discriminant analysis – can be dissected in this way, it becomes increasingly difficult, verging on impossible, with other methods such as neural nets. And that's even before your ML invents its own language!
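To see the difference, consider the kind of dissection a simple linear scoring system allows. In this sketch – the weights and feature names are invented – each input's contribution to the final score can simply be read off, which is the sort of "meaningful information about the logic involved" that a deep neural net, with millions of entangled parameters, cannot readily provide:

```python
# Hypothetical linear credit-scoring model: score = bias + sum of
# weighted inputs, so every feature's contribution is directly legible.
weights = {"income_band": 1.8, "years_at_address": 0.6, "prior_defaults": -2.4}
bias = 0.5

def score(applicant):
    """Return the total score plus a per-feature breakdown."""
    contributions = {f: weights[f] * applicant[f] for f in weights}
    return bias + sum(contributions.values()), contributions

total, parts = score({"income_band": 3, "years_at_address": 2, "prior_defaults": 1})
print(f"score = {total:.1f}")
for feature, value in parts.items():
    print(f"  {feature}: {value:+.1f}")
```

An applicant can be told exactly which factors pushed the score up or down and by how much – the transparency regulators have in mind, and precisely what is lost as models grow more opaque.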
Equality, data protection, human rights: ML triggers legal compliance needs in all these areas. And that is even before we get into ethical territory.
How should we respond to the issue of "unknown unknowns", for instance, where ML uncovers something that we didn't know was even out there to be known? In 2012, US retailer Target identified that one of its customers – a teenage girl – was pregnant before her father was aware, based on changes to her purchasing habits.
No laws were broken. Yet it is not hard to imagine other circumstances – for instance, where changing habits indicate a critical illness – in which society is likely to be uneasy at the power of ML to discover things.
Final word, therefore, to Catherine Flick, senior lecturer in Computing and Social Responsibility at De Montfort University: "Discrimination is not an easy issue to deal with, but machine learning and AI developers should not use this as an excuse to avoid addressing it.
"Developers must always take responsibility for the algorithms they create and seek to ensure that they serve the public good, through audit trails of decision making, thorough testing and training, and inclusion of diverse stakeholders during the development process to ensure that the goals, aims, and potential consequences of the algorithms are thought through and socially responsible." ®