AI's most convincing conversations are not what they seem

It's time to toss the Turing test – it's not really about the machines at all

Opinion The Turing test is about us, not the bots, and it has failed. 

Fans of the slow burn mainstream media U-turn had a treat last week.

On Saturday, the news broke that Blake Lemoine, a Google engineer charged with monitoring a chatbot called LaMDA for nastiness, had been put on paid leave for revealing confidential information.

Lemoine had indeed gone public, but instead of something useful like Google's messaging strategy (a trade secret if ever there was one), he made the claim that LaMDA was alive.

Armed with a transcript in which LaMDA did indeed claim sentience, and with claims that it had passed the Turing Test, Lemoine was the tech whistleblower from heaven for the media. By the time the news had filtered onto BBC radio news on Sunday evening, it was being reported as an event of some importance.

On Twitter, it had been torn apart in a few hours, but who trusts Twitter with its large and active AI R&D community?  

A couple of days later, the story was still flying, but by now journalists had brought in expert comment, by way of a handful of academics who had the usual reservations about expressing opinions.

On the whole, no, it probably wasn't, but you know it's a fascinating area to talk about. 

Finally, as the story fell off the radar at the end of the week, the few remaining outlets still covering it had found better experts who, one presumes, were as exasperated as the rest of us. No. Absolutely not. And you won't find anyone in AI who thinks otherwise. The conversation still revolved around sentience rather than how interesting it was that Google has to use humans to check its chatbot outputs for hate speech, but we were back on the planet.

For future reference and to save time for everyone, here's the killer tell that a story is android paranoia – "The Turing Test" as a touchstone for sentience. It isn't.

It was never meant thus. Turing proposed it in a 1950 paper as a way of actually avoiding the question "can machines think?"

He sensibly characterized that as unanswerable until you sort out what thought is. We hadn't then. We haven't now.

Instead, the test – can a machine hold a convincingly human conversation? – was designed to be a thought experiment to check arguments that machine intelligence was impossible. It tests human perceptions and misconceptions, but like Google's "Quantum Supremacy" claims, the test itself is tautologous: passing the test just means the test was passed. By itself, it proves nothing more. 

Take a hungry Labrador dog, which is to say any Labrador neither asleep nor dead, who becomes aware of the possibility of food.

An animal of prodigious and insatiable appetite, at the merest hint of available calories, the Labrador puts on a superb show of deep longing and immense unrequited need. Does this reflect a changed cognitive state analogous to the lovesick human teenager it so strongly resembles? Or is it learned behavior that turns emotional blackmail into snacks? We may think we know, but without a much wider context, we cannot. We might be gullible. Passing the lab test means you get fed. By itself, nothing more. 

The first system to arguably pass the Turing test, in spirit if not the letter of the various versions Turing proposed, was an investigation into the psychology of human-machine interaction. ELIZA, the progenitor chatbot, was a 1966 program by MIT computer researcher Joseph Weizenbaum.

It was designed to crudely mimic the therapeutic practice of echoing a patient's questions back to them.

"I want to murder my editor."

"Why do you want to murder your editor?"

"He keeps making me hit deadlines."

"Why do you dislike hitting deadlines?" and so on.

Famously, Weizenbaum was amazed when his secretary, one of the first test subjects, imbued it with intelligence and asked to be left alone with the terminal.
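For the curious, the echoing trick in the exchange above needs remarkably little machinery. What follows is a minimal sketch, not Weizenbaum's program: the `REFLECTIONS` table, the single "I want to" pattern, and the "Please go on." fallback are all assumptions made for illustration, and the real ELIZA script had many more rules.

```python
import re

# Hypothetical first-person to second-person swaps (illustrative only;
# not taken from Weizenbaum's original script).
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}


def reflect(fragment: str) -> str:
    """Swap first-person words for second-person ones, word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(statement: str) -> str:
    """Echo an 'I want to ...' statement back as a question."""
    text = statement.strip().rstrip(".!?")
    match = re.match(r"i want to (.+)", text, re.IGNORECASE)
    if match:
        return f"Why do you want to {reflect(match.group(1))}?"
    # Assumed fallback when no pattern matches: a non-committal prompt.
    return "Please go on."


print(respond("I want to murder my editor."))
```

Nothing in it models the editor, the deadline, or the murder. It matches a pattern and hands the user's own words back, which is rather the point.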

The Google chatbot is a distant descendant of ELIZA, fed on large amounts of written data from the internet and turned into language models by machine learning. It is an automated method actor.

A human actor who can't add up can play Turing most convincingly – but quiz them on the Entscheidungsproblem and you'll soon find out they're not. Large language models are very good at simulating conversation, but unless you have the wherewithal to generate the context which will test whether it is what it appears to be, you can't say more than that.

We are nowhere near defining sentience, although our increasingly nuanced appreciation of animal cognition is showing it can take many forms.

At least three types – avian, mammalian, and cephalopodian – with significant evolutionary distance look like three very different systems indeed. If machine sentience does happen, it won't be by a chatbot suddenly printing out a cyborg bill of rights. It will come after decades of directed research, building on models and tests and successes and failures. It will not be an imitation of ourselves.

And that is why the Turing test, fascinating and thought-provoking though it was, has outlived its shelf life. It does not do what people think it does; rather, it has been traduced into serving as a Hollywood adjunct that focuses on a fantasy. It soaks up mainstream attention that should be spent on the real dangers of machine-created information. It is the astrology of AI, not the astronomy.

The very term "artificial intelligence" is just as bad, as everyone from Turing on has known. We're stuck with that. But it's time to move the conversation on, and say goodbye to the brilliant Alan Turing's least useful legacy. ®


Biting the hand that feeds IT © 1998–2022