Dismayed by woeful AI chatbots, boffins hired real people – and went back to square one

Amazon Turk serfs have their own problems


Analysis Convinced that intelligent conversational assistants like Amazon Alexa, Microsoft Cortana, and Apple Siri are neither particularly intelligent nor capable of sophisticated conversation, computer boffins last year began testing a crowd-powered assistant embodied by Amazon Mechanical Turk workers.

The chatbot, a people-powered app called Chorus, proved better at conversation than software-based advisors, but hasn't managed to overcome poor human behavior.

Described in a recently published research paper, Chorus was developed by Ting-Hao (Kenneth) Huang and Jeffrey P. Bigham of Carnegie Mellon University, Walter S. Lasecki of the University of Michigan, and Amos Azaria of Ariel University.

The researchers undertook the project because chatbots are just shy of worthless, a sorry state of affairs made evident by the proliferation of labeled buttons in chatbot interfaces. Businesses the world over had hoped that conversational software could replace face-to-face reps and call center staff, since the machines should be far cheaper and easier to run.

The problem is simply that natural language processing in software is not very good at the moment.

"Due to the lack of fully automated methods for handling the complexity of natural language and user intent, these services are largely limited to answering a small set of common queries involving topics like weather forecasts, driving directions, finding restaurants, and similar requests," the paper explains.

Jeff Bigham, associate professor at Carnegie Mellon's Human-Computer Interaction Institute, in a phone interview with The Register, said, "Today, if you look at what's out there, like Siri, they do a pretty good job using specific speech commands. But if you want to talk about anything you want, they all fail badly."

Bigham and his colleagues devised a system that connects Google Hangouts, through a third-party framework called Hangoutsbot, with the Chorus web server, which routes queries to on-demand workers participating in Amazon Mechanical Turk.
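
The plumbing described above can be pictured with a rough sketch. This is an illustrative toy in Python, not the researchers' code: the names `Worker`, `ChorusRouter`, and `route` are hypothetical, the workers are stand-in functions rather than people, and no Hangoutsbot or Amazon Mechanical Turk calls are shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical stand-in for a crowd worker. In the real system this would be
# a person recruited through Amazon Mechanical Turk, not a function.
@dataclass
class Worker:
    name: str
    respond: Callable[[str], str]


@dataclass
class ChorusRouter:
    """Toy relay: one chat message in, one reply per available worker out."""
    workers: List[Worker] = field(default_factory=list)
    transcript: Dict[str, List[str]] = field(default_factory=dict)

    def route(self, user_id: str, message: str) -> List[str]:
        # Record the user's message in the per-user conversation log.
        history = self.transcript.setdefault(user_id, [])
        history.append(f"user: {message}")
        # Forward the query to every worker and collect their replies.
        replies = [f"{w.name}: {w.respond(message)}" for w in self.workers]
        history.extend(replies)
        return replies


router = ChorusRouter(workers=[
    Worker("worker_a", lambda q: f"Let me look up '{q}' for you."),
    Worker("worker_b", lambda q: "Could you tell me a bit more?"),
])
replies = router.route("user_1", "How many suitcases can I take to Israel?")
```

For simplicity every worker answers every query here; how the real system chooses which reply reaches the user is outside the scope of this sketch.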

Chorus is not the first project to incorporate a living backend, the research paper acknowledges, pointing to projects like VizWiz, which crowdsources help for the blind. Chorus aims to explore the challenges of deploying a crowd-based system and to suggest future avenues of research for improving conversational software.

Real people, it turns out, are fairly adept at extemporaneous conversation, even if they're basically meat-to-metal bridges for Google Search queries in Chorus.

During the test period last year, 59 people participated in 320 conversations, which lasted more than 10 minutes and involved more than 25 messages on average. A lengthy sample exchange presented in the paper details a conversation about the number of suitcases a person can take on a plane from the US to Israel. It reads like a call center transcript.

The average cost of each HIT – Amazon Mechanical Turk terminology for a task – came to $5.05. The average total cost per day was $28.90.

So far so good. But while people may have an edge with words, they bring with them their own set of problems.

