More and more LLMs in biz products, but who'll take responsibility for their output?
ServiceNow and SAP join the genAI frenzy, but users advised to 'keep a human in the loop'
There was barely a beat before he responded. "The simple answer is no," said Jon Sigler, ServiceNow Now Platform senior vice president.
The question was whether the workflow platform vendor would take responsibility for the words produced by its newly introduced generative AI technology for HR, IT helpdesk, customer service management (CSM) and coding in its Vancouver product update.
Sigler was careful to qualify his answer. "One of the things that's happening in real-time is how they're going to regulate this world. What an agent does — just like an agent before Gen AI before the Vancouver release — if an [service desk] agent were to type something to resolve a case in a wrong way, we don't take any liability for that. In turn, if the [GenAI] model were to say to an agent, this is the summarization or this is the action you take, and they blindly take that action again, that's the human being involved. There is no liability to the model," he said.
ServiceNow, which produces cloud-based workflow and service desk management software, has introduced Now Assist for ITSM, CSM and HR in an effort to reduce manual tasks. The LLMs are also trained to produce case summary reports, which, in the case of HR, could include sensitive information that could later become relevant to legal cases.
The vendor has also introduced Now Assist for Creator which writes code for the Now Platform, helping development teams.
Sigler said that although the models could "hallucinate" or produce errors, they would only do so within a specific domain.
"Models make up stuff: that's just the nature of the model. We control that as much as we can internally, but it won't hallucinate on, say, something that's irrelevant to the case," he said.
The company's advice to customers is they should verify every single interaction with the LLM. "We don't want, at this point, any of our customers saying 'I'm going to take it for granted that summarization is correct'," he said.
The software allows agents to grade the LLM's summaries, and the model's accuracy can improve over time, Sigler said.
He said the time taken to produce a report could be reduced from an hour to 10 or 20 minutes.
Neil Ward-Dutton, analyst firm IDC's veep for AI and automation practices, said: "At this stage, it's unreasonable for vendors to take on all liability for all risks that might accrue to LLM implementers, but there are some risks that I think vendors can reasonably take responsibility for."
Risks can arise from "upstream" activities, including the technologies a vendor, or its suppliers, build before providing the product to a customer. Vendors should take responsibility for these risks, and Microsoft's new Copilot Copyright Commitment does exactly that, Ward-Dutton said.
"I think it's much more problematic for 'downstream' risks – risks that might arise from usage of these systems by the customer. It's widely known and accepted that the current state-of-the-art GenAI systems cannot guarantee 100 percent accuracy in task completion. Vendors will ensure that customers 'own' any downstream issues arising from errors by including appropriate words in end-user license agreements," he said.
Rowan Curran, Forrester senior analyst, said generative AI has "tremendous potential" to help service desks.
"For example, using LLMs to generate post-call summaries from transcripts is already being used at scale in some companies and they're seeing benefits because the LLMs not only produce the summaries, but can also prepare data for further analytics by extracting topics and keywords," he said.
This week saw enterprise software giant SAP introduce LLM-based "copilot" Joule to applications in HR, finance, supply chain, procurement, and customer experience – as well as into SAP's Business Technology Platform.
The company claimed Joule would transform the SAP user experience by answering employees' questions with intelligent answers drawn from the wealth of business data across the SAP portfolio and third-party sources, while retaining context.
However, Forrester's Curran warned that organizations adopting the technology need to be careful with how they manage the non-deterministic nature of the models, adding that "maintaining a human in the loop" is best practice.
"Agents need to review AI-generated summaries. Having extensive and ongoing testing of more automated systems is crucial for successful adoption of these capabilities," he told us.
Curran pointed out that humans could make mistakes, too, when completing equivalent tasks. ®