You got legal trouble? Better call SaulLM-7B

Cooked in a math lab, here's an open source LLM that knows the law

Machine-learning researchers and legal experts have released SaulLM-7B, which they claim is the first open source text-generating large language model built specifically for legal work and applications.

In light of recent high-profile blunders in which generative AI cited non-existent cases in submitted court filings – Mata v Avianca and Park v Kim – that might seem ill-advised. The tendency of AI models to hallucinate and their uncertain data provenance would appear to be deal breakers in an industry where the stakes are significant.

But SauLM-7B's creators, affiliated with startup Equall.ai, Université Paris-Saclay and Sorbonne Université in France, and Universidade de Lisboa and NOVA School of Law in Portugal, argue there's a place for artificial intelligence help in the law.

"LLMs and more broadly AI systems will have a transformative impact on the practice of law that includes but goes beyond marginal productivity," a spokesperson for Equall.ai said in an email to The Register. "Our focus is on creating end-to-end legal AI systems guided and controlled by lawyers.

"Our belief — based on data and experience — is that systems specialized for the legal domain will perform better than generalist ones. This includes greater precision and more useful tools to help lawyers focus on what they enjoy most and do best, which is to exercise legal judgment and help their clients with advice."

Other organizations are similarly optimistic about the utility of AI assistance. Goldman Sachs last year estimated [PDF] that "one-fourth of current work tasks could be automated by AI in the US, with particularly high exposures in administrative (46 percent) and legal (44 percent) professions…" And startups like Bench IQ, Harvey.ai, and Safe Sign Technologies see a market opportunity in that sort of prediction.

Equall.ai, founded by Jorge Mattamouros, a former partner at White & Case LLP, argues that almost all legal work – research, document review and analysis, summarization, and the identification of key passages in documents – can benefit from AI.

"We believe LLMs open so many more avenues, some we see today, many still to discover," Equall.ai's spokesperson continued. "For instance, we believe that LLMs will drastically change the way we approach both data processing pipelines and data generation, which will be critical to legal applications where obtaining high-quality data is expensive and difficult to do."

The view at Equall.ai is that the inaccuracies of AI models can be mitigated.

"LLMs remain probabilistic models," the biz told us. "Hallucinations are generally the symptom of LLMs operating out of distribution. In other words, when prompted to generate text on topics and data that are similar to the data the LLM was trained on, LLMs tend to hallucinate significantly less than when prompted on things they’ve learned little about.

"For example, throughout our evaluation of Saul with actual lawyers, we were able to confirm that it was less prone to hallucinating when discussing specific legal concepts. In short, we expect LLMs that are specifically trained on legal data to hallucinate much less on legal topics than their generalist counterparts."

That said, the upstart cautions that AI models should not be relied on as if they're a legal database, and that double-checking the output of LLMs is advised. We would say: Checking is mandatory.

The boffins behind SaulLM-7B – Pierre Colombo, Telmo Pessoa Pires, Malik Boudiaf, Dominic Culver, Rui Melo, Caio Corro, Andre F. T. Martins, Fabrizio Esposito, Vera Lúcia Raposo, Sofia Morgado, and Michael Desa – describe their work in a paper titled "SaulLM-7B: A pioneering Large Language Model for Law."

Available on AI model community site Hugging Face, SaulLM-7B – named after the US TV series Better Call Saul, which follows the antics of an unorthodox criminal lawyer – is based on the open source Mistral 7B model; both have 7 billion parameters. That's significantly fewer than models like Llama 2, which comes in sizes of up to 70 billion parameters. But SaulLM-7B's creators note that this is just a first milestone, and that work is underway on other model sizes.

As you'd expect from an LLM, SaulLM-7B takes questions or prompts in natural language and attempts to answer them; in this case, it's focused on the law and legal issues.
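For the curious, here's roughly what that looks like in practice – a minimal sketch of querying the model via the Hugging Face transformers library. The repository ID is our best guess at the published checkpoint's name, and the prompt and generation settings are purely illustrative, so check the model card on the hub before running it.

```python
# A minimal sketch of prompting SaulLM-7B via Hugging Face transformers.
# The repo ID below is an assumption based on the model's release notes;
# verify the exact name on the Hugging Face hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Equall/Saul-Instruct-v1"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain the doctrine of promissory estoppel in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a response; a 7B model fits comfortably on a single modern GPU.
outputs = model.generate(**inputs, max_new_tokens=256,
                         do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```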

Jonathan Schwarz, co-founder and chief scientist at UK-based legal AI startup Safe Sign Technologies, told The Register that the makers of SaulLM-7B have taken a sensible approach to specializing general LLMs.

"It's a nice offering as an open source alternative to more proprietary techniques," he said. "However, there's work that needs to be done."

Schwarz pointed to the need for red-teaming models, something he said his firm is doing internally.

We're told that Safe Sign Technologies has prototyped a legal LLM and aims to have a second iteration ready for deployment through partners later this year or thereafter.

Schwarz said the company was not yet ready to comment on the extent to which its offering will be open source or proprietary. But he claimed that while SaulLM-7B-Instruct – a version fine-tuned on general and legal instructions – managed to score an average of 0.61 on the LegalBench-Instruct benchmark, "we're getting close to 0.77." That score is in the same neighborhood as GPT-4's, though we urge you to take machine-learning benchmarks with a pinch of salt.

"Our ambition here was to create an AI solution that gives every person very good quality legal advice instantly," said Alexander (Sami) Kardos-Nyheim, founder and CEO of Safe Sign Technologies in an interview with The Register. "Not unreliable legal advice from ChatGPT or anything like that. But serious legal advice you can actually use and rely on via AI."

"Very, very roughly, the way that these techniques are usually trained is that you have a huge data set that's collected from diverse sources, and each training step you pick a random subset of that and try to improve your performance by learning from it," explained Schwarz. "Instead of simply picking a random subset, we have new methods that at each point in training try to determine what is the optimal subset of data to train on at this point in time, such that the improvement of the models is maximal. That's the first step. This way you kind of avoid that problem of kind of learning all this toxic behavior that you're trying to undo later," Schwarz added.

Schwarz suggested that Safe Sign's approach is, well, safer. "In a case where there's a specific legal question that the model simply doesn't quite know how to answer, rather than confidently giving an incorrect answer we can simply say that we're holding back on that one."
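One common way to implement that sort of holding back – and to be clear, this is our sketch, not Safe Sign's disclosed mechanism – is a confidence gate: if the average probability the model assigns to its own generated tokens falls below a threshold, it abstains rather than answering.

```python
import torch

def answer_or_abstain(model, tokenizer, prompt, threshold=0.5):
    """Answer a prompt, or decline if the model's own average token
    probability is below `threshold` -- a crude proxy for confidence."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(
        **inputs, max_new_tokens=128,
        return_dict_in_generate=True, output_scores=True,
    )
    new_tokens = out.sequences[0][inputs["input_ids"].shape[1]:]
    # Probability the model assigned to each token it actually emitted.
    probs = [torch.softmax(step_logits[0], dim=-1)[tok].item()
             for step_logits, tok in zip(out.scores, new_tokens)]
    confidence = sum(probs) / len(probs)
    if confidence < threshold:
        return "I'm holding back on that one."
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

In practice the threshold would need calibrating on held-out legal questions; raw token probabilities are only a rough stand-in for correctness.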

He went on to voice skepticism about the boil-the-ocean approach taken by OpenAI and Google, which involves focusing on broad harms like racial and gender bias, and paying inexpensive contractors to rank their models' responses so the neural networks can be retrained to produce fewer harmful answers.

"If you want to be able to do everything a human can do, you sort of have to test against everything a human can do," said Schwarz. "I think that's kind of just a losing strategy if you're trying to do that over all possible topics."

"Not just in legal AI, but more generally, in AI, we're not seeing the focus on safety and robustness that allows for serious, reliable systems in the medical or the legal context," added Kardos-Nyheim. ®
