LLaMA drama as Meta's mega language model leaks
Don't worry... 'The good will outweigh the bad, by at least tenfold. Probably closer to 100x'
LLaMA, Meta's latest large language model, has leaked online and is available for download, despite apparent attempts to limit access to research purposes only.
The Facebook owner announced in February it was releasing the model in a limited fashion to select academics, government types, and companies to play with amid fears LLaMA could be misused. But information wants to be free, or at least certain people want it to be, and Meta's creation has found its way online anyway, starting with a torrent leak.
Sentence-predicting large language models, which generate passages of text from input prompts, have steadily evolved, from auto-completing one's writing to chatbots capable of performing tasks when asked to do so using natural language.
Experts have warned this technology could be used to automate the manufacture of large amounts of fake news, spam, phishing emails, disinformation, incitement, you name it, for years to come. Organizations building these models often keep the software under wraps, behind APIs, or release limited versions or demos.
"There is still more research that needs to be done to address the risks of bias, toxic comments, and hallucinations in large language models," Meta said last week.
"Like other models, LLaMA shares these challenges. As a foundation model, LLaMA is designed to be versatile and can be applied to many different use cases, versus a fine-tuned model that is designed for a specific task.
"To maintain integrity and prevent misuse, we are releasing our model under a noncommercial license focused on research use cases. Access to the model will be granted on a case-by-case basis to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories around the world."
But Meta's efforts to control access to LLaMA appear to have been in vain. Shortly after the model was shared with selected boffins, and those in industry and civil society, someone on 4chan posted details on how to obtain the whole model via peer-to-peer file sharing, and eventually instructions on how to download it all were published on GitHub.
As always, exercise caution when fetching stuff like this from torrents in case someone's hidden something nefarious in there. The 65-billion-parameter model takes up about 220GB of disk space, we're told.
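For the cautious, one common precaution is to compare a downloaded file's hash against a digest obtained from a source you trust. The Python below is a minimal sketch of that check, not anything Meta or the leakers published: the function names are hypothetical, and it assumes you already have a trusted SHA-256 digest for the file in question.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so huge files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected_hex: str) -> bool:
    """Return True only if the file's SHA-256 equals the trusted digest.

    expected_hex is an assumption: a hex digest you obtained from a source
    you trust, not one bundled alongside the download itself.
    """
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

A matching hash only shows the file is the one the digest's publisher had in mind; it says nothing about whether that file is safe to load.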
The copies of LLaMA available via GitHub do appear to be legit, we note. Shawn Presser, an AI engineer who wrote up the download instructions on Microsoft's code-sharing site, showed us screenshots of him successfully generating text from the model. He believes a researcher who was given access to the model by Meta leaked it, leading to its perhaps wider-than-expected distribution.
Start your conspiracy theory engines.
Presser reckons releasing the model freely with no caveats is better than just limiting it to approved academics. "I think the good will outweigh the bad, by at least tenfold. Probably closer to 100x," he told The Register.
Training and running state-of-the-art large language models is expensive, generally speaking; only organizations that have access to piles of GPUs and other infrastructure are in a position to build, tweak, and test them. AI researchers at Meta built LLaMA to be smaller, making it more compact than today's commercial models and thus more accessible to academics and developers without non-trivial IT budgets.
Meta's machine-learning gurus claimed their system outperforms OpenAI's GPT-3 and is as good as other large language models, such as Google's 540-billion-parameter PaLM or DeepMind's 70-billion-parameter Chinchilla. The smaller size means it should be easier to use for scientists who have fewer computational resources. And yes, there is a plethora of language models out there of all shapes and sizes; it's more than just OpenAI and Facebook.
LLaMA still requires hundreds of gigabytes of storage and a decent amount of compute to drive it. Getting the model up and running also isn't straightforward, unless you're used to handling systems of this kind, and repurposing it for more nefarious activities will also require further technical expertise. Despite the model being leaked, Meta said it will continue to share LLaMA with selected researchers only.
"It's Meta's goal to share state-of-the-art AI models with members of the research community to help us evaluate and improve those models," a spokesperson told The Register.
"LLaMA was shared for research purposes, consistent with how we have shared previous large language models. While the model is not accessible to all, and some have tried to circumvent the approval process, we believe the current release strategy allows us to balance responsibility and openness."
In other words, the Facebook group stands by its approach to distributing its tech.
Meta's recent attempts to release large language models haven't gone smoothly. Last year its chatty BlenderBot was criticized for spreading misinformation and anti-Semitic views. Galactica, designed to summarize scientific knowledge, was removed three days after it was launched for generating fake and racist content. ®