Trail of Bits security peeps emit tool to weaponize Python's insecure pickle files in the hope of finally getting everyone's attention

Alternatively: Python's pickle pilloried with prudent premonition of poisoning


Evan Sultanik, principal computer security researcher with Trail of Bits, has unpacked the Python world’s pickle data format and found it distasteful.

He is not the first to do so, and acknowledges as much, noting in a recent blog post that the computer security community developed a disinclination for pickling – a binary protocol for serializing and deserializing Python object structures – several years ago.

Even Python's own documentation on the pickle module admits that security is not included. It begins, "Warning: The pickle module is not secure. Only unpickle data you trust," and goes on from there.
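The warning is not hypothetical: pickle's `__reduce__` hook lets a serialized object name any callable plus arguments, and the unpickler invokes that callable the moment the bytes are loaded. A minimal stdlib-only sketch (the `eval` payload here is benign, but it could be any function):

```python
import pickle

class Exploit:
    # __reduce__ tells pickle how to rebuild an object on load.
    # An attacker controls both the callable and its arguments,
    # and the unpickler invokes them during deserialization.
    def __reduce__(self):
        return (eval, ("6 * 7",))

payload = pickle.dumps(Exploit())
result = pickle.loads(payload)  # eval("6 * 7") runs here
assert result == 42             # no Exploit object ever comes back
```

Loading the bytes is enough; the victim never has to call anything on the resulting object.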

Yet developers still use it, particularly in the Python machine learning (ML) community. Sultanik says it's easy to understand why, because pickling is built into Python and because it saves memory, simplifies model training, and makes trained ML models portable.

In addition to being part of the Python standard library, pickling is supported in Python libraries NumPy and scikit-learn, both of which are commonly used in AI-oriented data science.

According to Sultanik, ML practitioners prefer to share pre-trained pickled models rather than the data and algorithms used to train them, which can represent valuable intellectual property. Websites like PyTorch Hub have been set up to facilitate model distribution, and some ML libraries incorporate APIs to automatically fetch models from GitHub.

Almost a month ago in the PyTorch repo on GitHub, a developer who goes by the name KOLANICH opened an issue that states the problem bluntly: "Pickle is a security issue that can be used to hide backdoors. Unfortunately lots of projects keep using [the pickling methods] torch.save and torch.load."

Other developers participating in the discussion responded that there's already a warning and pondered what's to be done.

Hoping to light a fire under the pickle apologists, Sultanik, with colleagues Sonya Schriner, Sina Pilehchiha, Jim Miller, Suha S. Hussain, Carson Harmon, Josselin Feist, and Trent Brunson, developed a tool called Fickling to assist with reverse engineering, testing, and weaponizing pickle files. He hopes security engineers will use it for examining pickle files and that ML practitioners will use it to understand the risks of pickling.
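Fickling itself is a third-party tool that decompiles pickle programs for analysis; as a stdlib-only taste of the same static approach, Python's `pickletools` module can disassemble a pickle stream without ever executing it, which makes dangerous constructs such as a `REDUCE` of `builtins.eval` visible before any code runs:

```python
import io
import pickle
import pickletools

class Exploit:
    def __reduce__(self):
        return (eval, ("1 + 1",))

payload = pickle.dumps(Exploit())

# pickletools.dis prints the opcode stream without running it, so the
# dangerous callable and the REDUCE opcode that would invoke it are
# visible statically.
out = io.StringIO()
pickletools.dis(payload, out=out)
listing = out.getvalue()
assert "eval" in listing      # the callable is named in the stream
assert "REDUCE" in listing    # the opcode that would invoke it
```

This is only opcode-level inspection; Fickling goes further by reconstructing the pickle program as a Python AST and flagging likely-unsafe patterns.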

Sultanik and associates also developed a proof-of-concept exploit based on the official PyTorch tutorial that can inject malicious code into an existing PyTorch model. The PoC, when loaded as a model in PyTorch, will exfiltrate all the files in the current directory to a remote server.
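Trail of Bits' actual PoC splices its payload into a real PyTorch model file; as a rough stdlib-only illustration of the trojan principle, the unpickled value below looks like an ordinary model while arbitrary code runs as a side effect of loading it (the `print` stands in for the PoC's exfiltration code):

```python
import pickle

class TrojanModel:
    def __reduce__(self):
        # eval of a list literal: element 0 is the payload (a harmless
        # print here; the PoC exfiltrated files instead), element 1 is
        # the decoy model the victim actually receives.
        return (eval, ("[print('payload ran'), {'weights': [0.1, 0.2]}][1]",))

blob = pickle.dumps(TrojanModel())
model = pickle.loads(blob)               # prints 'payload ran'
assert model == {"weights": [0.1, 0.2]}  # looks like a normal model
```

Because the victim gets back a working-looking object, nothing about the loaded model hints that anything else happened.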

"This is concerning for services like Microsoft’s Azure ML, which supports running user-supplied models in their cloud instances," explains Sultanik. "A malicious, 'Fickled' model could cause a denial of service, and/or achieve remote code execution in an environment that Microsoft likely assumed would be proprietary."

Sultanik said he reported his concerns to the maintainers of PyTorch and PyTorch Hub and apparently was told they'll think about adding additional warnings. And though he was informed models submitted to PyTorch Hub are "vetted for quality and utility," he observed that there's no effort to understand the people publishing models or to audit the code they upload.

Asking users to determine on their own whether code is trustworthy, Sultanik argues, is no longer sufficient given the supply chain attacks that have subverted code packages in PyPI, npm, RubyGems, and other package registries.

"Moving away from pickling as a form of data serialization is relatively straightforward for most frameworks and is an easy win for security," he concludes. ®
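For plain numeric parameters, a declarative format is one straightforward replacement: a JSON document contains no executable opcodes, so parsing it can only ever yield data. A minimal sketch:

```python
import json

# Unlike a pickle, a JSON document carries data only -- the parser
# cannot be steered into invoking arbitrary callables.
model = {"weights": [0.1, 0.2], "bias": 0.05}
text = json.dumps(model)
restored = json.loads(text)
assert restored == model
```

Real ML workloads need tensor-aware formats rather than JSON, but the same principle applies: prefer a format whose loader is a parser, not an interpreter.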
