Trail of Bits security peeps emit tool to weaponize Python's insecure pickle files to hopefully now get everyone's attention

Alternatively: Python's pickle pilloried with prudent premonition of poisoning


Evan Sultanik, principal computer security researcher with Trail of Bits, has unpacked the Python world’s pickle data format and found it distasteful.

He is not the first to do so, and acknowledges as much, noting in a recent blog post that the computer security community developed a disinclination for pickling – a binary protocol for serializing and deserializing Python object structures – several years ago.

Even Python's own documentation on the pickle module admits that security is not included. It begins, "Warning: The pickle module is not secure. Only unpickle data you trust," and goes on from there.
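To see why that warning exists, here is a minimal sketch of the mechanism. The `Payload` class is hypothetical, and `os.listdir` is a deliberately harmless stand-in for whatever callable an attacker might pick. The `__reduce__` hook lets the author of a pickle name a function that the loader will call during unpickling.

```python
import os
import pickle


class Payload:
    """Hypothetical object whose pickle names a callable for the loader to run."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call os.listdir('.')".
        # An attacker could name any importable callable here instead.
        return (os.listdir, (".",))


blob = pickle.dumps(Payload())

# Unpickling calls os.listdir(".") on the loader's machine and returns
# its result; no Payload instance is ever reconstructed.
result = pickle.loads(blob)
print(type(result).__name__)  # list
```

The loader asked for an object and got a directory listing back. The function call was chosen entirely by whoever wrote the bytes.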

Yet developers still use it, particularly in the Python machine learning (ML) community. Sultanik says it's easy to understand why: pickling is built into Python, saves memory, simplifies model training, and makes trained ML models portable.

In addition to being part of the Python standard library, pickling is supported in Python libraries NumPy and scikit-learn, both of which are commonly used in AI-oriented data science.

According to Sultanik, ML practitioners prefer to share pre-trained pickled models rather than the data and algorithms used to train them, which can represent valuable intellectual property. Websites like PyTorch Hub have been set up to facilitate model distribution and some ML libraries incorporate APIs to automatically fetch models from GitHub.

Almost a month ago in the PyTorch repo on GitHub, a developer who goes by the name KOLANICH opened an issue that states the problem bluntly: "Pickle is a security issue that can be used to hide backdoors. Unfortunately lots of projects keep using [the pickling methods] torch.save and torch.load."

Other developers participating in the discussion responded that there's already a warning and pondered what's to be done.

Hoping to light a fire under the pickle apologists, Sultanik, with colleagues Sonya Schriner, Sina Pilehchiha, Jim Miller, Suha S. Hussain, Carson Harmon, Josselin Feist, and Trent Brunson, developed a tool called Fickling to assist with reverse engineering, testing, and weaponizing pickle files. He hopes security engineers will use it for examining pickle files and that ML practitioners will use it to understand the risks of pickling.
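The sketch below is not Fickling and does not use its API. It only illustrates the general idea of examining a pickle without loading it, using the standard library's `pickletools` module. The `scan` helper, the `Payload` class, and the choice of which opcodes to flag are assumptions made for this example; a real analysis tool would need to be far more thorough.

```python
import os
import pickle
import pickletools

# Opcodes that import names or call objects. Plain containers, strings,
# and numbers do not need any of them. (Illustrative list, not exhaustive.)
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
              "NEWOBJ", "NEWOBJ_EX"}


def scan(blob: bytes) -> list:
    """Return the flagged opcode names found in blob, without unpickling it."""
    return [op.name for op, _arg, _pos in pickletools.genops(blob)
            if op.name in SUSPICIOUS]


class Payload:
    """Hypothetical object that asks the loader to call os.listdir('.')."""

    def __reduce__(self):
        return (os.listdir, (".",))


plain = pickle.dumps({"weights": [0.1, 0.2]})
rigged = pickle.dumps(Payload())  # created here, never loaded

print(scan(plain))   # []
print(scan(rigged))  # includes "REDUCE"
```

A clean result from a scan like this proves little, since legitimate pickles of class instances use the same opcodes. It is a way to look before loading, not a substitute for trusting the source.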

Sultanik and associates also developed a proof-of-concept exploit based on the official PyTorch tutorial that can inject malicious code into an existing PyTorch model. The PoC, when loaded as a model in PyTorch, will exfiltrate all the files in the current directory to a remote server.

"This is concerning for services like Microsoft’s Azure ML, which supports running user-supplied models in their cloud instances," explains Sultanik. "A malicious, 'Fickled' model could cause a denial of service, and/or achieve remote code execution in an environment that Microsoft likely assumed would be proprietary."

Sultanik said he reported his concerns to the maintainers of PyTorch and PyTorch Hub and was apparently told they'll think about adding further warnings. And though he was informed that models submitted to PyTorch Hub are "vetted for quality and utility," he observed that there's no effort to understand the people publishing models or to audit the code they upload.

Asking users to determine on their own whether code is trustworthy, Sultanik argues, is no longer sufficient given the supply chain attacks that have subverted code packages in PyPI, npm, RubyGems, and other package registries.

"Moving away from pickling as a form of data serialization is relatively straightforward for most frameworks and is an easy win for security," he concludes. ®

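As a rough illustration of what moving away from pickling can look like, the sketch below stores model parameters in a data-only format, here JSON from the standard library. The parameter names and values are made up, and real frameworks would more likely choose a compact binary format, but the property that matters is the same: the loader parses data and never calls functions named by the file.

```python
import json

# Hypothetical model parameters: names mapped to nested lists of floats.
model_state = {
    "layer1.weight": [[0.1, -0.2], [0.3, 0.4]],
    "layer1.bias": [0.0, 0.1],
}

# Serialize to text. JSON can only describe objects, arrays, strings,
# numbers, booleans, and null, so there is nothing executable to smuggle in.
encoded = json.dumps(model_state)

# Deserialize. The worst a hostile file can do is contain unexpected data,
# which the caller should still validate before use.
decoded = json.loads(encoded)
print(decoded == model_state)  # True
```

The trade-off is that only the numbers travel. The code that defines the model has to be obtained, and trusted, separately.
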
Biting the hand that feeds IT © 1998–2021