PyPI subpoenaed: US govt demands data on developers
Python package packhouse ponders privacy position
In March and April, three subpoenas seeking data on users of PyPI, the Python Package Index, were presented to the Python Software Foundation (PSF).
PyPI is a repository for distributing third-party Python software packages – sets of files that provide Python developers with specific functionality. The subpoenas – legal demands for information – came from the US Department of Justice, said Ee Durbin, director of infrastructure at the PSF, in a blog post on Wednesday.
"The PSF was not provided with context on the legal circumstances surrounding these subpoenas," said Durbin. "In total, user data related to five PyPI usernames were requested."
The Feds asked for names associated with the identified accounts, addresses (including mailing, email, residential and business), connection records, records of session times and associated network identifiers, account creation dates, telephone numbers and IP addresses used during registration, payment information, Python packages uploaded, and IP address download logs of any PyPI packages uploaded by the identified users.
Durbin insists that while the PSF remains committed to protecting user data from disclosure whenever possible, it was not possible in this instance. "PSF determined with the advice of counsel that our only course of action was to provide the requested data," said Durbin, who delivered the requested information to the government.
Some of the data sought could not be provided. For example, PyPI does not keep the IP addresses of those who download packages, as GeoIP information is held by PyPI's CDN provider. Likewise, PyPI is a free service and thus does not have any billing or payment data for account holders.
The subpoenas have prompted the PSF to reevaluate PyPI's already limited data retention and disclosure policies in an effort to balance legal obligations with the organization's stated interest in protecting user privacy. Durbin said these policies will be made explicit to the Python community and will cover how future government data demands are handled, as well as the breadth of data stored and the retention period.
"Though we collect very little personal data from PyPI users, any unnecessarily held data are still subject to these kinds of requests in addition to the baseline risk of data compromise via malice or operator error," said Durbin.
Large internet service providers commonly receive legal demands for information about the activities of their users. And those that receive a sufficient volume of requests often publish transparency reports that outline – to the extent allowed by law – the number and nature of legal demands.
In its latest 2022 Transparency Report, Microsoft's GitHub said it had received 432 requests to disclose user information, up from 335 in 2021. Of those, 274 were subpoenas, with 265 from criminal investigations or government agencies and the other 9 following from civil disputes; 97 were court orders; and 22 were search warrants.
Efforts to slip subverted software into online package registries to facilitate supply chain attacks have increased in recent years and PyPI has seen its share of suspect activity. Last August, the Python-focused service warned for the first time of a phishing attack targeting account holders. Since then there have been numerous PyPI incidents reported by security researchers, such as the WASP malware, a fake SentinelOne SDK, a poisoned PyTorch dependency, and a remote access tool dubbed Colour-Blind.
Earlier this week, Durbin told us that PyPI – which had just three admins reviewing security reports from the community – plans to hire a dedicated security engineer and that the PSF was about to make an offer for a security-developer-in-residence.
The Register asked Durbin whether any of the usernames identified were associated with accounts flagged for malware; whether this is the first legal request PyPI has received; and whether PyPI or the PSF plans to start issuing transparency reports of government requests in the future, if there are enough info demands to justify doing so.
We've not heard back as yet. ®