Stanford Internet Observatory raises alarm over 'serious failings with the child protection systems at Twitter'

Researchers find 40 matches for known child abuse images in sample of 100,000 tweets

Twitter failed to take down numerous images of child sexual abuse material posted on its platform over two months, researchers at the Stanford Internet Observatory will allege in an upcoming report.

The academic group said it was looking at the wider issue of child exploitation online when it "discovered serious failings with the child protection systems at Twitter."

Using the Elon Musk-owned social media company's API, researchers studied the metadata in tweets and parsed the URLs of images to scan for pictures depicting child sexual abuse material (CSAM), employing a tool called PhotoDNA developed by Microsoft.

Algorithms converted suspected CSAM images into hash values, which were then compared against a database, maintained by the National Center for Missing & Exploited Children (NCMEC), of hashes representing known illegal images of minors. A hash match suggests that the photo posted on Twitter is a known CSAM image from the database.

Hash values allow those called on to analyze CSAM to do so without having to actually view or unlawfully store the images.
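To illustrate the general idea of hash-set matching, here is a minimal sketch in Python. It is not PhotoDNA, which is a proprietary perceptual hash designed to tolerate minor alterations to an image. As a stand-in, the sketch uses SHA-256, which only matches byte-identical data. The function names, item IDs, and byte strings are all hypothetical placeholders.

```python
import hashlib


def digest(data: bytes) -> str:
    """Stand-in hash: SHA-256 hex digest of the raw bytes (NOT a perceptual hash)."""
    return hashlib.sha256(data).hexdigest()


def find_matches(candidates: dict[str, bytes], known_hashes: set[str]) -> list[str]:
    """Return the IDs of candidates whose digest appears in the known-hash set."""
    return [
        item_id
        for item_id, data in candidates.items()
        if digest(data) in known_hashes
    ]


# Hypothetical data: the checker holds only hash values, never the source material.
known_hashes = {digest(b"known-item-1"), digest(b"known-item-2")}

candidates = {
    "tweet-a": b"unrelated",
    "tweet-b": b"known-item-2",
    "tweet-c": b"other",
}

matches = find_matches(candidates, known_hashes)
print(matches)  # ['tweet-b']
```

The lookup needs only the set of hash values, which is why an analyst can flag a match without holding or viewing the underlying image.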

David Thiel, chief technologist at the Stanford Internet Observatory and co-author of the report, told The Wall Street Journal that researchers detected 40 hash matches in a sample of 100,000 or so tweets from March 12 to May 20. The academic group said it reported the issue to NCMEC in April, but the images weren't removed until May 20.

"Having no remaining Trust and Safety contacts at Twitter, we approached a third-party intermediary to arrange a briefing. Twitter was informed of the problem, and the issue appears to have been resolved as of May 20," the Observatory stated on Twitter. 

"Twitter is by no means the only platform dealing with CSAM, nor are they the primary focus of our upcoming report. Regardless, we're glad to have contributed to improving child safety on Twitter, and thank them for their help in remediating this issue," the org added.

Under Elon Musk's ownership, Twitter has raised the price of access to its API. A leaked slide deck previously suggested prices starting at $42,000 per month.

Stanford Internet Observatory's Thiel criticized Twitter for raising prices, and said doing so makes it more difficult for academics to study the platform and hold it accountable.

The Observatory relied on Twitter's API for its investigation, and had previously negotiated a deal to use the tool for free – but has stopped using the enterprise-level tier as of last week due to cost concerns, Thiel said.

A spokesperson from the Stanford Internet Observatory told The Register: "It is no longer possible to conduct a study like this using Twitter's API. The level of access needed for academic studies is no longer affordable and researchers are required to delete the data they previously collected under academic data access agreements."

In March 2023, Twitter's developer team said it was working on ways to help academics, who need to make a high volume of API requests for their research, access its software. It also said academics could use its free or basic tiers. The spokesperson, however, said Twitter has not come up with an alternative solution for researchers, and that the lowest-tier package, costing $42,000 per month, offers less data than was previously available to some academics for free.

Meanwhile, the free tier reportedly provides "read-only" access, and doesn't have the flexibility required to support academic research.

Twitter did not respond to The Register's request for comment. Indeed, the press@twitter.com email address still auto-responds with nothing but a poop emoji. ®

Updated to add

David Thiel of the Stanford Internet Observatory got back to us with some thoughts we can share here:

"This sample of tweets was part of a wider investigation on the dynamics of child sexual exploitation across social media platforms. The tweets were from monitoring a stream of metadata matching keywords and hashtags that might indicate CSAM distribution.

"This of course makes it a higher rate than one might expect from a representative sample, but we did not expect any known CSAM to be detected at all. Online platforms almost always automatically identify and remove known CSAM based on PhotoDNA matches or other hash sets – so detecting PhotoDNA matches on a public surface is a surprising finding, indicating broken CSAM detection at the company itself.

"Whether it was a problem pre-Musk or any potential change in scale is outside of the scope of what we were able to study, but Twitter was presumably at one point using PhotoDNA correctly. It's unclear when that particular problem arose."
