Researchers find Meta's withdrawal of misinformation tool hard to swallow
If the new system is so good, why not onboard everyone who accessed the old system?
While Meta faces formal proceedings from the European Commission, academics and other researchers have criticized its provision for monitoring misinformation on its social media platforms.
With the UK and US gearing up for general elections – and with the EU elections just passed – critics argue that the company, which owns Facebook, WhatsApp, and Instagram, is failing to offer a viable alternative to its political misinformation monitoring tool, CrowdTangle, which closes in August.
In April, the European Commission took action against Meta using the recently introduced Digital Services Act. As well as claiming a failure to properly monitor distribution of political misinformation by "foreign actors" before June's European elections, commissioners said Meta's decision to "deprecate" CrowdTangle could contravene the Act.
Meta has proposed replacing CrowdTangle with Meta Content Library (MCL) and Content Library API to "provide comprehensive access to the full public content archive from Facebook and Instagram."
However, researchers told The Register they feared the new tool was not fit for purpose.
Kalina Bontcheva, a professor at the University of Sheffield, England, has used natural language processing to study misinformation on social media platforms. As part of that work she has created tools that help journalists and NGOs import and analyze data from CrowdTangle.
Because CrowdTangle allows users to download data in the form of a CSV file, researchers can provide proof to support their studies, she said. "Science needs to be repeatable, which essentially means that we have to be able to prove our findings. This could be for other researchers, but also somebody questioning the validity of our findings."
However, Meta's replacement for CrowdTangle is an online environment that does not allow users to download data, she said.
"The new system is a non-starter for us because there's no repeatability there. There's no evidence. We're not allowed to download data, which is the ultimate proof that what we claim in our papers is true."
She also said MCL periodically wipes data, meaning researchers cannot go back in time to prove earlier findings.
Another barrier to MCL replacing CrowdTangle is that the social media giant insists users sign a contract containing a $1 million liability clause, intended to mitigate risks Meta may face from others using the tool.
While journalists will no longer be allowed to use the misinformation services, Meta has argued that anyone affiliated with a non-profit entity that holds scientific or public interest research as a primary purpose can apply for access.
However, Bontcheva argued that the vetting process would make it impractical for NGOs to sign up.
For example, in September, Brussels-based NGO EU DisinfoLab exposed a Russia-based influence operation, Doppelganger, which uses multiple copies of authentic media outlets, including Bild, 20minutes, Ansa, The Guardian, and RBC Ukraine, and targets users with fake articles, videos, and polls designed to undermine support for Ukraine following Russia's invasion of the Eastern European nation.
EU DisinfoLab used CrowdTangle to gather the data for the Doppelganger investigation and its other research.
Alexandre Alaphilippe, executive director, told The Register: "When you are investigating these kind of campaigns, the main question is not that much about the content, but how it is being distributed to millions of people. Of course, the platforms play a crucial role.
"For a very long time, CrowdTangle has been the tool the [research] community has been using to look at how some content was shared on Facebook, how the same kind of content was shared between different URLs, or differences between groups … to be able to understand coordinated behavior for the same kind of content being distributed across multiple groups at the same time. It's been great to actually do that kind of research to show who was behind a campaign."
Alaphilippe said that although the research community was aware that Meta Content Library was replacing CrowdTangle, it had yet to be given access to the tool, so researchers remained skeptical.
"It might be a fantastic tool, but it is quite obscure right now with what's inside it. If the Content Library is such a great tool, it's a bit exceptional that the people who were granted access to CrowdTangle don't have the capacity to be automatically onboarded on it. That is not reinforcing trust between Meta and the research community."
Meanwhile, researchers have noticed that CrowdTangle's performance has been deteriorating for the past three years, while the Meta team responsible for it has gradually been reassigned to other tasks, frustrating researchers and journalists, he said.
Meta said CrowdTangle's performance and design were behind its decision to downgrade and discontinue the tool. The tech giant said MCL would allow users to download a subset of publicly accessible Facebook and Instagram content posted by widely known figures and organizations. It said the data was accessible in a downloadable CSV format through the user interface.
It also said that although not all public content was supported in MCL's user interface, the library held comprehensive data from a wider range of accounts and profiles than CrowdTangle.
Speaking at an MIT Technology event, Nick Clegg, Meta's president of global affairs, said CrowdTangle was a degrading tool.
"It just doesn't give you any complete and accurate insight into what's going on Facebook. It's not integrated in our systems [which are] evolving all the time. It was built for a wholly different purpose," he said. The former deputy prime minister of the UK argued that CrowdTangle only measured a "narrow cake slice" of social media engagement and doesn't tell researchers what platform users are seeing online.
He said Meta had worked with researchers to build MCL, which "provides a far more comprehensive and complete view about what people actually see and experience" on Meta's platforms.
Meta brings in revenue of around $140 billion, with a market cap of more than $1 trillion. It bought CrowdTangle in 2016 for an undisclosed sum.
Alaphilippe said that if CrowdTangle's functionality was failing, that was Meta's choice. "When I saw some comments about how it was not working or was not reflecting what was in the platform, that's not somebody else's fault," he said.
Meanwhile, the research community was concerned about the quality of data Meta had included in publicly available dashboards during the recent European elections.
In May, The Register reported that Joe Biden spends more on Facebook and Instagram ads than Donald Trump, but ads attacking the US president outnumber those attacking his likely rival in this year's presidential election. The team led by Professor Jennifer Stromer-Galley, senior associate dean at Syracuse University's School of Information Studies, used a Neo4j database – with $250,000 backing from the vendor – to store and analyze the data, collected using CrowdTangle and the Facebook and Instagram ad library API.
Stromer-Galley told The Register that the closure of CrowdTangle could shut off a data pipeline for her team's research, as the team has no other way to search posts on public pages and accounts with more than 10,000 followers.
Her meetings with Meta had provided no good answer for why CrowdTangle is going away, she said.
Another downstream problem created by CrowdTangle's retirement lies in the processing of data. While Meta offers a Jupyter Notebooks environment in which to write Python or R scripts, run analysis, and download the results, Stromer-Galley has built a long-term project to understand message content. It creates classifiers and tags the data using Google's natural language model, BERT.
"The problem from my work is that because I'm studying the content of the messages, I've got these classifiers that I've built, there is no way for me to add them in the current [Meta] environment," she said. "I can't tag any of the messages with my classifiers because it requires me to pull those messages and run them through the pre-trained BERT model to then tag them. That's not an infrastructure that is supported in this environment."
On April 30, the European Commission asked Meta to communicate within five working days which remedial actions it had taken to allay its concerns, including the deprecation of CrowdTangle.
The executive of the European Union told The Register it had noted Meta's deployment of 27 new real-time visual dashboards in CrowdTangle ahead of and during the European Parliament elections that took place over June 6-9.
The Commission has also asked Meta to communicate which remedial actions it intends to take to alleviate the risks related to the deprecation of CrowdTangle.
"The Commission will monitor the effective rollout of these functionalities and will continue working with Meta towards more permanent solutions that meet all its concerns as set out in the opening decision. The formal proceedings against Meta remain open," a spokesperson said. ®