Computer boffins Juan Echeverria and Shi Zhou at University College London have chanced across a dormant Twitter botnet made up of more than 350,000 accounts with a fondness for quoting Star Wars novels.
Twitter bots have been accused of warping the tone of the 2016 election. They also can be used for entertainment, marketing, spamming, manipulating Twitter's trending topics list and public opinion, trolling, fake followers, malware distribution, and data set pollution, among other things.
In a recently published research paper, the two computer scientists recount how a random sampling of 1 per cent of English-speaking Twitter accounts – about 6 million accounts – led to their discovery.
Pursuing an unrelated inquiry, the researchers were examining the geographic distribution of 20 million tweets with location tags in the dataset of 843 million tweets from the account sample, and they noticed an unusual distribution pattern.
Some accounts followed the expected distribution pattern, which coincides with population centers in America and Europe. But another set of accounts showed random distribution within those areas, often resulting in tweets from unlikely places such as seas, deserts, and the Arctic.
Blue dots at edge of box over Europe, barely visible after image compression, show Star Wars bots
When the researchers manually examined the text of these tweets, they found the majority of them consisted of random excerpts from Star Wars novels, and that many of them started or ended with an incomplete word or included a randomly placed hashtag.
Luke's answer was to put on an extra burst of speed. There were only ten meters #separating them now. If he could cover t
"This quote was from the book Star Wars: Choices of One, where Luke Skywalker is an important character," the paper explains. "We have found quotations from at least 11 Star Wars novels."
The manual examination of data associated with 4,942 accounts resulted in the identification of 3,244 bots with consistent characteristics:
- Tweets only random Star Wars quotes.
- Uses hashtags associated with follower acquisition or prepended to random words.
- Never retweets or mentions other Twitter users.
- Each bot has made only 11 or fewer tweets since its inception.
- Each bot has between 10 and 31 friends.
- The bots choose only "Twitter for Windows Phone" as their source application.
- The bots' user ID numbers fall into a narrow range between 1.5 × 10^9 and 1.6 × 10^9.
Given that set of bots, the researchers created a machine learning classifier to hunt for other accounts with similar characteristics. The algorithm identified 356,957 Star Wars bots.
The researchers say they were lucky to have spotted the bots, which appear to have been designed to thwart automated detection methods. They note that being human helped make the discovery possible.
"The fact that the bots tagged their tweets with random locations in North America and Europe was a [deliberate] effort to make their tweets look more real," the paper explains. "But this camouflage trick backfired – the faked locations when plotted on a map seemed completely abnormal. It's important to note that this anomaly could only be noticed by a human looking at the map, whereas a computer algorithm would have a hard time to realize the anomaly."
Curiously, the Star Wars bots have been silent since 2013. The researchers observe that pre-aged bots can be sold for more than newly created bots on the black market, presumably because bot detection methods consider older accounts more likely to be reputable.
Twitter declined to comment on the findings, which may be because the company was unaware of them until now.
"We have not reported the accounts directly to Twitter (yet)," said Echeverria in an email to The Register. "We are waiting for the paper to be approved by the scientific journal to which it was submitted. We would also like to give researchers a chance to get the dataset by themselves before they are gone, this is why we have not reported to Twitter directly, but we will as soon as the paper gets accepted."
Inspired by their success identifying the Star Wars botnet, Echeverria, a research student, and his faculty advisor, senior lecturer Shi Zhou, claim to have identified an even larger botnet numbering half a million accounts.
"The larger botnet is part of a subsequent research paper, which is also under review," Echeverria said. "As soon as it gets approved, I will be able to disclose more information about it."
Echeverria added that there's now a Twitter account named "@thatisabot" to make it easier for people to report bots to researchers.
"Think of it as @spam but for researchers instead of Twitter," he said. "Furthermore, we have a webpage, www.thatisabot.com, which will (soon) also allow people to report bots to researchers."
"Commander, tear this ship apart until you've found those plans and bring me the Ambassador. I want her alive!" ®