This has not been a good year for opinion pollsters, most of whom failed to predict either Britain’s vote to leave the European Union or the election of Donald Trump as US president.
To be fair, both were close races. Wise pollsters offer probabilities rather than certainties: FiveThirtyEight’s Nate Silver, who successfully predicted the results in all 50 states for the 2012 presidential election, gave Trump a 29 per cent chance of winning this time around compared with an eight per cent chance of Mitt Romney beating Barack Obama. Even so, his model still made Hillary Clinton the more likely winner.
But Loughborough University’s Emotive model put Trump consistently ahead of Clinton in the three weeks before the election. Emotive doesn’t use polls – it tracks “emotions” in social media.
The university developed Emotive in 2013 to monitor the potential for civil unrest following 2011’s riots, with funding from the Ministry of Defence’s Science and Technology Laboratory and the Engineering and Physical Sciences Research Council. It used Emotive to predict the narrow Conservative win in 2015’s general election.
Emotive differs from other social media trackers in its use of linguistic analysis. It looks for emotion-laden phrases as well as single words, so it can tell the anger in “I feel cross” from the neutral “I cross the road”. It also recognises parts of words, such as the suffix ‘-phobic’.
“If you like, it’s a map of words,” says Martin Sykora, a lecturer in information management at Loughborough University. “We can analyse the meaning behind messages a little bit more meaningfully than with purely statistical-based approaches.”
Emotive has linguistic models for British, American and Canadian English to cope with each country’s slang.
For the 2016 presidential election, Loughborough used Twitter’s standard API to find tweets that included hashtags for the candidates. (It only uses public tweets and the researchers anonymise any they use individually.) Emotive produced a near real-time online projection of which side was winning based on the stability of volumes of emotion – with smooth, consistent levels over several hours or days beating spikes and falls. And while the volume of emotions tweeted on Mr Trump remained fairly smooth, emotional volumes on Mrs Clinton were increasingly choppy.
In the Emotive system, smooth is good and choppy bad.
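The article does not give Emotive’s actual formula, so the following is only a rough sketch of the “smooth beats choppy” idea. It assumes, purely for illustration, that choppiness can be measured as the average relative change between consecutive hourly counts of emotion-bearing tweets. The function names and the figures are invented and are not taken from the Loughborough system.

```python
from statistics import mean


def choppiness(volumes):
    """Mean absolute relative change between consecutive readings.

    Lower values mean a smoother series. This metric is an assumption
    made for illustration, not Emotive's published method.
    """
    if len(volumes) < 2:
        raise ValueError("need at least two readings")
    # Skip steps that start from zero to avoid dividing by zero.
    changes = [abs(b - a) / a for a, b in zip(volumes, volumes[1:]) if a > 0]
    if not changes:
        raise ValueError("no usable consecutive readings")
    return mean(changes)


def smoother_candidate(series_by_candidate):
    """Return the candidate whose emotional volume is least choppy."""
    return min(series_by_candidate,
               key=lambda name: choppiness(series_by_candidate[name]))


# Invented hourly counts of emotion-bearing tweets (not real data).
hourly = {
    "candidate_a": [100, 104, 98, 101, 103, 99],   # fairly smooth
    "candidate_b": [100, 160, 70, 150, 60, 140],   # spikes and falls
}

if __name__ == "__main__":
    for name, series in hourly.items():
        print(name, round(choppiness(series), 3))
    print("ahead on stability:", smoother_candidate(hourly))
```

Under this assumed metric the direction of the emotion never enters the calculation, only how much its volume moves, which matches Sykora’s description of caring about change rather than polarity.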
“We try to measure the uncertainty around these basic emotions,” says Sykora. “We found that for the general election last year in the UK that worked quite well, in terms of understanding how much uncertainty there is around how people feel about a candidate.”
Although Emotive tracks specific emotions, these were not used in tracking who was ahead. “It’s the resonance an emotion has” rather than which emotion is engaged, says Sykora. “Anger can be positive and negative, so we don’t really care about the direction of it. But if it changes a lot, that means something.”
In the three weeks before the election, Mrs Clinton was only ahead twice for brief periods. “I kept thinking, Donald Trump seems to be in the lead based on this,” Sykora recalls. “I couldn’t quite believe it myself, but it kept telling us this.”
Different words and phrases carry varying weights: the system differentiates emotions by perceived strength. Fear, for instance, could be expressed with “uneasy”, “fearful” or “petrified”, with each expression carrying a successively higher emotional score.
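As a minimal sketch of how a weighted lexicon of words, phrases and suffixes might be applied, the example below scores a message against a tiny hand-made table. The entries, the numeric weights and the `score_text` helper are all hypothetical; the real Emotive lexicon and scoring rules are not described in enough detail to reproduce.

```python
import re

# Hypothetical lexicon: entry -> (emotion, strength). Weights are invented.
PHRASES = {
    "uneasy": ("fear", 1),
    "fearful": ("fear", 2),
    "petrified": ("fear", 3),
    "feel cross": ("anger", 2),   # a phrase, so a bare "cross" is not matched
}
SUFFIXES = {
    "phobic": ("fear", 2),        # e.g. "claustrophobic"
}


def score_text(text):
    """Return a dict of emotion -> total strength found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    normalised = " ".join(words)
    totals = {}
    # Whole words and multi-word phrases.
    for phrase, (emotion, weight) in PHRASES.items():
        hits = len(re.findall(r"\b" + re.escape(phrase) + r"\b", normalised))
        if hits:
            totals[emotion] = totals.get(emotion, 0) + weight * hits
    # Parts of words, matched as suffixes.
    for word in words:
        for suffix, (emotion, weight) in SUFFIXES.items():
            if word.endswith(suffix) and word != suffix:
                totals[emotion] = totals.get(emotion, 0) + weight
    return totals


if __name__ == "__main__":
    for message in ("I feel cross", "I cross the road",
                    "I am petrified", "She is claustrophobic"):
        print(message, "->", score_text(message))
```

Matching on the phrase “feel cross” rather than the single word “cross” is what keeps “I cross the road” from registering any emotion in this toy version.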
According to the academics, their system was built from numerous corpora, existing lexicons and large social media datasets, with the Emotive team led by a linguist with a PhD in English and experience in linguistics and discourse analysis. It was subsequently validated on annotated data – a standard process in the information retrieval field.
The method is not foolproof.
Sykora and other academics analysed tweets from Paris during and immediately after last year’s terrorist attacks, using machine translation from French to English. With geolocation, they identified a significant cluster of tweets expressing sadness made around the Bataclan theatre.
Someone tweeting about TV programme Desperate Housewives generated a false positive cluster in south-west Paris – but the writers argued that if refined this kind of model could spot areas suffering strong emotional reactions after a disaster.
Also, there are bots. How did the Loughborough team navigate these?
Tom Jackson, Loughborough’s professor of information and knowledge management, said they conducted a qualitative analysis of the tweets and found that the highly emotional messages in particular were certainly genuine. “In fact the Trump bot wasn’t very emotive so wasn’t picked up by Emotive for the majority of cases. However, the dilemma we had was if a bot tweet received many retweets, should it be removed or kept as people had bought into it?”
Does social media, therefore, really provide a clearer view of what people feel than opinion polls? “The wealthy have always had good communication channels that can influence people and policy,” says Jackson.
“The age of digital communication has provided a communication channel to the not so well, if at all connected, working class. Pollsters could be seen as part of the establishment and that is why when they ask people who they might vote for they might receive a reply that doesn’t reflect who they will vote for.”
Social media certainly has far greater reach than any opinion poll could. Pew research on social media usage reports that 24 per cent of online American adults (and 21 per cent of all adults) use Twitter. Usage is skewed towards the young and better educated, with just 10 per cent of over-65s and 20 per cent of those without degrees tweeting, although the service is used by similar proportions of urban, suburban and rural internet users.
Facebook is used by 68 per cent of American adults with consistently high figures across demographic groups, suggesting its data could provide even greater insights. However, Loughborough has generally relied on Twitter, where almost all tweets are public and intended for general consumption, rather than Facebook where a large proportion of posts are limited to contacts.
Could voters decide that social media giants are as much a part of the establishment as opinion pollsters, and start ignoring or lying to them too? Jackson thinks not: “As it’s not a snapshot in time, research has shown that it is near impossible to hold onto a façade over a long period of time.”
However, he can see another potential problem: “The new way of manipulating would be the introduction of emotional bots that could change the dynamics and mood of a nation, which is a massive discussion in its own right.”
Is Emotive, therefore, just this election’s Nate Silver? Correct for that time but fated to be overtaken by changing trends?
Sykora is “relatively confident” Emotive will work again and again, but concedes the need for “numerous methodological improvements” to the model. These changes might include mapping a population with “demographically representative” samples of social media users, and accounting for local nuances in different election processes – such as the fact that the US employs an electoral college and other countries do not.
Time, and elections, will tell if he – and they – are literally right. ®