Boffins from the Netherlands and France claim that the word choices and sentence construction in President Donald Trump's tweets can be used more often than not for lie detection.
In a paper distributed through ArXiv earlier this month, researchers Sophie van der Zee, Ronald Poppe, Alice Havrileck, and Aurelien Baillon – from Erasmus University, Utrecht University, and École Normale Supérieure de Cachan – describe how they found significant linguistic differences between factually accurate and inaccurate Trump tweets, and used this finding to construct a language-based lie detection model.
The accuracy of their model was about 73 per cent, making it better than a coin-toss, but far from foolproof in its evaluation.
For their data set, the researchers used tweets from President Trump that had been fact-checked by the Washington Post and could be characterized as either accurate or not. They began with 605 presidential tweets from the Twitter account @realDonaldTrump, posted between February and April 2018, then winnowed that down by removing retweets and web links. The result was a data set of 447 tweets.
Of these, almost 30 per cent were deemed factually incorrect by fact checkers.
Using a statistical technique known as multivariate analysis of variance (MANOVA), the researchers evaluated the language of the tweets to see whether their model's characterization reflected the established accuracy or inaccuracy of the statements.
Their hypothesis, that veracity shows up in language, was supported by their findings. They detected linguistic differences between accurate and inaccurate tweets and used these differences to classify the tweets correctly as true or false about 73 per cent of the time.
The researchers assume that language does not differ with mistakes, because mistakes represent unintentional inaccuracies. Rather, they say, it's lies that distort language.
"Being wrong should not affect language use because there is no difference in the perception or intention of the sender," the paper explains. "In contrast, when deliberately presenting false statements as truths, one would expect a change in language use, according to the deception hypothesis. Lying can cause behavioral change because it is cognitively demanding, elicits emotions, and increases attempted behavioral control."
Applying this model to a second dataset of 464 tweets (about 22 per cent of which were deemed factually inaccurate) covering the period between November 2017 and January 2018, the researchers' predictions for the tweets conformed with the ground truth established by fact checkers about 73 per cent of the time.
The boffins found that correct statements contained more positive feelings while incorrect statements were more evasive and had more negations, tentative words, and comparisons. Also, fewer # and @ symbols appeared in incorrect tweets.
Tweets with money-related words were found to be more likely to be false while tweets with religious terminology were less likely to be false.
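To give a feel for what counting such cues might look like in practice, here is a minimal Python sketch. It is not the researchers' method: the word lists and the `tweet_features` function are made up for illustration, and the paper's actual lexicon and classifier are not reproduced here.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only;
# they are NOT the lexicon used in the paper.
NEGATIONS = {"no", "not", "never", "none", "nothing"}
TENTATIVE = {"maybe", "perhaps", "possibly", "probably", "seems"}
MONEY = {"money", "dollars", "tax", "taxes", "trade", "billion"}


def tweet_features(text: str) -> dict:
    """Count a few surface cues of the kind the article describes."""
    # Lowercase the text and pull out runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "negations": sum(w in NEGATIONS for w in words),
        "tentative": sum(w in TENTATIVE for w in words),
        "money": sum(w in MONEY for w in words),
        "hashtags": text.count("#"),   # raw count of '#' symbols
        "mentions": text.count("@"),   # raw count of '@' symbols
        "n_words": len(words),
    }


if __name__ == "__main__":
    print(tweet_features("Maybe not. #jobs @someone spent no money"))
```

Counts like these would then be fed to a statistical model alongside the fact checkers' labels; on their own they say nothing about whether any single tweet is true.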
This technique could be used to help journalists and fact checkers evaluate the veracity of social media content, the researchers suggest, and they believe it can be made more accurate by combining it with other lie detection methods such as keystroke analysis.
However, they caution that anyone could use this approach to construct a lie detector for a specific person. "Therefore, these results also constitute a warning to all posting a wealth of private information online," they conclude. ®