Gamergate tweets and sentiment analysis
The heated #GamerGate debate that has raged on social media in recent weeks has now reached editorial media as well. An analysis of the actual data, done by Brandwatch and published by Newsweek last week, found that the material published on Twitter under that hashtag is indeed more vitriolic and confrontational than it is a discussion of media ethics.
“If GamerGate is about ethics among journalists, why is the female developer receiving 14 times as many outraged tweets as the male journalist? … The discrepancies there seem to suggest GamerGaters cares less about ethics and more about harassing women.”
This is a depressing exposition of a destructive culture, and also an excellent and commendable example of data journalism! We would like to see more such studies!
We have a point to make on the methodology, however.
Using an algorithm that looks for positive and negative words, Brandwatch found most tweets were neutral in sentiment. “If our algorithm doesn’t identify a tweet as positive or negative, it categorizes it as neutral,” a Brandwatch representative told Newsweek.
This, of course, is a reasonable strategy. If the algorithm detects no attitude in a text, it makes sense either to leave the text unlabelled or to label it neutral with respect to the attitude in question. In the reported study, only about 10 per cent of the tweets are assigned a sentiment.
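To make the strategy concrete, here is a minimal sketch of a word-counting classifier with a neutral fallback. It is not Brandwatch's actual algorithm, which we have not seen; the word lists `POSITIVE` and `NEGATIVE` and the helper names `polarity` and `coverage` are our own toy assumptions.

```python
import re

# Hypothetical toy lexicons; a real system would use far larger word lists.
POSITIVE = {"good", "great", "love", "beautiful"}
NEGATIVE = {"bad", "hate", "sick", "terrible"}


def polarity(text):
    """Label a text positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # No lexicon hits, or hits that cancel out: fall back to neutral.
    return "neutral"


def coverage(texts):
    """Share of texts that receive a non-neutral label."""
    if not texts:
        return 0.0
    labelled = sum(polarity(t) != "neutral" for t in texts)
    return labelled / len(texts)
```

With these toy lists, a text such as "the death threats we've received" falls through to neutral simply because none of its words is in either list, which is how a cautious lexicon ends up labelling only a small share of the material.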
But this is a telling example of how little is expected of textual sentiment analysis. A quick glance at the data shows that labelling only 10 per cent of the tweets is far too cautious: it errs on the side of precision at the cost of recall. There is much more positive and negative sentiment to be found and, besides, “Positive” and “Negative” only scratch the surface of the multi-dimensional and many-faceted space of human emotion, mood, and sentiment.
When we run our analysis on a similar set of data (thank you, @waxpancake), we get very different results. We do not limit ourselves to positive and negative sentiments, and we assign some attitude to about 70 per cent of the tweets.
Prominent among them were sentiments such as fear (… I am a terrified feminist and need to project this fear on to everyone else …), disgust (… I’m sick of these people using women to deflect from …), and violence (… somehow the death threats we’ve received don’t get front page …), but we also found emotions such as love (… it’s beautiful and makes me want to cry …), sense of duty (… does not mean their actions will represent social justice …), and incredulity (… so weird, weird, just weirdo bizzare …).
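As a rough illustration of what labelling along several dimensions at once can look like, here is a small sketch. It is not our production model; the cue words in `CUES` and the function name `attitudes` are hypothetical, chosen only to match some of the categories mentioned above.

```python
import re

# Hypothetical cue words per attitude; the category names come from the
# examples above, the word lists are toy assumptions.
CUES = {
    "fear": {"terrified", "fear", "afraid"},
    "disgust": {"sick", "disgusting"},
    "violence": {"death", "threats", "kill"},
    "love": {"beautiful", "love"},
    "incredulity": {"weird", "weirdo", "unbelievable"},
}


def attitudes(text):
    """Return every attitude whose cue words occur in the text, sorted by name."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(label for label, cues in CUES.items() if words & cues)
```

A text can carry several labels or none. An empty list plays the role of "neutral" here, but it now means that none of many attitudes was found, rather than just neither positive nor negative.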
We feel very strongly about this. Analysis of text along the single dimension of positive and negative is misguided: it either misses out on the action (as in the study reported by Newsweek) or forces everything onto a very simplistic model of emotion (which would be worse). We should accept the fact that human emotional and attitudinal behaviour is complex, and we should model it accordingly!