This is our blog. We write about cool experiments, interesting case studies, open talks and technicalities.


A brief history of word embeddings (and some clarifications)

One of the strongest trends in Natural Language Processing (NLP) at the moment is the use of word embeddings, which are vectors whose relative similarities correlate with semantic similarity. Such vectors are used both as an end in themselves (for computing similarities between terms) and as a representational basis for downstream NLP tasks like text classification, document clustering, part-of-speech tagging, named entity recognition, sentiment analysis, and so on. That this is a trend is obvious from the proceedings of the recent large NLP conferences, e.g. ACL or EMNLP. For the first time (ever), semantics was…
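To make "relative similarities" concrete, here is a minimal sketch of how the similarity between two word vectors is commonly measured, using cosine similarity. The three-dimensional vectors and the words chosen are hypothetical toy values invented for illustration; real embeddings are learned from text and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Similarity is undefined for a zero vector; return 0.0 by convention.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumed values, not taken from any real model).
embeddings = {
    "coffee": [0.9, 0.1, 0.0],
    "tea": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

coffee_tea = cosine_similarity(embeddings["coffee"], embeddings["tea"])
coffee_car = cosine_similarity(embeddings["coffee"], embeddings["car"])
print(f"coffee ~ tea: {coffee_tea:.3f}")
print(f"coffee ~ car: {coffee_car:.3f}")
```

With these toy values, "coffee" scores much closer to "tea" than to "car", which is the kind of behaviour a useful embedding space is expected to show.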


Social Media Syndromic Surveillance

The Public Health Agency of Sweden has an initiative called Hälsorapport, which is part of the European system Influenzanet, whose overall goal is to “monitor the activity of influenza-like-illness (ILI) with the aid of volunteers via the internet.” The goal of Hälsorapport is similarly to monitor the spread of diseases in Sweden and to inform the general public, the health care system, and other government agencies about the current health status of Sweden. The monitoring is done by eliciting weekly reports from volunteers regarding their general health status, and in particular regarding any symptoms they might have. According to the website,…


Gavagai on the 33-list

We are pleased to announce that Gavagai made it to the prestigious 33-list, Sweden’s top list of innovative high-tech companies. The list is compiled yearly by Ny Teknik and Affärsvärlden, Sweden’s leading technical and business magazines respectively. The honor was awarded by the editor in chief of Affärsvärlden, Jon Åsberg. “There is no shortcut to understanding the wealth of information found in human language. This requires specialized technology, which is what we build at Gavagai. Our goal is to build tools that allow every creative developer to tap into this knowledge,” says Dr. Jussi Karlgren, co-founder of Gavagai.


Miserable Monday and the Effect of Vacation in Swedish Social Media

Recently, we found out that Miserable Monday might be nothing but a myth. As avid fans of the idea of banishing Mondays completely, we will need more than a couple of news articles to be convinced. Luckily, Ethersource is more than ready to clear up any doubts. For some time, we have been monitoring the Swedish domain of social media, and how people are feeling when talking about themselves. The curves have been steadily working their ups and downs. However, these past few months we have been noticing a very curious occurrence. First, let’s take a look at…


Measuring the popularity of the contestants in the Eurovision Song Contest using Twitter

In this post, we confirm that Loreen is well placed to win the popular vote in the Eurovision Song Contest final 2012. We use Twitter to measure the popularity of the contestants in ESC 2012. When scaling with Twitter penetration, Sweden gets the highest relative popularity score. This is in line with current betting odds, which unanimously rank Sweden as the most likely winner. Gavagai has previously made accurate forecasts of the distribution of the popular vote in the national ESC final. We have previously shown in this blog that Ethersource monitoring of on-line sentiment can predict the popular vote…


Monitoring on-line social media for as-it-happens customer churn related to mobile network operators in the US

Churn is a measure of customers leaving a subscription-based service over time. In this post, we use Ethersource to demonstrate real-time monitoring of churn-propensity related to telecom services; characterize customer churn by means of annoyance, uncertainty, change, and negativity; identify and extract, in real time, the source documents provoking the churn for a service (in this particular case, a rumour surrounding Sprint Nextel’s service). A challenging question in subscription-based industry segments is: As a service provider, how do I detect that a churn-provoking event is taking place, in a timely manner permitting me to act on that information, in order to short-circuit…


Gavagai analyses the Greeks’ attitudes toward the cancelled referendum and the Eurozone

Greece has now officially scrapped plans for a referendum on the euro bailout plan. Our research shows that a small majority (53%) of Greeks did not want the referendum, the exact subject matter and formulation of which remain unspecified and unclear. In our view, a majority (79%) of Greeks want to remain in the eurozone. We note, however, that willingness to keep the euro falls dramatically when the issue is raised in the context of austerity measures, which leads us to believe that if a referendum – on whether to remain in the eurozone subject to harsh austerity conditions –…


Analysis of buzz, hatred, and associations during the unraveling of Håkan Juholt's accommodation reimbursements affair

In this post, we analyze the mention frequency, the strong negative sentiment (or hatred), and the terminology associated with the leader of the Swedish Social Democrats, Håkan Juholt, during the unraveling of his accommodation reimbursements affair. The analysis covers the Swedish blogosphere between October 1 and 19, 2011, and uses Ethersource technology and our proprietary associations engine. The short version of the story is: Juholt went into the accommodation affair in early October with a fairly low buzz in the blogosphere, and a reasonable level of strong negative sentiment or hatred expressed towards him considering he is a leading politician…


New words in New Text

New Text is what we like to call the sort of spontaneous non-edited material we spend much of our time processing. We contrast this primarily with traditional text from editorial sources. There are interesting differences between new text and traditional text, and this has been the subject of much debate in philological, sociological, and to some extent even computational circles. Much of what has been said is interesting, much is pure piffle, and we have made our own pronouncements about what sort of changes we believe are ahead (this one pronouncement in Swedish). We expect we will have reason…


The killing of Mashaal Tammo through the eyes of Arabic social media

In this post, we show three things: The possibility of using Ethersource to monitor Arabic social media to detect violent on-line chatter, and to identify the real-world events underlying the resulting signal. On the evening of Friday, October 7, 2011, Kurdish opposition politician and founder of the Kurdish Future Movement Party Mashaal Tammo was shot dead by masked men in his home in northeastern Syria. His killing was soon attributed to the regime of Syria. The next day, Saturday, October 8, the funeral party for Tammo, with 50,000 to 100,000 attendees, turned into the largest gathering of protesters since…