2011-12-06

Real-time Syndromic Surveillance of Social Media for Disease Symptoms related to Seasonal Influenza

  • We monitor social media in real time for disease symptoms
  • There is still no evidence of an outbreak of the seasonal flu in Sweden
  • We do, however, observe an increasing trend in the intensity of symptoms

The inevitable influenza season will soon come knocking on our doors. How do we know when it has started, and how do we know just how severe it is? To this end, there are on-line tools for syndromic surveillance that help individual medical practitioners and national disease control centers alike to combat the spread of influenza. Internationally, perhaps the most well-known monitoring service is Google Flu Trends. Nationally, Influensakoll keeps track of the current state of flu-related illness in Sweden. Along the same lines, research carried out at the Swedish Institute for Infectious Disease Control (SMI) shows the feasibility of using search queries submitted to the medical web site Vårdguiden for outbreak detection and monitoring. SMI also publishes weekly influenza reports based on input from labs and sentinels.

In addition, there is a growing effort in the research community to mine on-line social media for early warning and outbreak detection, which health authorities can use to plan and carry out targeted counter-measures against epidemic diseases. Most of this work looks at Twitter, in English, and relies on keywords only. Another interesting approach is that taken by the Iowa Electronic Health Market, which is a prediction market for syndromic surveillance.

While the above-mentioned services and research rely either on active participation by the users, or on keyword matching in social media feeds to find patterns, we’ve taken a different route to finding out the state of illness in Sweden. We’ve enhanced the barometer introduced earlier with concepts (not keywords) corresponding to a range of disease symptoms such as migraine, fever, expectoration, headache, nausea, sore throat, and head cold. This facilitates the triangulation of more complex illnesses without having to wait for the bloggers, tweeters, forum participants, and facebookers out there to become so ill that they either actively seek answers related to their health condition, or start communicating using the actual name of the disease.

Our approach attempts to catch signs of illness early on, expressed as the participants in social media do what they usually do, that is, communicate with their peers. By focusing on the symptoms, we believe it is possible to get an early warning of the seasonal flu, before anyone realizes that the flu is what they are actually talking about. The image below illustrates the discrepancy between the score for the concept of influenza (the green, nearly flat line at the bottom of the graph) and the scores for some of the symptoms of influenza: expectoration (blue line), headache (red line), and fever (yellow line). Clearly, people have not yet experienced the flu strongly enough to talk about it, although they talk loudly about some of its symptoms. Note that the graph reveals an increasing trend in the intensity of the symptoms! The Ethersource-based barometer thus serves as a complement to other surveillance tools in that it picks up on trends of (combinations of) symptoms earlier.

Expressions of the concepts expectoration, headache, fever, and influenza in Swedish social media, early December 2011. Note that while the influenza score is constantly low, the other three symptoms vary with the time-of-day, taking precedence over each other in various ways.
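
To make the idea of an "increasing trend" in symptom scores concrete, here is a minimal sketch in Python. It is not how Ethersource works internally; it only assumes that each concept yields a series of numeric scores over time. The function names, the thresholds `low_level` and `min_slope`, and the numbers in `example_scores` are all hypothetical and chosen purely for illustration.

```python
from statistics import mean


def trend_slope(scores):
    """Least-squares slope of the scores against their time index (0, 1, 2, ...)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two observations")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(scores)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def early_warning(series, disease="influenza", low_level=1.0, min_slope=0.0):
    """Return the symptoms that trend upward while the disease concept stays low.

    `series` maps a concept name to its list of scores. The thresholds are
    illustrative assumptions, not values taken from the barometer.
    """
    if max(series[disease]) >= low_level:
        # People already talk about the disease by name, so this is no longer
        # an *early* signal.
        return []
    return sorted(
        name
        for name, scores in series.items()
        if name != disease and trend_slope(scores) > min_slope
    )


# Made-up scores for five consecutive time steps (hypothetical data).
example_scores = {
    "influenza": [0.2, 0.1, 0.2, 0.1, 0.2],
    "fever": [2.0, 2.5, 2.4, 3.1, 3.3],
    "headache": [4.0, 3.8, 4.4, 4.6, 5.0],
    "expectoration": [3.0, 3.0, 2.9, 3.0, 2.8],
}

if __name__ == "__main__":
    print(early_warning(example_scores))
```

With the made-up numbers above, fever and headache trend upward while the influenza score stays flat and low, which is the kind of discrepancy the graph shows. A real deployment would also have to account for the time-of-day variation mentioned in the caption, for instance by comparing the same hours across days.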

Gavagai’s Ethersource technology allows the kind of syndromic surveillance of disease symptoms described in this blog post to be carried out in real time, in any language.

Category: case studies, sentiment analysis