Social Media Syndromic Surveillance
The Public Health Agency of Sweden has an initiative called Hälsorapport, which is part of the European system Influenzanet, whose overall goal is to “monitor the activity of influenza-like-illness (ILI) with the aid of volunteers via the internet.” The goal of Hälsorapport is similarly to monitor the spread of diseases in Sweden and to inform the general public, the health care system, and other government agencies about the current health status of Sweden. The monitoring is done by eliciting weekly reports from volunteers regarding their general health status, and in particular regarding any symptoms they might have. According to the website, there are currently 3558 volunteers providing weekly reports about their health. The following plot shows a result from the project. The orange line shows the occurrence of influenza-like diseases (no clear trend thus far for 2014), and the blue line shows the occurrence of respiratory infections: apparently there was a significant outbreak of respiratory infections in Sweden during the first weeks of September.
This is a great initiative; the recent outbreak of Ebola is a chilling reminder of the importance of syndromic surveillance, and the best way to understand how people feel is to ask them how they feel. Asking some 3000 people is a very good start!
Now imagine you could ask everyone. And imagine you could ask them not only once a week, but all the time. That is sort of what we do when it comes to social media monitoring, but with the difference that we listen to what people say rather than eliciting answers from them. Listening has the advantage of avoiding elicitation effects, that is, the tendency of people to overstate (or understate) symptoms when prompted for a report about their health. Furthermore, we listen to the entire Swedish social media feed (which includes the entire Swedish blogosphere, the main forums, the entire Swedish Twitter feed, and all open posts on Facebook), and we listen all the time; whenever someone posts something on social media in Sweden, we (or rather, our systems) read it. Rather than asking a few thousand people how they feel or what they think, we listen to everyone who posts on social media.
We have previously blogged about how we can use monitoring of discussions in social media to measure flu trends. This is not something we do only as an isolated case study. On the contrary, we continuously monitor expressions of a large number of symptoms in Swedish social media. The plot below shows the frequency of mentions of respiratory symptoms in Swedish social media from 2012 to 2014. Note the pronounced spike for 2014 (the red circle): this is the very same outbreak of respiratory infection during the first weeks of September that was detected by Hälsorapport. Note also that similar spikes occur in the same period each year.
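To make the idea of "frequency of mentions" concrete, here is a minimal sketch of how such a count could be computed per week. It is not our production pipeline: the term list `RESPIRATORY_TERMS`, the function name `weekly_mentions`, and the sample posts are all invented for illustration, and a real system would need far more careful language handling.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical term list; the vocabulary actually monitored is not shown here.
RESPIRATORY_TERMS = {"hosta", "hostar", "halsont", "snuva"}

def weekly_mentions(posts):
    """Count posts that mention at least one term, keyed by (ISO year, ISO week)."""
    counts = Counter()
    for timestamp, text in posts:
        tokens = set(re.findall(r"\w+", text.lower()))
        if tokens & RESPIRATORY_TERMS:
            iso = timestamp.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return counts

# Invented sample posts, for illustration only.
sample_posts = [
    (datetime(2014, 9, 1, 8, 15), "Jag hostar hela natten"),
    (datetime(2014, 9, 3, 21, 40), "Halsont och snuva igen"),
    (datetime(2014, 9, 9, 12, 0), "Fint väder idag"),
]

print(weekly_mentions(sample_posts))
```

Each post is counted at most once per week no matter how many symptom terms it contains, which keeps a single long complaint from inflating the curve.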
Since we have no medical expertise, we will abstain from speculating about the causes of these outbreaks of respiratory infection. However, we will make a prediction based on this yearly recurring pattern: we predict that there will be an increase in respiratory infections in Sweden in September 2015 (and in 2016, and in 2017).
The fact that our measures correlate with the report from Hälsorapport demonstrates the viability of using social media for syndromic surveillance. Note the difference between our approach and Google Flu Trends: we monitor the use of terms relating to various symptoms in social media, whereas Google monitors when people use various search terms (on Google). We believe the former approach may lead to earlier outbreak detection, since people typically express themselves very directly and spontaneously in social media, and they post about whatever symptom they might have at the moment, without necessarily realizing they have an infection:
Min hosta dödar mig
— ★Maria Sving★ (@Mariasving) December 4, 2014
(We of course have no idea whether Maria, who writes “my cough is killing me” actually has a respiratory infection or not.)
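One simple way to quantify how well a social media series tracks survey reports such as those from Hälsorapport is a Pearson correlation over weekly values. The sketch below is only an illustration under that assumption: the function name `pearson` is ours, the numbers are invented, and we are not claiming this is the measure behind the plots in this post.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Invented weekly values, for illustration only.
social_media_mentions = [120, 135, 410, 380, 150]
survey_reported_cases = [14, 15, 42, 40, 17]

print(round(pearson(social_media_mentions, survey_reported_cases), 3))
```

A value close to 1 would indicate that the two series rise and fall together. Shifting one series by a week or two before correlating would be a natural way to check whether one source leads the other.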
Another benefit of social media monitoring is the fact that we listen to what people say, all the time. We can monitor expressions of symptoms down to minute resolution, as in the following plot that shows expressions of respiratory symptoms in Swedish social media per minute in the first week of September 2014. Such fine-grained time resolution may be important in critical scenarios.
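Minute resolution amounts to nothing more than a finer time bucket. As a rough sketch, assuming each matching post carries a timestamp (the function name `mentions_per_minute` and the sample timestamps are invented):

```python
from collections import Counter
from datetime import datetime

def mentions_per_minute(timestamps):
    """Count timestamps per minute by truncating seconds and microseconds."""
    return Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)

# Invented timestamps of posts that matched a symptom term.
matched = [
    datetime(2014, 9, 1, 8, 15, 5),
    datetime(2014, 9, 1, 8, 15, 59),
    datetime(2014, 9, 1, 8, 16, 0),
]

for minute, count in sorted(mentions_per_minute(matched).items()):
    print(minute.isoformat(), count)
```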
We believe listening to what everyone says is important. In our case, we use our technology to read what everyone writes. In many cases, this is a viable proxy for the voice of the population, not least when it comes to syndromic surveillance.