Google hopes to cure its 'flu sneezes

Big data needs real data

Google has responded to criticism of its Flu Trends service, and decided that as well as big data, it needs real data.

As noted earlier this year, search data alone overpredicts influenza outbreaks, because there are so many symptoms that can lead people to ask the all-seeing eye “do I have the 'flu?”

The response, announced by the Chocolate Factory in this “even better!” blog post, is to use data from the US Centers for Disease Control and Prevention (CDC) as a training input to the models.

As the 'flu season progresses, Google senior software engineer Christian Stefansen writes, the CDC data will provide in-course corrections to deal with the over-prediction problem. The retraining, he writes, reflects “the best performing methods in the literature”.

The value of making the data more adaptive in this way is discussed in this open paper published by The Royal Society. Researchers Tobias Preis and Helen Susannah Moat from the University of Warwick's business school say “when using Google Flu Trends data in combination with historic flu levels, the mean absolute error (MAE) of in-sample ‘nowcasts’ can be significantly reduced by 14.4 per cent, compared with a baseline model that uses historic data on flu levels only”.
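For readers who want to see how a figure like that is arrived at, here is a minimal sketch of the arithmetic: compute the MAE of each model's nowcasts against observed flu levels, then express the improvement as a percentage of the baseline error. The function names and all the numbers are invented for illustration; they are not the researchers' code or data, and they will not reproduce the 14.4 per cent result.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired observations."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_mae, combined_mae):
    """Percentage by which the combined model's MAE undercuts the baseline's."""
    if baseline_mae == 0:
        raise ValueError("baseline MAE is zero; reduction is undefined")
    return (baseline_mae - combined_mae) / baseline_mae * 100.0


# Hypothetical weekly flu levels (arbitrary units), made up for this example.
observed = [2.0, 3.0, 5.0, 4.0]
# Hypothetical nowcasts from a model using historic flu levels only.
baseline_nowcast = [1.0, 4.0, 3.0, 5.0]
# Hypothetical nowcasts from a model that also uses search data.
combined_nowcast = [1.5, 3.5, 4.0, 4.5]

baseline_mae = mean_absolute_error(observed, baseline_nowcast)
combined_mae = mean_absolute_error(observed, combined_nowcast)
reduction = relative_reduction(baseline_mae, combined_mae)

print(f"baseline MAE: {baseline_mae}")
print(f"combined MAE: {combined_mae}")
print(f"reduction: {reduction} per cent")
```

With these toy numbers the combined model halves the error. The paper's comparison works the same way in principle: the same error measure, applied to two models, reported as a relative improvement.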

Google still believes in the value of the Flu Trends service, since it's geographically fine-grained compared to the historical CDC data, and because there are lots of countries that don't have official influenza tracking. ®
