Using the Internet to Forecast the Flu

John G. Bartlett, MD


July 16, 2009

Early detection of, and a rapid response to, seasonal and pandemic influenza can improve outcomes and slow the spread of both. This Viewpoint discusses a study that sought to determine whether monitoring health-seeking behavior on the Internet can be used to predict epidemics.

Detecting Influenza Epidemics Using Search Engine Query Data

Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L
Nature. 2009;457:1012-1014

Article Summary

The authors aggregated historical logs of online Web search queries submitted between 2003 and 2008 and computed weekly counts for the 50 million most common search queries in the United States. The goal was to develop a simple model that would estimate the probability that a random physician visit in a specific region of the United States is related to an influenza-like illness. The historical data used to fit the model came from the CDC US Influenza Sentinel Provider Surveillance Network.

The search query topics that correlated most strongly with the CDC influenza epidemiology data were (in rank order): influenza complications, cold or flu remedy, influenza symptoms, and flu. The resulting formula showed a good match with the CDC epidemic curve for each season. Using these indicators, the formula then predicted the flu epidemic for 2007-2008 as well as influenza activity in each region of the United States. Most important, the reporting lag was about 1 day, compared with a reporting delay of about 2 weeks for the CDC data.
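The approach summarized above can be illustrated with a minimal sketch. The cited paper describes a linear fit on the log-odds scale, logit(P) = b0 + b1 x logit(Q), where P is the fraction of physician visits related to influenza-like illness and Q is the fraction of searches that are flu-related. Everything else here is an assumption made for illustration: the weekly numbers are invented, the function and variable names are hypothetical, and the choice to combine the top two queries is arbitrary. It is not the authors' code.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def fit_logit_line(query_fraction, ili_fraction):
    """Ordinary least squares fit of logit(P) = b0 + b1 * logit(Q)."""
    xs = [logit(q) for q in query_fraction]
    ys = [logit(p) for p in ili_fraction]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return intercept, slope


def estimate_ili(query_fraction, intercept, slope):
    """Estimated ILI visit fraction for one week's query fraction."""
    return inv_logit(intercept + slope * logit(query_fraction))


# Invented weekly data: fraction of all searches matching each query,
# and the fraction of physician visits related to influenza-like illness.
weekly_query_fraction = {
    "flu symptoms": [0.0010, 0.0014, 0.0025, 0.0040, 0.0031, 0.0016],
    "flu remedy": [0.0008, 0.0011, 0.0021, 0.0030, 0.0026, 0.0012],
    "high school basketball": [0.0030, 0.0029, 0.0031, 0.0030, 0.0028, 0.0032],
}
weekly_ili_fraction = [0.012, 0.017, 0.030, 0.048, 0.036, 0.019]

ili_logits = [logit(p) for p in weekly_ili_fraction]


def score(name):
    """Correlation between one query's weekly log-odds and the ILI log-odds."""
    return pearson([logit(q) for q in weekly_query_fraction[name]], ili_logits)


# Rank candidate queries by how well each one tracks the surveillance data.
ranked = sorted(weekly_query_fraction, key=score, reverse=True)

# Combine the best-scoring queries into a single weekly fraction and fit.
top = ranked[:2]
combined = [sum(week) for week in zip(*(weekly_query_fraction[q] for q in top))]
intercept, slope = fit_logit_line(combined, weekly_ili_fraction)

# Estimate ILI activity for a new week from its query fraction alone.
new_week_estimate = estimate_ili(0.0050, intercept, slope)
```

In this toy data set the unrelated query ranks last and is excluded from the fit, which mirrors the idea of letting the correlation with surveillance data, rather than a hand-picked keyword list, decide which queries to use.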

The authors concluded that this approach may make it possible to use search queries to detect influenza epidemics in areas with large populations of Web search users.


Viewpoint

This was a collaborative study between the CDC and Google, including Larry Brilliant of Google.org (the philanthropic branch of Google). One can well imagine the potential utility of this technology in multiple areas of medicine, including epidemics, natural disasters, and bioterrorism.