Models built from Google search queries were able to adequately estimate malaria activity trends in Thailand, from 2005-2010, according to official malaria case counts reported by WHO. While presenting their own limitations, these search queries may be valid real-time indicators of malaria incidence in the population, as correlations were on par with those of related studies for other infectious diseases. Additionally, this methodology provides a cost-effective description of malaria prevalence that can act as a complement to traditional public health surveillance. This and future studies will continue to identify ways to leverage web-based data to improve public health.
Each model captured the bulk of the variability in officially reported malaria incidence. Correlation in the validation set ranged from 0.75 to 0.92 and AIC values ranged from 808 to 586 for the models. While models using malaria-related and general health terms were successful, one model using only microscopy-related terms obtained equally high correlations to malaria case data trends. The model built strictly of queries provided by Thai physicians was the only one that consistently captured the well-documented second seasonal malaria peak in Thailand.
Internet search query trends have been shown to correlate with incidence trends for select infectious diseases and countries. Herein, the first use of Google search queries for malaria surveillance is investigated. The research focuses on Thailand where real-time malaria surveillance is crucial as malaria is re-emerging and developing resistance to pharmaceuticals in the region.
Official Thai malaria case data was acquired from the World Health Organization (WHO) from 2005 to 2009. Using Google correlate, an openly available online tool, and by surveying Thai physicians, search queries potentially related to malaria prevalence were identified. Four linear regression models were built from different sub-sets of malaria-related queries to be used in future predictions. The models' accuracies were evaluated by their ability to predict the malaria outbreak in 2009, their correlation with the entire available malaria case data, and by Akaike information criterion (AIC).