A spatial hierarchical model for integrating and bias-correcting data from passive and active disease surveillance systems.


Disease surveillance data are important for monitoring disease burden and occurrence, and for informing a wide range of efforts to improve population health. Surveillance for infectious diseases may be conducted passively, relying on reports from healthcare facilities, or actively, involving surveys of the population at risk. Passive surveillance typically provides wide spatial coverage, but is subject to biases arising from differences in care-seeking behavior, diagnostic practices, and under-reporting. Active surveillance minimizes these biases, but is typically constrained to small areas and subpopulations due to resource limitations. Methods based on linkage of individual records between passive and active surveillance datasets provide a means to estimate and correct for the biases of each system, leveraging the size and coverage of passive surveillance and the data quality of active surveillance. We develop a spatial Bayesian hierarchical model that bias-corrects data from both systems to yield improved estimates of disease measures after adjusting for under-ascertainment. We apply the framework to data from a passive and an active surveillance system for pulmonary tuberculosis (PTB) in Sichuan, China, and estimate the average sensitivity of the active surveillance system at 70% (95% credible interval [CI]: 62%, 78%) and that of the passive system at 30% (95% CI: 24%, 35%). Passive surveillance sensitivity exhibited considerable spatial variability and was positively associated with a site's gross domestic product per capita. Bias-corrected estimates of county-level PTB prevalence in the province in 2010 identified regions in the southeast as having the highest PTB burden, yielding different geographic priorities than previous reports.
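The idea of using linked records to estimate each system's sensitivity can be illustrated with a classical two-source capture-recapture calculation. The sketch below is not the spatial Bayesian hierarchical model described above; it is a much simpler, non-spatial stand-in that assumes the two systems detect cases independently and that record linkage is error-free. The function name and all counts are hypothetical and chosen only for illustration.

```python
def two_source_estimates(n_active, n_passive, n_both):
    """Two-source capture-recapture estimates (illustrative only).

    n_active  : cases detected by active surveillance
    n_passive : cases detected by passive surveillance
    n_both    : cases linked across both systems

    Assumes independent detection by the two systems and perfect linkage.
    """
    if n_both <= 0:
        raise ValueError("at least one linked case is required")
    if n_both > min(n_active, n_passive):
        raise ValueError("linked cases cannot exceed either system's count")

    # Share of actively detected cases that the passive system also captured.
    sens_passive = n_both / n_active
    # Share of passively reported cases that the active system also captured.
    sens_active = n_both / n_passive
    # Lincoln-Petersen estimate of the total number of cases,
    # including those missed by both systems.
    total_cases = n_active * n_passive / n_both
    return sens_active, sens_passive, total_cases


# Hypothetical counts for a single site (not taken from the study).
sens_active, sens_passive, total_cases = two_source_estimates(
    n_active=80, n_passive=50, n_both=40
)
print(f"active sensitivity:  {sens_active:.2f}")
print(f"passive sensitivity: {sens_passive:.2f}")
print(f"estimated total cases: {total_cases:.0f}")
```

A hierarchical model of the kind described in the abstract would go further than this, for example by letting sensitivity vary across sites and by borrowing strength between neighboring areas, which this site-by-site calculation does not attempt.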
