For pathogens known to transmit across host species, strategic investment in disease control requires knowledge of where and when spillover transmission is likely. One approach to estimating spillover risk is to directly correlate observed spillover events with covariates. An alternative is to mechanistically combine information on host density, distribution and pathogen prevalence to predict where and when spillover events are expected to occur. We use several case studies at the wildlife-livestock disease interface to highlight the challenges of, and potential solutions to, estimating spatio-temporal variation in spillover risk. Datasets on multiple host species often do not align in space, time or resolution, and may have no estimates of observation error. Linking these datasets requires that they be related to a common spatial and temporal resolution, and appropriately propagating errors into predictions can be difficult. Hierarchical models are one potential solution, but for fine-resolution predictions at broad spatial scales, many models become computationally challenging. Despite these limitations, the confrontation of mechanistic predictions with observed events is an important avenue for developing a better understanding of pathogen spillover. Systems where data have been collected at all levels in the spillover process are rare or non-existent, and developing them requires investment and sustained effort across disciplines. This article is part of the theme issue 'Dynamic and integrative approaches to understanding pathogen spillover'.