Using simulated infectious disease outbreaks to inform site selection and sample size for individually randomized vaccine trials during an ongoing epidemic.


We illustrate the added value of models using the motivating example of Zika vaccine trial planning during the 2015-2017 Zika epidemic. We used a stochastic, spatially resolved transmission model (the Global Epidemic and Mobility model) to simulate epidemics and site-level incidence at 100 high-risk sites in the Americas. We considered several strategies for prioritizing sites (average site-level incidence of infection across epidemics, median incidence, probability of exceeding 1% incidence), selecting the number of sites, and allocating sample size across sites (equal enrollment, proportional to average incidence, proportional to rank). To evaluate each design, we stochastically simulated trials in each hypothetical epidemic by drawing observed cases from the simulated site-level incidence.
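The design steps described above (rank sites, allocate sample size, simulate trials in each epidemic) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the 1:1 randomization, the binomial case draws, the assumed vaccine efficacy `ve`, and the use of "probability of reaching a target number of events" as a stand-in for power are all assumptions introduced here. The input `incidence` is a hypothetical array of simulated cumulative incidence with shape (epidemics, sites).

```python
import numpy as np

def rank_sites(incidence, strategy="mean", threshold=0.01):
    """Order site indices from highest to lowest priority.

    incidence: array (n_epidemics, n_sites) of simulated cumulative
    incidence (proportions). Strategy names are illustrative.
    """
    if strategy == "mean":
        score = incidence.mean(axis=0)
    elif strategy == "median":
        score = np.median(incidence, axis=0)
    elif strategy == "exceed":  # share of epidemics above the threshold
        score = (incidence > threshold).mean(axis=0)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return np.argsort(-score, kind="stable")

def allocate(total_n, site_ids, incidence, scheme="equal"):
    """Split total_n participants across the chosen sites (rounded down)."""
    k = len(site_ids)
    if scheme == "equal":
        w = np.ones(k)
    elif scheme == "incidence":  # proportional to average incidence
        w = incidence[:, site_ids].mean(axis=0)
    elif scheme == "rank":  # first-listed site gets weight k, last gets 1
        w = np.arange(k, 0, -1, dtype=float)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    if w.sum() == 0:  # no projected transmission anywhere: fall back to equal
        w = np.ones(k)
    return np.floor(total_n * w / w.sum()).astype(int)

def simulate_events(incidence, site_ids, n_per_site, ve, rng):
    """Draw total observed cases for one trial per simulated epidemic.

    Assumes 1:1 randomization and binomial case counts; the vaccine arm's
    risk is scaled by (1 - ve).
    """
    p = incidence[:, site_ids]              # (n_epidemics, k)
    n_arm = np.asarray(n_per_site) // 2     # participants per arm per site
    placebo = rng.binomial(n_arm, p)
    vaccine = rng.binomial(n_arm, p * (1.0 - ve))
    return (placebo + vaccine).sum(axis=1)  # one total per epidemic

def prob_enough_events(events, target):
    """Share of simulated trials that reach the target number of events."""
    return float(np.mean(np.asarray(events) >= target))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic placeholder data, NOT output of any transmission model.
    incidence = rng.beta(0.5, 20.0, size=(500, 100))
    top = rank_sites(incidence, "mean")[:10]
    n = allocate(5000, top, incidence, "equal")
    events = simulate_events(incidence, top, n, ve=0.5, rng=rng)
    print(prob_enough_events(events, target=30))
```

Swapping the `strategy` and `scheme` arguments gives a simple way to compare designs on the same set of simulated epidemics.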

Mathematical and statistical models may assist in designing successful vaccine trials by capturing uncertainty and correlation in future transmission. Although many other factors, such as logistical feasibility, affect site selection, models can help investigators optimize which sites are selected, how many sites participate, and how large each site is. Although our study focused on trial design for an emerging arbovirus, a similar approach can be applied to any infectious disease for which an appropriate model is available.

Novel strategies are needed to make vaccine efficacy trials more robust to the uncertain epidemiology of infectious disease outbreaks, such as those caused by arboviruses like Zika. Spatially resolved mathematical and statistical models can help investigators identify sites at highest risk of future transmission and prioritize these for inclusion in trials. Models can also characterize uncertainty in whether transmission will occur at a site, and how nearby or connected sites may have correlated outcomes. A framework is needed for how trials can use models to address key design questions, including how to prioritize sites, the optimal number of sites, and how to allocate participants across sites.

When overall trial size is constrained, the optimal number of sites balances prioritizing the highest-risk sites against including enough sites to reduce the chance of observing too few endpoints. The optimal number of sites remained roughly constant regardless of the targeted number of events, although the sample size must increase to achieve the desired power. Though different ranking strategies returned different site orders, they performed similarly with respect to trial power. Instead of enrolling participants equally from each site, investigators can allocate participants in proportion to projected incidence, though this did not provide an advantage in our example because the top sites had similar risk profiles. Sites from the same geographic region may have similar outcomes, so optimal combinations of sites may be geographically dispersed, even when these are not the highest-ranked sites.
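The trade-off in the number of sites under a fixed total sample size can be explored with a simple sweep. The sketch below is an assumption-laden illustration rather than the study's method: it ranks sites by mean incidence, enrolls equally, randomizes 1:1, draws binomial case counts, and reports the share of simulated epidemics in which a hypothetical event target is reached. All names and default values are introduced here for illustration.

```python
import numpy as np

def prob_target_events(incidence, k, total_n, target, ve=0.5, seed=0):
    """Share of simulated epidemics in which a trial at the top-k sites
    (ranked by mean incidence, equal enrollment) observes >= target cases.

    incidence: array (n_epidemics, n_sites) of simulated incidence.
    ve is an assumed vaccine efficacy used only to thin vaccine-arm cases.
    """
    rng = np.random.default_rng(seed)
    top = np.argsort(-incidence.mean(axis=0), kind="stable")[:k]
    n_arm = (total_n // k) // 2          # participants per arm at each site
    p = incidence[:, top]                # (n_epidemics, k)
    cases = rng.binomial(n_arm, p) + rng.binomial(n_arm, p * (1.0 - ve))
    return float(np.mean(cases.sum(axis=1) >= target))

def sweep_sites(incidence, total_n, target, max_sites):
    """Evaluate every site count from 1 to max_sites at fixed total_n."""
    return {
        k: prob_target_events(incidence, k, total_n, target)
        for k in range(1, max_sites + 1)
    }
```

Because each row of `incidence` is one simulated epidemic, sites whose outcomes rise and fall together across rows contribute less diversification than geographically dispersed sites, which is the effect the sweep is meant to expose.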
