Estimating COVID-19 Hospitalizations in the United States with Surveillance Data Using a Bayesian Hierarchical Model: A Modeling Study.


Our novel approach to estimating hospitalizations with COVID-19 has the potential to provide sustainable estimates for monitoring COVID-19 burden, and it offers a flexible framework for leveraging surveillance data.

We aimed to develop a method that uses surveillance data to provide a long-term solution for estimating monthly rates of hospitalization with COVID-19.

We estimated 3,583,100 (90% credible interval: 3,250,500 - 3,945,400) hospitalizations with COVID-19 in the United States from May 2020 through April 2021, for a cumulative incidence of 1,093.9 (992.4 - 1,204.6) hospitalizations per 100,000 population. Cumulative incidence ranged from 359 to 1,856 per 100,000 across states. Cumulative incidence was highest among people aged ≥85 years (5,575.6; 5,066.4 - 6,133.7). The monthly hospitalization rate was highest in December 2020 (183.7; 154.3 - 217.4). Our monthly estimates by state showed variation in the magnitude, number, and timing of peaks.
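The relationship between the estimated count and the cumulative incidence above is a simple rate calculation. The following is a minimal sketch, not the authors' code: the function name `rate_per_100k` is hypothetical, and the population denominator is back-calculated from the two reported point estimates because the abstract does not state it.

```python
def rate_per_100k(events: float, population: float) -> float:
    """Return events per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * 100_000


# Point estimates reported in the abstract (May 2020 through April 2021).
hospitalizations = 3_583_100
reported_rate = 1_093.9

# Assumption: the denominator is inferred from the reported count and rate;
# the abstract does not give the population figure that was actually used.
implied_population = hospitalizations / reported_rate * 100_000

# Applying the same denominator to the interval bounds on the count gives
# rates close to the reported interval, up to rounding of the inputs.
lower_rate = rate_per_100k(3_250_500, implied_population)
upper_rate = rate_per_100k(3_945_400, implied_population)

print(f"implied population: {implied_population:,.0f}")
print(f"interval on the rate scale: {lower_rate:.1f} - {upper_rate:.1f}")
```

Because the published counts and rates are rounded, the recomputed bounds can differ from the reported ones in the last decimal place.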

In the United States, COVID-19 is a nationally notifiable disease, meaning cases and hospitalizations are reported to the CDC by states. Identifying and reporting every case from every facility in the United States may not be feasible in the long term. Creating sustainable methods for estimating burden of COVID-19 from established sentinel surveillance systems is becoming more important.

We estimated monthly hospitalization rates for COVID-19 from May 2020 through April 2021 for the 50 states using surveillance data from the COVID-19-Associated Hospitalization Surveillance Network (COVID-NET) and a Bayesian hierarchical model for extrapolation. Hospitalization rates are calculated from patients hospitalized with a laboratory-confirmed positive SARS-CoV-2 test during hospitalization or within the 14 days before admission. We fit a separate model for each of six age groups (0-17, 18-49, 50-64, 65-74, 75-84, and ≥85 years). We identified covariates from multiple data sources that varied by age, state, and/or month, and performed covariate selection for each age group using two methods: the Least Absolute Shrinkage and Selection Operator (LASSO) and spike-and-slab selection. We validated our method by checking the sensitivity of model estimates to covariate selection and model extrapolation, and by comparing our results with external data.
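The case definition and age stratification described above can be illustrated in a few lines. This is a sketch under stated assumptions, not COVID-NET's or the authors' implementation: the function names, the treatment of a missing discharge date as an ongoing stay, and the inclusive 14-day boundary are all choices made here for illustration.

```python
from bisect import bisect_right
from datetime import date, timedelta
from typing import Optional

# Lower bounds of the second through sixth age groups named in the text.
AGE_CUTS = [18, 50, 65, 75, 85]
AGE_LABELS = ["0-17", "18-49", "50-64", "65-74", "75-84", "≥85"]

# Assumption: the 14-day lookback window is inclusive of day 14.
LOOKBACK = timedelta(days=14)


def age_group(age_years: int) -> str:
    """Map an age in whole years to one of the six age groups."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    return AGE_LABELS[bisect_right(AGE_CUTS, age_years)]


def meets_case_definition(
    test_date: date,
    admission_date: date,
    discharge_date: Optional[date] = None,
) -> bool:
    """Check whether a positive test falls during the hospitalization
    or within the 14 days before admission.

    Assumption: a missing discharge date means the stay is ongoing.
    """
    if test_date < admission_date - LOOKBACK:
        return False
    if discharge_date is not None and test_date > discharge_date:
        return False
    return True


# Hypothetical patient: tested 9 days before a 10-day stay, aged 72.
print(age_group(72))
print(meets_case_definition(date(2020, 12, 1), date(2020, 12, 10), date(2020, 12, 20)))
```

Using `bisect_right` on the lower bounds keeps the boundary ages (18, 50, 65, 75, 85) in the upper group of each pair, which matches the groups as listed.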

MIDAS Network Members

Carrie Reed

Team Lead, Applied Research and Modeling
Centers for Disease Control and Prevention
