Genomic Diversity of SARS-CoV-2 During Early Introduction into the United States National Capital Region.


The early COVID-19 pandemic has been characterized by rapid global spread. In the United States National Capital Region, over 2,000 cases were reported within three weeks of the disease's first detection there in March 2020. We aimed to use genomic sequencing to understand the initial spread of SARS-CoV-2, the virus that causes COVID-19, in the region. By linking viral genetic information to disease phenotype, we also aimed to assess whether viral genotype was associated with case severity or transmissibility.

We performed whole genome sequencing of clinical SARS-CoV-2 samples collected in March 2020 by the Johns Hopkins Health System, building on methods developed by the ARTIC network. We analyzed these regional SARS-CoV-2 genomes alongside detailed clinical metadata and the global phylogeny to understand early establishment of the virus within the region.

We analyzed 620 samples from the Johns Hopkins Health System collected between March 11 and 31, 2020, representing 37.3% of the total cases in Maryland during this period. We selected 143 of these samples for sequencing, generating 114 complete viral genomes. These genomes belonged to all five major Nextstrain-defined clades, suggesting multiple introductions into the region and underscoring the diversity of the regional epidemic. We also found that clinically severe cases had genomes belonging to all of these clades.
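The sample counts above imply a few proportions that the text does not state directly. The following is a minimal back-of-envelope sketch in Python, using only the figures reported above; the variable names and the `percent` helper are illustrative, and the implied statewide case count is an approximation derived from the rounded 37.3% figure, not a number reported by the authors.

```python
# Counts reported in the text above.
samples_analyzed = 620        # samples analyzed, March 11 to 31, 2020
samples_sequenced = 143       # samples selected for sequencing
complete_genomes = 114        # complete viral genomes obtained
share_of_state_cases = 0.373  # the 620 samples as a fraction of Maryland cases


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Fraction of analyzed samples that were selected for sequencing.
sequenced_pct = percent(samples_sequenced, samples_analyzed)

# Fraction of sequenced samples that yielded a complete genome.
complete_pct = percent(complete_genomes, samples_sequenced)

# Approximate Maryland case count implied by the 37.3% figure
# (an estimate only, since 37.3% is itself rounded).
implied_state_cases = round(samples_analyzed / share_of_state_cases)

print(f"Sequenced: {sequenced_pct}% of analyzed samples")
print(f"Complete genomes: {complete_pct}% of sequenced samples")
print(f"Implied Maryland cases in the period: about {implied_state_cases}")
```

Under these figures, roughly 23% of the analyzed samples were sequenced and roughly 80% of the sequenced samples yielded complete genomes.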

We established a pipeline for SARS-CoV-2 sequencing within the Johns Hopkins Health System, which enabled us to capture the significant viral diversity present in the region as early as March 2020. Efforts to control local spread of the virus were likely confounded by the number of introductions into the region early in the epidemic and by the interconnectedness of the region as a whole.
