An evolutionary portrait of the progenitor SARS-CoV-2 and its dominant offshoots in the COVID-19 pandemic


Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was identified as the cause of COVID-19 soon after the earliest reports of the disease. Knowledge of the contemporary evolution of SARS-CoV-2 is urgently needed, not only for a retrospective on how, when, and why COVID-19 emerged and spread, but also for creating remedies through science, technology, medicine, and public policy. Global sequencing of thousands of genomes has revealed many common genetic variants, which are the key to unraveling the early evolutionary history of SARS-CoV-2 and tracking its global spread over time. However, our knowledge of fundamental events in the evolution and spread of this coronavirus remains grossly incomplete and highly uncertain. Here, we present the heretofore cryptic mutational history, phylogeny, and dynamics of SARS-CoV-2 from an analysis of tens of thousands of high-quality genomes. The reconstructed mutational progression is highly concordant with the timing of coronavirus sampling dates. It predicts the progenitor genome, whose earliest offspring, carrying no non-synonymous mutations, were still spreading worldwide months after the report of COVID-19. Over time, mutations gave rise to seven major lineages that spread episodically, some of which arose in Europe and North America after the genesis of the ancestral lineages in China. Mutational barcoding establishes that North American coronaviruses harbor genome signatures very different from those of coronaviruses prevalent in Europe and Asia, which have converged over time. These spatiotemporal patterns continue to evolve as the pandemic progresses and can be viewed live online.
