Xianding Deng, Wei Gu, Scot Federman, Louis du Plessis, Oliver G. Pybus, Nuno Faria, Candace Wang, Guixia Yu, Chao-Yang Pan, Hugo Guevara, Alicia Sotomayor-Gonzalez, Kelsey Zorn, Allan Gopez, Venice Servellita, Elaine Hsu, Steve Miller, Trevor Bedford, Alexander L. Greninger, Pavitra Roychoudhury, Lea M. Starita, Michael Famulare, Helen Y. Chu, Jay Shendure, Keith R. Jerome, Catie Anderson, Karthik Gangavarapu, Mark Zeller, Emily Spencer, Kristian G. Andersen, Duncan MacCannell, Clinton R. Paden, Yan Li, Jing Zhang, Suxiang Tong, Gregory Armstrong, Scott Morrow, Matthew Willis, Bela T. Matyas, Sundari Mase, Olivia Kasirye, Maggie Park, Curtis Chan, Alexander T. Yu, Shua J. Chai, Elsa Villarino, Brandon Bonin, Debra A. Wadford, and Charles Y. Chiu
This repository contains the data files, scripts, and workflows needed to reproduce the phylogenetic analyses and figures presented in https://doi.org/10.1101/2020.03.27.20044925. Some scripts may need adjusting to match your local setup.
Note that because of the GISAID terms of use, genomic sequences cannot be shared in this repository. Instead, we provide the GISAID accession numbers and a table of acknowledgements.
The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, has spread globally, resulting in over 3 million reported cases worldwide as of April 27th, 2020. Here we investigate the genetic diversity and genomic epidemiology of SARS-CoV-2 in Northern California using samples from returning travelers, cruise ship passengers, and cases of community transmission with unclear infection sources. Virus genomes were recovered from 36 patients diagnosed with COVID-19 from February 3rd through March 15th. Phylogenetic analyses revealed at least XXX different SARS-CoV-2 lineages, suggesting multiple independent introductions of the virus into the state. Virus genomes from passengers on two consecutive excursions of the Grand Princess cruise ship clustered with those from an established epidemic in Washington State, including the WA1 genome representing the first reported case in the United States on January 19th. We also detected evidence for presumptive transmission of SARS-CoV-2 lineages between communities. Thus, the cryptic transmission of SARS-CoV-2 in Northern California until at least mid-March was characterized by multiple transmission chains originating via distinct introductions from international and interstate travel, rather than widespread community transmission of a single predominant lineage. These findings support the universal implementation of widespread testing, contact tracing, social distancing, and travel restrictions to slow SARS-CoV-2 spread in California and other states in the USA.