bioRxiv preprint

[...] impacted many aspects of human society. Here, we analyzed genetic variation of SARS-CoV-2 and related coronaviruses and found evidence of intergenomic recombination. After correction for mutational bias, analysis of 137 SARS-CoV-2 genomes available as of 2/23/2020 revealed an excess of low-frequency mutations at both synonymous and nonsynonymous sites, which is consistent with a recent origin of the virus. In contrast to the adaptive evolution previously reported for SARS-CoV during its brief epidemic in 2003, our analysis of SARS-CoV-2 genomes shows signs of relaxation of selection. The sequence similarity of the spike receptor binding domain between SARS-CoV-2 and a sequence from pangolin is probably due to an ancient intergenomic introgression. Therefore, SARS-CoV-2 might have circulated cryptically within humans for years before being recently noticed. Data from the early outbreak and hospital archives are needed to trace its evolutionary path and reveal the critical steps required for effective spreading. Two mutations, 84S in the orf8 protein and 251V in the orf3 protein, occurred coincidentally with human intervention. 84S first appeared on 1/5/2020 and reached a plateau around 1/23/2020, the date of the Wuhan lockdown. 251V emerged on 1/21/2020 and rapidly increased in frequency. Thus, the roles of these mutations in infectivity need to be elucidated. The genetic diversity of SARS-CoV-2 collected from China was twice as high as that of samples from the rest of the world. In addition, in network analysis, haplotypes collected from Wuhan city occupied interior positions and had more mutational connections. Both observations are consistent with the outbreak of COVID-19 having originated in China.
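The two population-genetic quantities referred to above, genetic diversity and an excess of low-frequency mutations, can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: it omits the correction for mutational bias, treats every differing character (including gaps or ambiguous bases) as a mutation, and uses Tajima's D as one standard summary of the frequency spectrum, which is an assumption about the kind of test involved. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pairwise_differences(a, b):
    """Count the sites at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def diversity_stats(seqs):
    """Return (k, pi, S) for a list of equal-length aligned sequences.

    k  : mean number of pairwise differences
    pi : nucleotide diversity, i.e. k divided by the alignment length
    S  : number of segregating (polymorphic) sites
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    k = sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    return k, k / length, S


def tajimas_d(seqs):
    """Tajima's D; negative values indicate an excess of rare variants."""
    n = len(seqs)
    if n < 4:
        raise ValueError("the variance term vanishes for fewer than 4 sequences")
    k, _, S = diversity_stats(seqs)
    if S == 0:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# Hypothetical toy alignment: two identical genomes plus two singletons.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGT"]
k, pi, S = diversity_stats(toy)
print(k, pi, S, tajimas_d(toy))
```

Applying `diversity_stats` separately to two groups of genomes (for example, by sampling region) gives the kind of diversity comparison described above, although unequal sample sizes and sampling dates would need to be taken into account in a real analysis.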
SUMMARY
In contrast to the adaptive evolution previously reported for SARS-CoV during its brief epidemic, our analysis of SARS-CoV-2 genomes shows signs of relaxation of selection. The sequence similarity of the spike receptor binding domain between SARS-CoV-2 and a sequence from pangolin is probably due to an ancient intergenomic introgression. Therefore, SARS-CoV-2 might have circulated cryptically within humans for years before being recently noticed. Data from the early outbreak and hospital archives are needed to trace its evolutionary path and reveal the critical steps required for effective spreading. Two mutations, 84S in the orf8 protein and 251V in the orf3 protein, occurred coincidentally with human intervention. 84S first appeared on 1/5/2020 and reached a plateau around 1/23/2020, the date of the Wuhan lockdown. 251V emerged on 1/21/2020 and rapidly increased in frequency. Thus, the roles of these mutations in infectivity need to be elucidated.

A newly emerging coronavirus was detected in patients during an outbreak of respiratory illnesses starting in mid-December of 2019 in Wuhan, the capital of Hubei Province, China [1, 2, 3]. Due to the similarity of its symptoms to those induced by the severe acute respiratory syndro...