Abstract
The novel human coronavirus disease COVID-19 has become the fifth documented pandemic since the 1918 flu pandemic. COVID-19 was first reported in Wuhan, China, and subsequently spread worldwide. The coronavirus was officially named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by the International Committee on Taxonomy of Viruses based on phylogenetic analysis. SARS-CoV-2 is believed to be a spillover of an animal coronavirus that later acquired the ability of human-to-human transmission. Because the virus is highly contagious, it spreads rapidly and continuously evolves in the human population. In this review article, we discuss the basic properties, potential origin, and evolution of the novel human coronavirus. These factors may be critical for studies of pathogenicity, antiviral designs, and vaccine development against the virus.
Pandemic
Currently, people all over the world are affected by coronavirus disease 2019 (COVID-19), the fifth pandemic since the 1918 flu pandemic. The first report and subsequent outbreak can be traced to a cluster of novel human pneumonia cases in Wuhan City, China, in late December 2019. The earliest date of symptom onset was 1 December 2019. Based on their symptoms, including fever, malaise, dry cough, and dyspnea, these patients were diagnosed with viral pneumonia [1,2]. Initially, the disease was called Wuhan pneumonia by the press because of the location and the pneumonia symptoms. Whole-genome sequencing results showed that the causative agent is a novel coronavirus, making this virus the seventh member of the coronavirus family to infect humans [3]. The World Health Organization (WHO) temporarily termed the new virus 2019 novel coronavirus (2019-nCoV) on 12 January 2020 and then officially named the infectious disease coronavirus disease 2019 (COVID-19) on 12 February 2020. Later, the International Committee on Taxonomy of Viruses (ICTV) officially designated the virus as SARS-CoV-2 based on phylogeny, taxonomy and established practice [4]. Subsequently, clinical data showed human-to-human transmission of COVID-19 within Hong Kong [5]. Since COVID-19 initially emerged in China, the virus has evolved for four months and rapidly spread to other countries worldwide as a global threat. On 11 March 2020, the WHO finally made the assessment that COVID-19 can be characterized as a pandemic, following the 1918 Spanish flu (H1N1), 1957 Asian flu (H2N2), 1968 Hong Kong flu (H3N2), and 2009 pandemic flu (H1N1), which caused an estimated 50 million, 1.5 million, 1 million, and 300,000 human deaths, respectively [6–9] [Fig. 1].
Fig. 1. A timeline of five pandemics since 1918 and the globally circulating viruses afterward.
Virology – morphology, gene structure and replication
SARS-CoV-2 is an enveloped, spherical particle approximately 120 nm in diameter containing a positive-sense single-stranded RNA genome. It belongs to the subfamily Coronavirinae, family Coronaviridae, and order Nidovirales. The RNA genome of SARS-CoV-2 contains a 5′ methyl-guanosine cap, a poly(A) tail, and 29,903 nucleotides according to the WH-Human 1 coronavirus (WHCV) sequence [3,10]. It is classified as a beta-coronavirus (βCoV) [lineage B] and is the seventh coronavirus to infect humans, following 2 αCoVs (HCoV-229E and HCoV-NL63) and 4 βCoVs (HCoV-OC43 [lineage A], HCoV-HKU1 [lineage A], severe acute respiratory syndrome coronavirus SARS-CoV [lineage B] and Middle East respiratory syndrome coronavirus MERS-CoV [lineage C]) [11–14]. Evolutionary analyses have shown that bats and rodents are the gene sources of most αCoVs and βCoVs, whereas avian species are the gene sources of most δCoVs and γCoVs. The human coronavirus (HCoV) strains HCoV-NL63, HCoV-229E, HCoV-HKU1, and HCoV-OC43 usually cause mild, self-limiting upper respiratory tract infections, such as the common cold [15,16]. However, SARS-CoV, MERS-CoV, and SARS-CoV-2 can cause severe acute respiratory syndrome and result in life-threatening disease [17–19] [Table 1].
SARS-CoV-2 transcribes nine subgenomic RNAs, and its genome comprises a 5′ untranslated region including a 5′ leader sequence; an open reading frame (ORF) 1a/ab encoding nonstructural proteins (nsps) for replication; genes for four structural proteins, namely spike (S), envelope (E), membrane (M) and nucleocapsid (N); genes for several accessory proteins such as ORF 3a, 6, 7a/b, and 8; and a 3′ untranslated region. The replicase polyprotein pp1a/ab is proteolytically cleaved into 16 putative nsps, including nsp3 (papain-like protease), nsp5 (3C-like protease), nsp12 (RNA-dependent RNA polymerase [RdRp]), nsp13 (helicase), and other nsps [10,13,20]. The spike glycoprotein of SARS-CoV-2 binds to angiotensin-converting enzyme 2 (ACE2) of humans, Chinese horseshoe bats, and civets for cell entry, which also depends on S protein priming by the serine protease TMPRSS2. A similar panel of mammalian cell lines can be infected with SARS-CoV-2-S and SARS-CoV-S [21–24]. The spike protein can be cleaved by host proteases into the S1 and S2 subunits, which are responsible for receptor recognition and membrane fusion, respectively. S1 can be further divided into an N-terminal domain (NTD) and a C-terminal domain (CTD). The S1 CTD of SARS-CoV-2, but not the NTD, showed strong affinity for human ACE2 (hACE2). The receptor-binding domain (RBD) within the SARS-CoV-2 CTD is the key region that interacts with the hACE2 receptor, and kinetic quantification showed that its affinity is 10- to 20-fold higher than that of the SARS-CoV RBD [23,25]. The putative life cycle of SARS-CoV-2 in host cells begins with binding of the spike protein to the hACE2 receptor. The conformational change in the S protein after receptor binding facilitates viral envelope fusion with the cell membrane through the endosomal pathway. The viral RNA genome is then released into the cytoplasm and translated into the viral replicase polyproteins pp1a and pp1ab, which are cleaved into small products by virus-encoded proteinases.
The polymerase transcribes a series of subgenomic mRNAs by discontinuous transcription. The subgenomic mRNAs are finally translated into viral structural proteins. The S, E and M proteins enter the endoplasmic reticulum (ER) and Golgi apparatus, and the N protein is combined with the positive-stranded genomic RNA to form a nucleoprotein complex. The structural proteins and nucleoprotein complex are assembled with the viral envelope at the ER–Golgi intermediate compartment. The newly assembled viral particles are then released from the infected cell [Fig. 2].
Fig. 2. The putative life cycle of SARS-CoV-2.
Ecology - the potential origin of the virus
All human coronaviruses have animal origins, namely, natural hosts. Bats may be the natural hosts of HCoV-229E, SARS-CoV, HCoV-NL63, and MERS-CoV. Furthermore, HCoV-OC43 and HKU1 probably originated from rodents [26–28]. Bats are undoubtedly the major natural reservoirs of alpha-coronaviruses and beta-coronaviruses [29]. Domestic animals can suffer from disease as intermediate hosts that transmit viruses from natural hosts to humans; for example, SARS-CoV and MERS-CoV crossed the species barriers into masked palm civets and camels, respectively [30,31] [Table 1]. Early full-length genomic comparisons showed that SARS-CoV-2 sequenced at the early stage of the COVID-19 outbreak shares only 79.6% sequence identity with SARS-CoV. However, it is highly identical (96.2%) at the whole-genome level to Bat-CoV RaTG13, which was previously detected in Rhinolophus affinis from Yunnan Province, over 1500 km from Wuhan [21]. Bats are likely reservoir hosts for SARS-CoV-2; however, whether Bat-CoV RaTG13 jumped directly to humans or was transmitted to intermediate hosts that facilitated animal-to-human transmission remains inconclusive. Scientists obtained no intermediate host sample from the initial cluster of infections at the Huanan Seafood and Wildlife Market in Wuhan, where the sale of wild animals may be the source of zoonotic infection. Furthermore, the earliest three patients with symptom onset had no known history of exposure to the Huanan market [1]. Therefore, there may have been multiple sources of COVID-19 in the beginning. Previous metagenomic sequencing of samples from Malayan pangolins (Manis javanica) in Guangxi and Guangdong, China, suggested that pangolins might be the intermediate hosts between bats and humans because of the similarity of the pangolin coronavirus to SARS-CoV-2 [32,33]. However, the additional phylogenetic analyses effectively trace COVID-19 infection sources.
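The identity figures quoted above (79.6% and 96.2%) are, in essence, the share of matching positions between two aligned genomes. The following is a minimal Python sketch of that calculation, not the method used in the cited studies. The function name `percent_identity`, the choice to skip gapped columns, and the toy sequences are all assumptions made for illustration; real comparisons run on full-length alignments produced by dedicated alignment software, and different tools treat gaps differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns where both sequences match.

    Columns with a gap ('-') in either sequence are excluded. This is one
    common convention (an assumption here); other tools count gaps differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up fragments (NOT real viral sequence), already aligned.
toy_a = "ATGGC-TTAGCA"
toy_b = "ATGGCATTCGCA"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

In this toy pair, one column is gapped and one of the remaining eleven columns differs, so the function reports the fraction 10/11 as a percentage.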
In addition to the zoonotic origin of SARS-CoV-2 by natural evolution, the origin of the virus remains disputed because its spike protein seems to interact perfectly with the human receptor, contributing to human-to-human transmission, after only a short period of evolution. Nevertheless, more direct evidence is required to resolve these arguments.
Evolution of SARS-CoV-2 during the past few months
Replication of RNA viruses can generate mutations due to the low proofreading ability of their RdRp. The genome variations generated by viral RdRp could help an emerging virus adapt to new hosts. However, previous studies have shown that mutation rates vary among RNA viruses [34]. The synonymous substitution rate for coronaviruses might be approximately 1 × 10⁻³ substitutions per synonymous site per year, which is lower than that of some other RNA viruses. The mutation rate during coronavirus replication could be partially controlled by the viral exoribonuclease nsp14 [35,36]. Nevertheless, SARS-CoV-2 has been continuously evolving into different groups worldwide during the pandemic.
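To give a feel for the units of the substitution rate quoted above, the following back-of-envelope Python sketch multiplies a per-site yearly rate by a number of sites and an elapsed time. The rate of 1 × 10⁻³ comes from the text; the count of synonymous sites (`SYNONYMOUS_SITES = 7000`) is a purely illustrative placeholder and not a measured value for SARS-CoV-2, and the function name is hypothetical. The result should be read as an order-of-magnitude illustration only.

```python
def expected_substitutions(rate_per_site_per_year: float,
                           n_sites: int,
                           years: float) -> float:
    """Expected substitutions = rate x number of sites x elapsed time."""
    if rate_per_site_per_year < 0 or n_sites < 0 or years < 0:
        raise ValueError("inputs must be non-negative")
    return rate_per_site_per_year * n_sites * years


RATE = 1e-3               # per synonymous site per year, as quoted in the text
SYNONYMOUS_SITES = 7000   # ASSUMPTION: illustrative placeholder, not measured
FOUR_MONTHS = 4 / 12      # elapsed time in years

per_year = expected_substitutions(RATE, SYNONYMOUS_SITES, 1.0)
per_four_months = expected_substitutions(RATE, SYNONYMOUS_SITES, FOUR_MONTHS)
print(f"about {per_year:.1f} synonymous substitutions per genome per year")
print(f"about {per_four_months:.1f} over four months")
```

Under these assumed inputs, a single lineage would accumulate only a handful of synonymous changes over a few months, which is consistent with the observation that sequences collected months apart differ at a small number of positions.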
According to the nCoV-19 (SARS-CoV-2) sequence information submitted to the GISAID database in January 2020, the virus was first collected in late December 2019 from Wuhan, China. However, those viral sequences differ from the latest submitted sequence, collected in early April 2020 from North America. Since the viral sequences continuously change, the construction of a phylogenetic network is crucial for investigating the adaptation of the virus in different human populations and environments. Although the virus keeps evolving within humans who could also be susceptible to other human coronaviruses, recombination between SARS-CoV-2 and older human coronaviruses, such as HCoV-229E, OC43, NL63, and HKU1, has not been found. Nevertheless, a recent study claimed that three genetic types of the virus have been circulating globally [37]. The study reported that the genotypes could also correlate with geographic locations, although its sample size and analysis methods are still debated in the research field [38]. Therefore, it is still unclear whether the evolution of SARS-CoV-2 could be affected by replication environments, such as genetic and immunological restrictions in different human populations. With evolutionary pressure, the selection of SARS-CoV-2 mutations will be ongoing. The investigation of the geographic patterns of SARS-CoV-2 variations will provide information for vaccine development for different populations.
Conclusions
Human coronaviruses usually cause mild upper respiratory diseases. However, in the past two decades, two coronaviruses transmitted from animals, SARS-CoV and MERS-CoV, have caused severe pneumonia and death in humans. In addition, since late December 2019, the COVID-19 pandemic has spread globally and resulted in at least 282,719 deaths worldwide as of May 11, 2020 [39]. Due to its high sequence homology with a coronavirus isolated from bats, SARS-CoV-2 is considered a coronavirus of zoonotic origin. Undoubtedly, SARS-CoV-2 has become the fifth human coronavirus, and it is possible that this virus will continue to circulate in the human population in the future. Because specific antiviral treatments and vaccines are still under development, testing, quarantine, and social distancing are encouraged to prevent virus spread. Nonetheless, since the virus keeps mutating and evolving during the pandemic, studies on viral pathogenicity, treatments and prophylactic vaccines should closely consider the genetic characteristics of the virus.
Conflict of interest statement
The authors declare no conflicts of interest.
Acknowledgements
This work was financially supported by the Research Center for Emerging Viral Infections from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education in Taiwan and the Ministry of Science and Technology, Taiwan (MOST 108-3017-F-182-001).