Positive selection within the genomes of SARS-CoV-2 and other Coronaviruses independent of impact on protein function
Authors
Berrio, A; Gartner, V; Wray, G
Abstract
Background
The emergence of a novel coronavirus (SARS-CoV-2) associated with severe acute respiratory disease (COVID-19) has prompted efforts to understand the genetic basis for its unique characteristics and its jump from non-primate hosts to humans. Tests for positive selection can identify apparently nonrandom patterns of mutation accumulation within genomes, highlighting regions where molecular function may have changed during the origin of a species. Several recent studies of the SARS-CoV-2 genome have identified signals of conservation and positive selection within the gene encoding Spike protein based on the ratio of nonsynonymous to synonymous substitution. Because such tests rely on the genetic code, however, they cannot detect changes in the function of RNA molecules.
Methods
Here we apply a test for branch-specific oversubstitution of mutations within narrow windows of the genome without reference to the genetic code.
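The idea of scanning narrow windows for an excess of substitutions on one branch, without translating codons, can be illustrated with a simple count-based sketch. This is not the test applied in this study. It is a minimal illustration under stated assumptions: the two sequences are already aligned, `ancestor` stands in for a reconstructed ancestral sequence, `focal` for the lineage of interest, and the window's substitution rate is compared with the genome-wide rate as a crude background. The function name and parameters are hypothetical.

```python
from typing import List, Tuple

VALID_BASES = set("ACGTU")


def substitution_enrichment(
    ancestor: str, focal: str, window: int, step: int
) -> List[Tuple[int, int, float]]:
    """Scan aligned sequences in fixed-size windows.

    Returns (window_start, substitution_count, enrichment) per window, where
    enrichment is the window's per-site substitution rate divided by the
    rate over the whole alignment. No codon table is consulted, so
    synonymous and nonsynonymous changes are treated alike.
    """
    if len(ancestor) != len(focal):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")

    # 1 where both columns hold a nucleotide and they differ; gaps and
    # ambiguity codes are scored as 0 (an assumption made for simplicity).
    diffs = [
        1 if a in VALID_BASES and f in VALID_BASES and a != f else 0
        for a, f in zip(ancestor.upper(), focal.upper())
    ]

    total = sum(diffs)
    background_rate = total / len(diffs) if diffs else 0.0

    results = []
    for start in range(0, len(diffs) - window + 1, step):
        count = sum(diffs[start:start + window])
        window_rate = count / window
        enrichment = window_rate / background_rate if background_rate else 0.0
        results.append((start, count, enrichment))
    return results
```

A window with enrichment well above 1 carries more branch-specific substitutions than the genome-wide average. A formal test would additionally require an evolutionary model and a neutral reference to assess significance, which this sketch does not attempt.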
Results
We recapitulate the finding that the gene encoding Spike protein has been a target of both purifying and positive selection. In addition, we find other likely targets of positive selection within the genome of SARS-CoV-2, specifically within the genes encoding Nsp4 and Nsp16. Homology-directed modeling indicates no change in either Nsp4 or Nsp16 protein structure relative to the most recent common ancestor. Thermodynamic modeling of RNA stability and structure, however, indicates that RNA secondary structure within both genes in the SARS-CoV-2 genome differs from that of RaTG13, the reconstructed common ancestor, and Pan-CoV-GD (Guangdong). These SARS-CoV-2-specific mutations may affect molecular processes mediated by the positive- or negative-sense RNA molecules, including transcription, translation, RNA stability, and evasion of the host innate immune system. Our results highlight the importance of considering mutations in viral genomes not only in terms of their impact on protein structure, but also in terms of how they may affect other molecular processes critical to the viral life cycle.