Welcome to VirusWarn

The goal of VirusWarn is to detect emerging virus variants in collections of genomes before they are annotated by phylogenetic analysis. It does so by parsing genomes and detecting amino acid mutations in the spike proteins that can be associated with a phenotypic change. The phenotypic changes are annotated according to the knowledge accumulated on previous variants. Owing to the limited size of the genome, convergent evolution is expected to take place.

VirusWarn is able to

  • analyze locally a large collection of sequences,

  • report an alert value for each genomic sequence,

  • offer an integrated summary of the results,

  • be easily and automatically adapted to different viruses,

  • and, importantly, account for privacy concerns.

Overview

VirusWarn ranks SARS-CoV-2 genome sequences based on the mutational profile of their spike gene. Similarly, it provides a ranking scheme based on the HA segment of Influenza sequences for three subtypes: Influenza A H1N1, Influenza A H3N2, and Influenza B Victoria. For each query sequence, we first infer the amino acid mutations relative to a reference sequence. These are then aggregated into a multidimensional profile according to their overlap with already known mutations from curated datasets. An alert is then raised based on the sequence profile and a set of criteria. Finally, the results for all query sequences are combined and summarized by alert level in an HTML report for further investigation.

VirusWarn Workflow
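
The first step of the workflow, inferring amino acid mutations relative to a reference, can be illustrated with a minimal sketch. It assumes the query and reference protein sequences have already been aligned to equal length with `-` as the gap character. The function name `call_mutations` and the mutation labels (`K2R`, `L4del`, `ins3A`) are hypothetical choices for this example and are not taken from VirusWarn itself.

```python
from __future__ import annotations


def call_mutations(ref_aln: str, query_aln: str) -> list[str]:
    """List differences between two pre-aligned protein sequences.

    Both strings must have the same length and use '-' for gaps.
    Positions are 1-based and counted on the reference (gaps excluded).
    The label format is an assumption made for this sketch.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")

    mutations = []
    ref_pos = 0  # last reference residue seen so far
    for ref_aa, query_aa in zip(ref_aln, query_aln):
        if ref_aa != "-":
            ref_pos += 1
        if ref_aa == query_aa:
            continue  # identical residue, or a gap in both sequences
        if ref_aa == "-":
            # Residue present only in the query: insertion after ref_pos.
            mutations.append(f"ins{ref_pos}{query_aa}")
        elif query_aa == "-":
            # Residue present only in the reference: deletion.
            mutations.append(f"{ref_aa}{ref_pos}del")
        else:
            mutations.append(f"{ref_aa}{ref_pos}{query_aa}")
    return mutations


example = call_mutations("MKT-LN", "MRTA-N")
print(example)
```

Counting positions on the ungapped reference keeps the labels comparable across query sequences, which matters when they are later matched against curated lists of known mutations.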


Multiple resources are integrated to construct the mutation profiles, to annotate variants, and finally raise an alert. Each query sequence is described by a nine-dimensional mutation profile. This profile consists of the number of substitutions (Subst.), deletions (Del.), or insertions (Ins.) found within each of the following three categories:

  • Mutations of Concern (MOCs): which include substitutions, insertions, and deletions that can alter the potential of the virus regarding its virulence and transmissibility;

  • Regions of Interest (ROIs): which include important regions of the gene, such as antigenic sites, receptor-binding sites, and glycosylation sites;

  • Private Mutations (PMs): which include all substitutions, insertions, and deletions that are neither identified as MOCs nor as ROIs.
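
The nine-dimensional profile described above (three mutation types in each of three categories) can be sketched as follows. This is an illustration under stated assumptions, not VirusWarn's implementation: the names `Mutation`, `categorize`, and `build_profile` are hypothetical, MOCs are assumed to be given as a set of mutation labels, ROIs as inclusive position ranges, and a mutation matching both is assumed to count as a MOC.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass

CATEGORIES = ("MOC", "ROI", "PM")
KINDS = ("Subst.", "Del.", "Ins.")


@dataclass(frozen=True)
class Mutation:
    kind: str      # one of KINDS
    position: int  # 1-based amino acid position on the reference
    label: str     # e.g. "N501Y" (label format is an assumption)


def categorize(mutation: Mutation, mocs: set[str],
               rois: list[tuple[int, int]]) -> str:
    """Assign a mutation to MOC, ROI, or PM.

    Assumption: MOC takes precedence over ROI when both apply.
    """
    if mutation.label in mocs:
        return "MOC"
    if any(start <= mutation.position <= end for start, end in rois):
        return "ROI"
    return "PM"  # neither a known MOC nor inside a ROI


def build_profile(mutations: list[Mutation], mocs: set[str],
                  rois: list[tuple[int, int]]) -> dict[tuple[str, str], int]:
    """Return the nine counts keyed by (category, kind)."""
    counts = Counter((categorize(m, mocs, rois), m.kind) for m in mutations)
    return {(c, k): counts[(c, k)] for c in CATEGORIES for k in KINDS}


# Toy input: the MOC set and ROI range below are made up for illustration.
toy_mutations = [
    Mutation("Subst.", 501, "N501Y"),
    Mutation("Del.", 69, "H69del"),
    Mutation("Subst.", 10, "A10T"),
]
profile = build_profile(toy_mutations, mocs={"N501Y"}, rois=[(60, 80)])
```

Keying the profile by `(category, kind)` always yields all nine entries, including zeros, so profiles from different sequences can be compared directly.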

The manual scoring is based on a system of sequential rules that classify concerning variants into three levels: high impact, medium impact, or mutation accumulation. The rules follow a simple working hypothesis: if a variant accumulates many mutations, especially in important regions, it is more likely to be concerning. We also consider whether the sequence can be derived from a variant already identified in a previous season. A strict mode, which ignores medium impact variants, can also be enabled. To ease visual interpretation, we output color-coded alerts:

  • High impact variants that accumulate mutations in the MOCs and ROIs categories are more likely to have concerning effects. They are reported as pink variants (emerging variants derived from an already known concerning variant, such as a VOC, VOI, or VUM for SARS-CoV-2, or a variant from a previous season for Influenza), or red variants (other variants with many MOCs and ROIs in SARS-CoV-2 or Influenza).

  • Medium impact variants that could have some concerning properties (a few MOCs and ROIs/PMs) are reported in orange.

  • Mutation accumulation variants that accumulate a high number of ROIs or PMs are reported in yellow.

  • Remaining variants that did not trigger any alert are assigned to the grey category.

VirusWarn Scoring
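
The sequential rules above can be sketched as a small function that checks the levels in order and returns the first color that applies. The page does not state the actual cutoffs, so every threshold below is a placeholder chosen for illustration, and the function name `alert_level` is hypothetical. The sketch also assumes that, in strict mode, a sequence that would have been orange falls through to the remaining rules.

```python
# Placeholder thresholds: NOT VirusWarn's real values, chosen only for the sketch.
HIGH_MOC = 3       # minimum MOC count for a high impact alert
HIGH_ROI = 2       # minimum ROI count for a high impact alert
MEDIUM_MOC = 1     # minimum MOC count for a medium impact alert
ACCUMULATION = 5   # minimum ROI or PM count for mutation accumulation


def alert_level(moc: int, roi: int, pm: int,
                known_variant: bool = False, strict: bool = False) -> str:
    """Return a color-coded alert from per-category mutation counts.

    moc, roi, pm are totals per category (substitutions, deletions,
    and insertions summed). known_variant marks sequences derived from
    an already known concerning variant.
    """
    # Rule 1: high impact, many MOCs and ROIs.
    if moc >= HIGH_MOC and roi >= HIGH_ROI:
        return "pink" if known_variant else "red"
    # Rule 2: medium impact, a few MOCs plus ROIs/PMs (skipped in strict mode).
    if not strict and moc >= MEDIUM_MOC and (roi + pm) >= 1:
        return "orange"
    # Rule 3: mutation accumulation, a high number of ROIs or PMs.
    if roi >= ACCUMULATION or pm >= ACCUMULATION:
        return "yellow"
    # No rule triggered.
    return "grey"


for counts in [(4, 3, 0), (1, 1, 0), (0, 6, 0), (0, 0, 1)]:
    print(counts, alert_level(*counts))
```

Because the rules are evaluated in order, a sequence that meets both the high impact and the accumulation criteria is reported at the higher level only.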
