The standardized reference genome sequence shown is from one of the first isolated cases, known as NC_045512v2 or wuhCor1. With more than 80 track datasets across the SARS-CoV-2 reference genome's nearly 30,000 RNA bases, navigation is essential to finding the information you want to see. Below is an example view of the Genome Browser with labeled sections highlighting the navigation, reference sequence, annotations, and additional available track datasets.

Navigation controls at the top allow users to move left and right and to zoom. The search box allows users to search for particular features or to move to exact genomic coordinates. The RNA sequence is shown only when the view is sufficiently zoomed in. Annotations are shown for data tracks that have been set to visible in the Available Tracks section at the bottom. Tracks can be configured with a right-click or by clicking on their name near the bottom of the page.
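The coordinate search described above can also be expressed as a link. The following is a minimal sketch, assuming the browser's `hgTracks` page accepts `db` and `position` query parameters and that the spike gene spans roughly bases 21563-25384 of NC_045512v2. Neither the URL nor the coordinates come from this tutorial, so verify both before relying on them.

```python
from urllib.parse import urlencode

# Assumed address of the Genome Browser display page (not stated in this tutorial).
HGTRACKS_URL = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_link(start: int, end: int, db: str = "wuhCor1",
                 chrom: str = "NC_045512v2") -> str:
    """Build a link that opens the browser at chrom:start-end."""
    if not 1 <= start <= end:
        raise ValueError("expected 1 <= start <= end")
    query = urlencode({"db": db, "position": f"{chrom}:{start}-{end}"})
    return f"{HGTRACKS_URL}?{query}"


# Approximate spike (S) gene coordinates; an assumption used for illustration.
spike_link = browser_link(21563, 25384)
print(spike_link)
```

Pasting the same `chrom:start-end` string into the search box should have the same effect as following the link.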

This is a view of the SARS-CoV-2 Genome Browser with labeled elements to help with orientation. Interact with this session by clicking on the picture. To read the full caption, please go to our Nature Genetics paper.

Genes and Sequence Alignments

Gene and protein annotations are organized by contributor, most notably NCBI and UniProt. Having multiple information sources allows a consensus to be formed among datasets. As in many viral genomes, molecular complexity arises from polyproteins rearranging, generating ~29 protein products. Most notable among these is the S (spike) protein, which defines coronaviruses and allows entry through our cell membranes. Additional tracks contain information such as interactions between viral proteins and human proteins (protein interact), PDB structures, RNA structure annotations (Rangan RNA), and more.

Sequence alignments and conservation data are also available across the SARS-CoV-2 genome, from large-scale views to individual bases and amino acids. There are four main conservation tracks: sequence similarity across 44 bat coronaviruses, sequence similarity across 119 vertebrate coronaviruses, PhyloCSF computed conservation scores, and alignments of 7 human coronaviruses. The tracks have different displays depending on visibility mode and the number of bases on the screen.

Datasets can be turned on by changing the dropdown next to the data track name from "hide" to "dense", "squish", "pack", or "full". Then click the button to see these changes in effect. Clicking on a data track name will take you to a description with more information on the dataset, display conventions, methods, and references. Clicking on a particular item will take you to a page with complete information about that item and dataset.
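The same visibility settings can be prepared programmatically. The sketch below validates the five modes named above and builds a query string. It assumes the browser accepts `trackName=mode` pairs as URL parameters, and the track names used are hypothetical placeholders rather than real track identifiers.

```python
from typing import Mapping
from urllib.parse import urlencode

# The five display modes named in the text above.
VISIBILITY_MODES = ("hide", "dense", "squish", "pack", "full")


def visibility_query(settings: Mapping[str, str]) -> str:
    """Return a query string such as 'trackA=pack&trackB=dense'."""
    for track, mode in settings.items():
        if mode not in VISIBILITY_MODES:
            raise ValueError(f"{track}: unknown visibility mode {mode!r}")
    return urlencode(settings)


# "exampleGenes" and "exampleConservation" are hypothetical track names.
query = visibility_query({"exampleGenes": "pack", "exampleConservation": "squish"})
print(query)
```

Rejecting unknown modes up front keeps a typo such as "packed" from silently producing a link that leaves the track unchanged.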


This Genome Browser display shows some of the gene and conservation tracks available on the SARS-CoV-2 genome. You should be able to see UniProt protein products, regions of interest, and domains all mapped against the SARS-CoV-2 genome. Below those tracks are two different conservation alignments in "squish" and "pack" formats, comparing bat-host and human-host coronavirus sequences with the reference SARS-CoV-2 genome. Interact with this session by clicking on the picture.

Exploring Variation and Immunology Data

The SARS-CoV-2 Genome Browser displays data on variation within SARS-CoV-2 from UniProt, GenBank, GISAID, Nextstrain, and other providers. These data cover global trends in SARS-CoV-2 variation among all available public sequences, with regional descriptions available by clicking into a particular entry. One of the most notable tracks under the "Variation and Repeats" section is the Phylogeny: Public track, which shows a continuously updating phylogenetic tree that clusters similar sequences, with the frequency of each mutation shown by the height of the bar at that particular base. Tools are provided to filter these data to show only well-supported mutation calls, set thresholds for minor-allele frequency, and display data for specific clades.

Another track shows spike protein mutations from community annotations, highlighted as amino acid changes. The Genome Browser also has the Variants of Concern track, which pinpoints each accumulated mutation that defines four strains of SARS-CoV-2 of particular concern, labeled both with lay terms (such as 'California variant') and with the lineage defined by the Pangolin software (such as 'B.1.1.7').

The Genome Browser also provides 12 immunology datasets that can inform potential therapeutic targets or public health risks. Protein epitopes are highlighted in the genome by multiple tracks, including those from the Immune Epitope Database (IEDB) and from a study of COVID+ patients. Of particular interest are the datasets describing surveys of antibody response across a variety of SARS-CoV-2 variants in the receptor-binding domain (Antibody Escape Mutations).

This image is an example of some of the variation data tracks that can be displayed on the SARS-CoV-2 genome, zoomed into the receptor-binding domain (RBD) of the spike protein. Validated epitopes that may be a target for therapeutic antibodies are displayed in black. In red and black, antibody escape scores are shown for each genome position. Smaller tick marks show amino acid or nucleotide changes from different sources, with more information available by clicking into the item.

Genetic Contact Tracing with UShER

The UCSC Genome Browser has also developed a tool that places SARS-CoV-2 sequences onto existing phylogenetic trees far faster than previous methods, allowing instantaneous tracing of strains and transmission events. This tool is called Ultrafast Sample placement on Existing tRees (UShER) and is available as an interactive web tool for comparing sequences and linking to existing public phylogenetic trees.


After uploading a FASTA file, the tool returns a page with quality metrics such as the number of bases aligned, number of Ns, and number of maximally parsimonious placements, along with the lineage and clade of the nearest neighbor. Colored boxes highlight possible quality issues, with green indicating a high-confidence placement.
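Two of these metrics, sequence length and number of Ns, are easy to check locally before uploading. The sketch below is a stand-alone sanity check written for illustration, not UShER's own implementation, and the function name is hypothetical.

```python
from typing import Dict, Iterable


def fasta_n_counts(lines: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Return {sequence name: {"length": ..., "n_count": ...}} for FASTA lines."""
    stats: Dict[str, Dict[str, int]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the name.
            name = line[1:].split()[0] if len(line) > 1 else ""
            stats[name] = {"length": 0, "n_count": 0}
        elif name is not None:
            # Sequence line: count all characters and the ambiguous N bases.
            stats[name]["length"] += len(line)
            stats[name]["n_count"] += line.upper().count("N")
    return stats


# Made-up sequences for illustration only.
example = [">sample1 hypothetical", "ACGTNNAC", "nnGT", ">sample2", "ACGT"]
print(fasta_n_counts(example))
```

To check a real file, pass an open file handle instead of the list, for example `fasta_n_counts(open("samples.fa"))`, where `samples.fa` is a placeholder file name.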

Next, you can view your aligned SARS-CoV-2 sequence genotypes along with their closest known relatives among the 150,000+ public sequences. You can compare your uploaded samples with one another or trace possible transmission vectors using mutational signatures.

The uploaded sequences are highlighted in blue alongside their most closely aligned public sequences. You can investigate genotypes and relationships between samples.

Custom Tracks, Downloads, API, and SQL features

Along with a suite of data tracks, filters, and visualization options for the SARS-CoV-2 genome, the UCSC Genome Browser offers many additional ways to interface with our data. You can upload your data on the reference genome in nearly any format with our Custom Track tool. If you have unaligned sequence, you can use our BLAT sequence alignment tool to get coordinates and a basewise comparison with any reference genome. We also have a JSON API.
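As a rough illustration of the JSON API mentioned above, the sketch below builds a request for a stretch of the reference sequence. The host name, the `/getData/sequence` endpoint, the parameter names, the 0-based half-open coordinates, and the `dna` field in the reply are all assumptions not stated in this tutorial, so check them against the API documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed API host; not given in this tutorial.
API_BASE = "https://api.genome.ucsc.edu"


def sequence_url(start: int, end: int, genome: str = "wuhCor1",
                 chrom: str = "NC_045512v2") -> str:
    """Build a request URL for the bases in [start, end) (assumed 0-based)."""
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    params = {"genome": genome, "chrom": chrom, "start": start, "end": end}
    return f"{API_BASE}/getData/sequence?{urlencode(params)}"


def fetch_sequence(start: int, end: int) -> str:
    """Fetch the sequence over the network; assumes the reply has a 'dna' field."""
    with urlopen(sequence_url(start, end), timeout=30) as response:
        return json.load(response)["dna"]


# Building the URL needs no network access; fetch_sequence() does.
print(sequence_url(0, 60))
```

Keeping URL construction separate from the network call makes the request easy to inspect or paste into a web browser before any script depends on it.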
