b7afc005b6fd12992034a9e915c1a93d3f75e978 bwick Thu Mar 18 13:37:22 2021 -0700 Creating a copy stub to be edited in the future at the request of Isabel, Alexis, and Clay. refs #27233 diff --git src/hg/htdocs/goldenPath/help/covidBrowser.html src/hg/htdocs/goldenPath/help/covidBrowser.html new file mode 100755 index 0000000..057c9b8 --- /dev/null +++ src/hg/htdocs/goldenPath/help/covidBrowser.html @@ -0,0 +1,268 @@ +<!DOCTYPE html> +<!--#set var="TITLE" value="COVID Browser Intro" --> +<!--#set var="ROOT" value="../.." --> +<!--#include virtual="$ROOT/inc/gbPageStart.html" --> + + +<h1>Introduction to the SARS-CoV-2 Genome Browser</h1> + <div class="row"> + <div class="col-sm-6"> +<p> +The UCSC Genome Browser is an open-source, interactive sequence visualization tool that +has been a cornerstone of genomics since we released the first human genome assembly +20 years ago. Cited in more than 37,000 scientific articles and used by thousands of +researchers each day, it allows for cross-referencing of research, clinical, +and epidemiology data against reference genomes, including +<a href="../../cgi-bin/hgGateway?db=wuhCor1">SARS-CoV-2</a>. These data are continuously +updated and expanded as new datasets become available. For a more thorough description, +please see our <a href="https://www.nature.com/articles/s41588-020-0700-8" target="_blank"> +SARS-CoV-2 Genome Browser Nature Genetics paper</a>. We also post updates and COVID Browser +resources to our <a href="../../covid19.html">COVID-19 Browser home page</a>.</p> +<p> +This guide will go through some of the most important use cases of the SARS-CoV-2 Genome Browser. 
+These topics include:</p> +<ul> +<li><a href="#nav">Orientation and Navigation</a></li> +<li><a href="#genes">Gene Data and Sequence Alignments</a></li> +<li><a href="#var">Variation and Immunology data</a></li> +<li><a href="#usher">Phylogenetic Contact Tracing using UShER</a></li> +<li><a href="#data">Other tools and data downloads</a></li> +<li><a href="#support">Support and Collaboration</a></li> +</ul> +<p> +For those who prefer a video explanation, we also have the following tutorial:</p> + </div> + <div class="col-md-6"> +<p> + <iframe width="560" height="315" src="https://www.youtube.com/embed/Ee6h0xyZDOM?rel=0" + frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" + allowfullscreen></iframe></p> + </div> +</div> + + +<a name="nav"></a> +<h2>Genome Browser Orientation and Navigation</h2> +<p> +The standardized reference genome displayed on the +<a href="../../cgi-bin/hgTracks?db=wuhCor1">COVID Genome Browser</a> is from one of the first isolated +cases, known as <a href="https://www.ncbi.nlm.nih.gov/nuccore/NC_045512.2" +target="_blank">NC_045512v2</a> or wuhCor1. With more than 80 track datasets across the +SARS-CoV-2 reference genome's nearly 30,000 RNA +bases, navigation is essential to finding the information you want to see. Below is an example +view of the SARS-CoV-2 Genome Browser with labeled sections highlighting the navigation, +reference sequence, annotations, and other available track datasets.</p> + +<p> +<b>Navigation controls</b> at the top allow users to move left and right and to zoom. +The <b>search box</b> allows users to search for particular features or to move +to exact genomic coordinates. The <b>RNA sequence</b> is shown at the top only when the view +is sufficiently zoomed in. Annotations are shown for data tracks that have been set to +visible in the <b>available tracks</b> section at the bottom. 
+Tracks can be configured with a right-click or by clicking on their name near the +bottom of the page.</p> + +<p class="text-center"> + <a href="http://genome.ucsc.edu/s/SARS_CoV2/Figure1"><img class="text-center" + src="../../images/covidBrowserIntroNav.png" + alt="Labeled orientation to the Genome Browser" width="1200" height="594"></a> + <p class="gbsCaption text-center">This is a view of the SARS-CoV-2 Genome Browser (COVID Browser) + with labeled elements to help with orientation. Interact with this session by clicking on + the picture. To read the full caption, please go to our + <a href="https://www.nature.com/articles/s41588-020-0700-8#Fig1" target="_blank"> + Nature Genetics paper</a>. + </p> +</p> + +<a name="genes"></a> +<h2>Genes and Sequence Alignments</h2> +<div class="row"> + <div class="col-md-5"> +<p> +Gene and protein annotations are organized by contributor, most notably NCBI and UniProt. +Having multiple information sources allows a consensus to be formed among datasets. +As in many viral genomes, molecular complexity arises from polyproteins rearranging, +generating ~29 protein products. Most notable among these is the S (spike) protein, which defines +coronaviruses and allows entry through the cell membrane. Additional tracks contain information such +as interactions between viral proteins and human proteins (protein interact), PDB structures, +RNA structure annotations (Rangan RNA), and more.</p> + +<p> +Sequence alignments and conservation data are also available across the SARS-CoV-2 genome, +from large-scale views to individual bases and amino acids. Four conservation +tracks are available: comparisons with 44 bat coronaviruses, 119 vertebrate +coronaviruses, and 7 human coronaviruses, plus PhyloCSF-computed conservation scores. 
+The tracks display differently depending on visibility mode and the number of +bases on the screen.</p> + +<p> +Datasets can be turned on by setting the dropdown +next to the data track name from "hide" to "dense", "squish", "pack", or "full". Then click +the <button>refresh</button> button to see these changes take effect. +Clicking on a data track name will take you to a description with more information on the +dataset, display conventions, methods, and references. Clicking on a particular item +will take you to a page with complete information about that item and dataset.</p> + </div> + + + <div class="col-md-7"> +<p class="text-center"> + <a href="http://genome.ucsc.edu/s/dschmelt/covidBrowserIntroGenes"><img class="text-center" + src="../../images/covidBrowserIntroGenes.png" + alt="Some of the gene and conservation data on the Genome Browser" width="700" height="375"></a> + <p class="gbsCaption text-center">This Genome Browser display shows some of the gene and + conservation tracks available on the SARS-CoV-2 genome. You should be able to see UniProt + protein products, regions of interest, and domains all mapped against the SARS-CoV-2 genome. + Below those tracks are two different conservation alignments in "squish" and + "pack" formats, comparing bat-host and human-host coronavirus sequences with the + reference SARS-CoV-2 genome. Interact with this session by clicking on the picture. + </p> +</p> + </div> +</div> + +<a name="var"></a> +<h2>Exploring Variation and Immunology Data</h2> +<div class="row"> + <div class="col-md-5"> +<p> +The <a href="../../cgi-bin/hgTracks?db=wuhCor1">SARS-CoV-2 Genome Browser</a> displays data on variation within SARS-CoV-2 +from UniProt, GenBank, GISAID, Nextstrain, and other providers. These datasets cover global trends +in SARS-CoV-2 variation among all available public sequences, with regional descriptions +available by clicking on a particular entry. 
One of the most notable tracks under the +"Variation and Repeats" section is the +<a href="../../cgi-bin/hgTrackUi?db=wuhCor1&g=sarsCov2PhyloPub">Phylogeny: Public track</a>, which shows a +continuously updating phylogenetic tree that clusters similar sequences, with the frequency of each +mutation shown by the height of the bar at that particular base. Tools are provided to filter these +data to show only well-supported mutation calls, set thresholds for minor-allele frequency, and display +data for specific clades.</p> +<p> +Another track shows +spike protein mutations from community annotations, highlighted as amino acid changes with red +indicating strong antibody escape in receptor-binding domain (RBD) mutation screens. The Genome +Browser also has the +<a href="../../cgi-bin/hgTrackUi?db=wuhCor1&g=variantMuts">Variants of Concern track</a>, which +pinpoints the accumulated mutations that define 4 strains of SARS-CoV-2 +of particular concern, labeled with lay terms (such as 'California variant') as well as +with the lineage assigned by the +<a href="https://github.com/cov-lineages/pangolin/" target="_blank">Pangolin software</a> +(such as 'B.1.1.7').</p> + +<p> +The Genome Browser also provides 12 immunology datasets that can inform potential therapeutic +targets or public health risks. Protein epitopes are highlighted in the genome by multiple tracks, +including those from the <a href="https://www.iedb.org/" target="_blank"> +Immune Epitope Database (IEDB)</a> and from +<a href="../../cgi-bin/hgTrackUi?db=wuhCor1&g=targets">a study of COVID+ patients</a>. 
+Of particular interest are the datasets describing surveys of antibody response across +a variety of SARS-CoV-2 variants in the receptor-binding domain +(<a href="../../cgi-bin/hgTrackUi?db=wuhCor1&g=abEscape">Antibody Escape Mutations</a>).</p> + </div> + + <div class="col-md-7"> +<p class="text-center"> + <a href="http://genome.ucsc.edu/s/dschmelt/covidBrowserIntroVars"><img class="text-center" + src="../../images/covidBrowserIntroVars.png" + alt="Some of the variation and immunology data on the Genome Browser" width="800" height="427"></a> + <p class="gbsCaption text-center">This image shows some of the variation data tracks + that can be displayed on the SARS-CoV-2 genome, specifically zoomed into the receptor-binding domain of + the Spike protein. Validated epitopes that may be targets for + therapeutic antibodies are displayed in black. In red and black, antibody escape scores are shown for each + genome position. Smaller tick marks show amino acid or nucleotide changes from different sources, + with more information available by clicking on the item.</p> +</p> +</div> +</div> +<a name="usher"></a> +<h2>Genetic Contact Tracing with UShER</h2> +<p> +The UCSC Genome Browser has developed a tool that allows placement of SARS-CoV-2 +sequences onto existing phylogenetic trees far faster than previous methods, allowing +instantaneous tracing of strains and transmission events. This tool is called +<a href="../../cgi-bin/hgPhyloPlace">Ultrafast Sample placement on Existing tRees (UShER)</a> and +exists as an interactive web tool to compare sequences and link to existing public phylogenetic +trees.</p> 
+<p class="text-center"> + <a href="../../cgi-bin/hgPhyloPlace"><img class="text-center" + src="../../images/covidBrowserIntroUShER.png" + alt="Example of the UShER phylogeny placement tool" width="1200" + height="355"></a> + <p class="gbsCaption text-center">After uploading a FASTA file, the tool returns a page with quality +metrics such as the number of bases aligned, the number of Ns, and the number of maximally parsimonious +placements, along with the lineage and clade of the nearest neighbor. Colored boxes highlight +possible quality issues, with green indicating a high-confidence placement.</p> + +<h3>SARS-CoV-2/COVID Phylogenetic Trees</h3> +<p> +You can view your aligned SARS-CoV-2 sequence genotypes along with their closest known +relatives among the 150,000+ public sequences. You can compare among your uploaded +samples or trace possible transmission vectors using mutational signatures.</p> + +<p class="text-center"> + <a href="http://genome.ucsc.edu/s/dschmelt/covidBrowserIntroUShER2"><img class="text-center" + src="../../images/covidBrowserIntroUShER2.png" + alt="Example of the UShER phylogeny placement tool tree features" width="800" + height="427"></a> + <p class="gbsCaption text-center">The uploaded sequences are highlighted in blue alongside +their most closely aligned public sequences. You can investigate genotypes and relationships +between samples.</p> +</p> + + +<a name="data"></a> +<h2>Other tools, downloads, and features</h2> +<h3>Custom Tracks, BLAT, Track Hubs</h3> +<p> +Along with a suite of data tracks, filters, and visualization options for the SARS-CoV-2 +genome, the UCSC Genome Browser offers many additional ways to interface with our data. +You can upload your data on the reference genome in +<a href="../../FAQ/FAQformat.html">nearly any format</a> with our +<a href="../../cgi-bin/hgCustom">Custom Track tool</a>. 
If you have an unaligned sequence, + you can use our <a href="../../cgi-bin/hgBlat">BLAT +sequence alignment tool</a> to get coordinates and a base-by-base comparison +with any reference genome. We also display formatted data +as <a href="hgTrackHubHelp.html">Track Hubs</a> and curate a list of +user-submitted <a href="../../cgi-bin/hgHubConnect?#publicHubs">Public Track Hubs</a>. +</p> +<h3>Downloads, Table Browser, JSON API, SQL</h3> +<p> +As part of our open-source, open-access philosophy, we try to make it as easy as possible +for researchers to download entire datasets or filtered subsets. Each track description page +has a Data Access section which points users to our main options for data download. +For downloading complete datasets, our +<a href="http://hgdownload.soe.ucsc.edu/downloads.html#SARS-CoV-2">SARS-CoV-2 download +directory</a> provides access to all our source files for transparency and reproducibility. +Our <a href="../../cgi-bin/hgTables">Table Browser</a> tool lets users interact with our +data using a variety of filters based on score, identifiers, or any other field. Table Browser +also allows users to convert data into multiple different formats (e.g. BED, GTF) and to retrieve +sequence output in FASTA format.</p> + +<p> +We have a <a href="api.html">JSON API</a> that can be called programmatically and returns +any dataset in its entirety or as a filtered subset based on documented input parameters. We also +offer a <a href="mysql.html">Public SQL server</a> for a similarly flexible, automated way to access +genomic data and annotations. Along with this particular virus genome browser, we have thousands of +genomes available for visualization and analysis from our <a href="../../cgi-bin/hgGateway"> +genome assemblies</a> gateway page. +</p> + +<a name="support"></a> +<h2>Support and Collaboration</h2> +<p> +The Genome Browser offers rapid email support for anything related to our tools. 
If your +question is general or may have been asked before, please review our +<a href="hgTracksHelp.html">Browser documentation</a> and our archive +of <a href="https://groups.google.com/u/1/a/soe.ucsc.edu/g/genome"> +previously answered questions</a>. If you would still like help, please go to +our <a href="../../contacts.html">Contact Us</a> page to access our email support. When contacting us, +please include a session link, images, and example data if applicable. We are also active on +social media; you can follow us on <a href="https://twitter.com/GenomeBrowser">Twitter</a> +or <a href="https://www.facebook.com/ucscGenomeBrowser">Facebook</a>.</p> + +<p> +We are always looking to collaborate with researchers and add new datasets to our site. +We also seek to continuously improve our tools to meet the needs of the scientific community. +If you have any collaboration ideas, contributions, or feature requests, please reach out through +our <a href="../../cgi-bin/hgUserSuggestion">suggestion page</a>.</p> + +<!--#include virtual="$ROOT/inc/gbPageEnd.html" -->