0d25063d7b674c12310ad70056cbb898ed81ddbc
dschmelt
  Thu Mar 11 12:14:15 2021 -0800
final draft of the page refs #27173

diff --git src/hg/htdocs/goldenPath/help/covidBrowserIntro.html src/hg/htdocs/goldenPath/help/covidBrowserIntro.html
index 06290a5..4b233d6 100755
--- src/hg/htdocs/goldenPath/help/covidBrowserIntro.html
+++ src/hg/htdocs/goldenPath/help/covidBrowserIntro.html
@@ -1,41 +1,43 @@
 <!DOCTYPE html>
 <!--#set var="TITLE" value="COVID Genome Browser Intro" -->
 <!--#set var="ROOT" value="../.." -->
 <!--#include virtual="$ROOT/inc/gbPageStart.html" -->
 
 
 <h1>Introduction to the SARS-CoV-2 Genome Browser</h1>
  <div class="row">
   <div class="col-sm-6">
 <p>
 The UCSC Genome Browser is an open-source, interactive sequence visualization tool that 
 has been a cornerstone of genomics since we released the first human genome assembly 
-20 years ago; cited in more than 37,000 scientific articles and used by thousands of 
-researchers each day. It allows for cross-referencing of research, clinical, 
-and epidemiology data against the SARS-CoV-2 reference genome. This data is updated frequently 
-and new datasets are added as they become available.</p>
+20 years ago. Cited in more than 37,000 scientific articles and used by thousands of 
+researchers each day, it allows for cross-referencing of research, clinical, 
+and epidemiology data against the SARS-CoV-2 reference genome. These data are updated 
+frequently, and new datasets are added as they become available. For a more thorough description,
+please see our <a href="https://www.nature.com/articles/s41588-020-0700-8" target="_blank">
+SARS-CoV-2 Genome Browser Nature Genetics paper</a>.</p>
 <p>
 This guide will go through some of the most important use cases of the SARS-CoV-2 Genome Browser. 
 These topics include:</p>
 <ul>
-<li><a href="nav">Orientation and Navigation</a></li>
-<li><a href="genes">Gene Data and Sequence Alignments</a></li>
-<li><a href="var">Variation and Immunology data</a></li>
-<li><a href="usher">Phylogeny Contact Tracing using USHER</a></li>
-<li><a href="data">Exporting bulk data</a></li>
-<li><a href="support">Support and Collaboration</a></li>
+<li><a href="#nav">Orientation and Navigation</a></li>
+<li><a href="#genes">Gene Data and Sequence Alignments</a></li>
+<li><a href="#var">Variation and Immunology data</a></li>
+<li><a href="#usher">Phylogenetic Contact Tracing using UShER</a></li>
+<li><a href="#data">Other tools and data downloads</a></li>
+<li><a href="#support">Support and Collaboration</a></li>
 </ul>
 <p>
 For those who prefer a video explanation, we also have the following tutorial:</p>
  </div>
  <div class="col-md-6">
 <p>
   <iframe width="560" height="315" src="https://www.youtube.com/embed/Ee6h0xyZDOM?rel=0" 
   frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" 
   allowfullscreen></iframe></p>
  </div>
 </div>
 
 
 <a name="nav"></a>
 <h2>Genome Browser Orientation and Navigation</h2>
@@ -46,57 +48,59 @@
 bases, navigation is essential to finding the information you want to see. Below is an example
 view of the Genome Browser with labeled sections highlighting the navigation, reference sequence, 
 annotations, and other available track datasets.</p>
 
 <p>
 Navigation controls at the top allow users to move left and right and to zoom. The search box 
 allows users to search for particular features or to move to exact genomic coordinates. The 
 RNA sequence is shown only when the view is sufficiently zoomed in. Annotations are shown 
 for data tracks that have been set to visible in the Available Tracks section at the bottom.
 Tracks can be configured with a right-click or by clicking on their name near the 
 bottom of the page.</p>
 
 <p class="text-center">
   <a href="http://genome.ucsc.edu/s/SARS_CoV2/Figure1"><img class="text-center" 
     src="../../images/covidBrowserIntroNav.png" 
-    alt="Labeled orientation to the Genome Browser" width="900" height="474"></a>
+    alt="Labeled orientation to the Genome Browser" width="1200" height="594"></a>
   <p class="gbsCaption text-center">This is a view of the SARS-CoV-2 Genome Browser with 
     labeled elements to help with orientation. Interact with this session by clicking on 
     the picture. To read the full caption, please go to our 
-    <a href="https://www.nature.com/articles/s41588-020-0700-8#Fig1">Nature Genetics paper</a>.
+    <a href="https://www.nature.com/articles/s41588-020-0700-8#Fig1" target="_blank">
+    Nature Genetics paper</a>.
   </p>
 </p>
 
 <a name="genes"></a>
 <h2>Genes and Sequence Alignments</h2>
 <div class="row">
   <div class="col-md-5">
 <p>
 Gene and protein annotations are organized by the contributor, most notably NCBI and UniProt. 
 Having multiple information sources allows a consensus to be formed among datasets.
 Like many viral genomes, molecular complexity arises from polyproteins rearranging, 
 generating ~29 protein products. Most notable among these is the S (spike) protein which defines
 coronaviruses and allows entry into our cell membranes. Additional tracks contain information such 
 as interactions between viral proteins and human proteins (protein interact), PDB structures, and 
 RNA structure annotations (Rangan RNA), and more.</p> 
 
 <p>
 Sequence alignments and conservation data are also available across the SARS-CoV-2 genome,
-from large-scale views to individual bases and amino acids. There are four main conservation
-tracks that compare sequence similarity of 44 bat coronaviruses, 119 vertebrate coronaviruses,
-PhyloCSF computed conservation scores, and alignments of 7 human coronaviruses. The tracks have 
-different displays depending on visibility mode and the number of bases on the screen.</p>
+from large-scale views to individual bases and amino acids. Four conservation
+tracks compare sequences with 44 bat coronaviruses, 119 vertebrate 
+coronaviruses, 7 human coronaviruses, and PhyloCSF computed conservation scores. 
+The tracks have different displays depending on visibility mode and the number of 
+bases on the screen.</p>
 
 <p>
 Datasets can be turned on by setting the dropdown 
 next to the data track name from &quot;hide&quot; to dense, squish, pack, or full. Then click 
 the <button>refresh</button> button to see these changes in effect.
 Clicking on a data track name will take you to a description with more information on the 
 dataset, display conventions, methods, and references. Clicking on a particular item 
 will take you to a page with complete information about that item and dataset.</p>
   </div>
 
 
   <div class="col-md-7">
 <p class="text-center">
   <a href="http://genome.ucsc.edu/s/dschmelt/covidBrowserIntroGenes"><img class="text-center" 
     src="../../images/covidBrowserIntroGenes.png" 
@@ -116,31 +120,31 @@
 <h2>Exploring Variation and Immunology Data</h2>
 <p>
 The SARS-CoV-2 Genome Browser displays data on variation within SARS-CoV-2
 from UniProt, GenBank, GISAID, Nextstrain, and other providers. These data cover global trends
 in SARS-CoV-2 variation among all available public sequences, with regional descriptions 
 available through clicking into a particular entry. A few of the most notable tracks under the
 &quot;Variation and Repeats&quot; section are the 
 <a href="../../cgi-bin/hgTrackUi?db=wuhCor1&g=sarsCov2PhyloPub">Phylogeny: Public track</a>, which shows a 
 continuously updating phylogenetic tree that clusters similar sequences, with the frequency of each
 mutation shown by the height of the bar at that particular base. Tools are provided to filter these 
 data to show only well-supported mutation calls, set thresholds for minor-allele frequency, and display 
 data for specific clades.</p>
 <p>
 Another track is the 
 spike protein mutations from community annotations, highlighted as amino acid changes. 
-The Genome Browser also has the 
+The Genome Browser also has the 
 <a href="../../cgi-bin/hgTrackUi?db=wuhCor1&g=variantMuts">Variants of Concern track</a>, which 
 pinpoints each accumulated mutation that defines 4 strains of SARS-CoV-2 
 of particular concern, labeled based on lay terms (such as 'California variant') as well as 
 using the lineage defined by the 
 <a href="https://github.com/cov-lineages/pangolin/" target="_blank">Pangolin software</a> 
 (such as 'B.1.1.7').</p>
 
 <p>
 The Genome Browser also provides 12 immunology datasets that can inform potential therapeutic 
 targets or public health risks. Protein epitopes are highlighted in the genome by multiple tracks, 
 including those from the <a href="https://www.iedb.org/" target="_blank">
 Immune Epitope Database (IEDB)</a> and from 
 <a href="../../cgi-bin/hgTrackUi?db=wuhCor1&g=targets">a study of COVID+ patients</a>.
 Of particular interest are the datasets describing surveys of antibody response across 
 a variety of SARS-CoV-2 variants in the receptor-binding domain 
@@ -148,57 +152,97 @@
 
 <p class="text-center">
   <a href="http://genome.ucsc.edu/s/dschmelt/covidBrowserIntroVars"><img class="text-center" 
     src="../../images/covidBrowserIntroVars.png" 
     alt="Some of the variation and immunology data on the Genome Browser" width="800" height="427"></a>
   <p class="gbsCaption text-center">This image is an example of some of the variation data tracks
   that can be displayed on the SARS-CoV-2 genome, zoomed into the receptor-binding domain (RBD) of 
   the Spike protein. Validated epitopes that may be targets for therapeutic antibodies are 
   displayed in black. Antibody escape scores are shown in red and black for each 
   genome position. Smaller tick marks show amino acid or nucleotide changes from different sources,
   with more information available by clicking into the item.</p>
 
 <a name="usher"></a>
 <h2>Genetic Contact Tracing with UShER</h2>
 <p>
-The UCSC Genome Browser also has developed a tool that allows placement of SARS-CoV-2
+The UCSC Genome Browser has developed a tool that allows placement of SARS-CoV-2
 sequences onto existing phylogenetic trees far faster than previous methods, allowing 
 instantaneous tracing of strains and transmission events. This tool is called 
 <a href="../../cgi-bin/hgPhyloPlace">Ultrafast Sample placement on Existing tRees (UShER)</a> and
 exists as an interactive web-tool to compare sequences and link to existing public phylogenetic 
 trees.
 <p class="text-center">
   <a href="../../cgi-bin/hgPhyloPlace"><img class="text-center" 
     src="../../images/covidBrowserIntroUShER.png" 
     alt="Example of the UShER phylogeny placement tool" width="1000" 
     height="275"></a>
   <p class="gbsCaption text-center">After uploading a Fasta file, the tool returns a page with quality 
 metrics such as: number of bases aligned, number of Ns, and number of maximally parsimonious 
 placements along with the lineage and clade of the nearest neighbor. Colored boxes highlight 
 possible quality issues, green meaning this was a high confidence placement.</p>
 <p>
 Next, you can view your aligned SARS-CoV-2 sequence genotypes along with their closest known 
 relatives among the 150,000+ public sequences. You can compare your uploaded 
 samples or trace possible transmission vectors using mutational signatures.</p>
 <p class="text-center">
-  <a href="./../s/dschmelt/covidBrowserIntroUShER2"><img class="text-center" 
+  <a href="http://genome.ucsc.edu/s/dschmelt/covidBrowserIntroUShER2"><img class="text-center" 
     src="../../images/covidBrowserIntroUShER2.png" 
     alt="Example of the UShER phylogeny placement tool tree features" width="800" 
     height="427"></a>
   <p class="gbsCaption text-center">The uploaded sequences are highlighted in blue alongside 
 their most closely aligned public sequences. You can investigate genotypes and relationships 
 between samples.</p>
 
-<h2>Custom Tracks, Downloads, API, and SQL features</h2>
+<a name="data"></a>
+<h2>Other tools and features</h2>
+<h3>Custom Tracks, BLAT, Track Hubs</h3>
 <p>
 Along with a suite of data tracks, filters, and visualization options for the SARS-CoV-2
 genome, the UCSC Genome Browser offers many additional ways to interface with our data. 
-You can upload your data on the reference genome in <a href="">nearly any format</a> with our
-<a href="">Custom Track tool</a>. If you have unaligned sequence, you can use our <a href="">BLAT
-sequence alignment tool</a> to get coordinates and basewise comparison with any reference genome.
- We have a <a href="">JSON API</a> which return.
-
+You can upload your data on the reference genome in 
+<a href="../../FAQ/FAQformat.html">nearly any format</a> with our
+<a href="../../cgi-bin/hgCustom">Custom Track tool</a>. If you have an unaligned sequence,
+you can use our <a href="../../cgi-bin/hgBlat">BLAT
+sequence alignment tool</a> to get coordinates and base-by-base comparison 
+with any reference genome. We also display formatted data
+as <a href="hgTrackHubHelp.html">Track Hubs</a> and curate a list of 
+user-submitted <a href="../../cgi-bin/hgHubConnect?#publicHubs">Public Track Hubs</a>.
+</p>
+<h3>Downloads, Table Browser, JSON API, SQL</h3>
+<p>
+As part of our open-source, open-access philosophy, we try to make it as easy as possible 
+for researchers to download entire datasets or filtered subsets. Each track description page
+has a Data Access section which points users to our main options for data download.
+For downloading complete datasets, our 
+<a href="http://hgdownload.soe.ucsc.edu/downloads.html#SARS-CoV-2">SARS-CoV-2 download 
+directory</a> provides access to all our source files for transparency and reproducibility. 
+Our <a href="../../cgi-bin/hgTables">Table Browser</a> tool lets users interact with our
+data using a variety of filters based on score, identifiers, or any other field. Table Browser
+also allows users to convert data into multiple formats (e.g. BED, GTF) and to output
+sequence in FASTA format with various formatting options.</p>
 
+<p>
+We have a <a href="api.html">JSON API</a> which can be called programmatically to return
+any dataset in its entirety or as a filtered subset based on documented input parameters. We also
+offer a <a href="mysql.html">Public SQL server</a> as a similarly flexible, automated way to access
+genomic data and annotations.
+</p>
 
-<h2>Support Docs, Contact Us</h2>
+<a name="support"></a>
+<h2>Support and Collaboration</h2>
+<p>The Genome Browser offers rapid email support for anything related to our tools. If your 
+question is general or may have been asked before, please review our 
+<a href="hgTracksHelp.html">Browser documentation</a> and our archive 
+of <a href="https://groups.google.com/u/1/a/soe.ucsc.edu/g/genome">
+previously answered questions</a>. If you would still like help, please go to
+our <a href="../../contacts.html">Contact Us</a> page to access our email support. When contacting us, 
+please include a session link, images, and example data if applicable. We are active on 
+social media; you can follow us on <a href="https://twitter.com/GenomeBrowser">Twitter</a>
+or <a href="https://www.facebook.com/ucscGenomeBrowser">Facebook</a>.
+</p>
 
+<p>
+We are always looking to collaborate with researchers and add new datasets to our site. 
+We also seek to continuously improve our tools to meet the needs of the scientific community. 
+If you have any collaboration ideas, contributions, or feature requests, please reach out through
+our <a href="../../cgi-bin/hgUserSuggestion">suggestion page</a>.</p>
 <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->