src/hg/makeDb/trackDb/wgEncodeGencodeDisplay1.shared.html 194efd4a95a8ac1704d5283bfd70fb616002bed8

194efd4a95a8ac1704d5283bfd70fb616002bed8
dschmelt
  Wed Jul 10 17:13:21 2019 -0700
Making html edits for gencode VM21 #23792

diff --git src/hg/makeDb/trackDb/wgEncodeGencodeDisplay1.shared.html src/hg/makeDb/trackDb/wgEncodeGencodeDisplay1.shared.html
index 7acb3ff..cc5b884 100644
--- src/hg/makeDb/trackDb/wgEncodeGencodeDisplay1.shared.html
+++ src/hg/makeDb/trackDb/wgEncodeGencodeDisplay1.shared.html
@@ -11,36 +11,38 @@
     <dd> The gene annotations in this view are divided into three subtracks:</dd>
 </dl>
 <ul>
   <li><em>GENCODE Basic set</em> is a subset of the <em>Comprehensive set</em>. 
     The selection criteria are described in the <a href="#basicSetSelection">methods section</a>.</li>
   <li><em>GENCODE Comprehensive set</em> contains all GENCODE coding and non-coding transcript annotations,
     including polymorphic pseudogenes.  This includes both manual and
     automatic annotations.  This is a super-set of the <em>Basic set</em>.</li>
   <li><em>GENCODE Pseudogenes</em> include all annotations except polymorphic pseudogenes.</li>
 </ul>
     
 <dl>
     <dt><i>2-way</i></dt> 
 </dl>
 <ul>
-    <li><em>GENCODE 2-way Pseudogenes</em> contains pseudogenes predicted by both the Yale
-        Pseudopipe and UCSC Retrofinder pipelines. 
-        The set was derived by looking for 50 base pairs
+    <li><em>GENCODE 2-way Pseudogenes</em> contains pseudogenes predicted by both the 
+        <a href="https://academic.oup.com/bioinformatics/article-abstract/22/12/1437/207326">Yale
+        PseudoPipe</a> and
+        <a href="https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-9-466">
+        UCSC RetroFinder</a> pipelines. The set was derived by looking for 50 base pairs
         of overlap between pseudogenes derived from both sets based on their 
-        chromosomal coordinates.  When multiple Pseudopipe
-        predictions map to a single Retrofinder prediction, only one match is kept
+        chromosomal coordinates.  When multiple PseudoPipe
+        predictions map to a single RetroFinder prediction, only one match is kept
         for the 2-way consensus set.
     </li>
 </ul>
 
 <dl>
     <dt><i>PolyA</i></dt>
 </dl>
 <ul>
 <li><em>GENCODE PolyA</em> contains polyA signals and sites manually annotated on
     the genome based on transcribed evidence (ESTs and cDNAs) of 3' end of
     transcripts containing at least 3 A's not matching the genome.</li>
 </ul>
 
 
 <p><b>Filtering</b> is available for the items in the GENCODE Basic, Comprehensive and Pseudogene tracks
@@ -59,31 +61,31 @@
 
   <li> Transcript Annotation Method: filter by the method used to create the annotation
    <ul>
      <li> All - don't filter by transcript class</li>
      <li> manual - display manually created annotations, including those that are 
        also created automatically</li>
      <li> automatic - display automatically created annotations, including those that are 
        also created manually</li>
      <li> manual_only - display manually created annotations that were
        not annotated by the automatic method</li>
      <li> automatic_only - display automatically created annotations that were
        not annotated by the manual method</li>
    </ul>
    </li>
   <li> Transcript Biotype: filter transcripts by
-       <a href="http://www.gencodegenes.org/gencode_biotypes.html" target="_blank">biotype</a></li>
+       <a href="https://www.gencodegenes.org/pages/biotypes.html" target="_blank">Biotype</a></li>
   <li> Support Level: filter transcripts by <a href="#tsl">transcription support level</a></li>
 </ul>
 
 <p><b>Coloring</b> for the gene annotations is based on the annotation type: </p>
 <ul>
   <li><font color="#0c0c78"><b>coding</b></font> 
   <li><font color="#006400"><b>non-coding</b></font> 
   <li><font color="#ff33ff"><b>pseudogene</b></font> 
   <li><font color="#fe0000"><b>problem</b></font>
   <li><font color="#ff33ff"><b>all 2-way pseudogenes</b></font>
   <li><font color="#000000"><b>all polyA annotations</b></font>
 </ul>
 
 <h2>Methods</h2>
 
@@ -100,53 +102,53 @@
 
 <p>
 <b><a name="basicSetSelection">GENCODE <em>Basic Set</em> selection:</a></b>
 The GENCODE <em>Basic Set</em> is intended to provide a simplified subset of
 the GENCODE transcript annotations that will be useful to the majority of
 users. The goal was to have a high-quality basic set that also covered all loci.  
 Selection of GENCODE annotations for inclusion in the <em>basic set</em>
 was determined independently for the coding and non-coding transcripts at each
 gene locus.
 </p>
 <ul>
   <li> Criteria for selection of coding transcripts (including polymorphic pseudogenes) at a given
        locus:
     <ul>
       <li> All full-length coding transcripts (except problem transcripts or transcripts that are
-           nonsense-mediated decay) was included in the basic set.</li>
+           nonsense-mediated decay) were included in the basic set.</li>
       <li> If there were no transcripts meeting the above criteria, then the partial coding
            transcript with the largest CDS was included in the basic set (excluding problem transcripts).</li>
     </ul>
   </li>
   <li> Criteria for selection of non-coding transcripts at a given locus:
     <ul>
       <li> All full-length non-coding transcripts (except problem transcripts)
-           with a well characterized biotype (see below) were included in the
+           with a well characterized Biotype (see below) were included in the
            basic set.</li>
       <li> If there were no transcripts meeting the above criteria, then the largest non-coding
            transcript was included in the basic set (excluding problem transcripts).</li>
     </ul>
   </li>
-  <li> If no transcripts were included by either the above criteria, the longest
+  <li> If no transcripts were included by either of the above criteria, the longest
     problem transcript is included.
   </li>
 </ul>
 
 <P>
 <b>Non-coding transcript categorization:</b> 
 Non-coding transcripts are categorized using
-their <a href="http://www.gencodegenes.org/gencode_biotypes.html" target="_blank">biotype</a>
+their <a href="http://www.gencodegenes.org/gencode_biotypes.html" target="_blank">Biotype</a>
 and the following criteria:
 </p>
 <ul>
   <li> well characterized: <em>antisense, Mt_rRNA, Mt_tRNA, miRNA, rRNA, snRNA, snoRNA</em></li>
   <li> poorly characterized: <em>3prime_overlapping_ncrna, lincRNA, misc_RNA, non_coding, processed_transcript, sense_intronic, sense_overlapping</em></li>
 </ul>
 
 <p>
 <b><a name="tsl">Transcription Support Level (TSL):</a></b>
 It is important that users understand how to assess transcript annotations
 that they see in GENCODE. While some transcript models have a high level of
 support through the full length of their exon structure, there are also
 transcripts that are poorly supported and that should be considered
 speculative. The Transcription Support Level (TSL) is a method to highlight the
 well-supported and poorly-supported transcript models for users. The method