6d7150d4b11cf16b7b58d4c1eae3dbf6d240dffd
max
  Wed Jan 31 09:25:51 2024 -0800
fixing track refs, email from M Hiller

diff --git src/hg/makeDb/trackDb/TOGAannotation.html src/hg/makeDb/trackDb/TOGAannotation.html
index 6237b21..0b73ea6 100644
--- src/hg/makeDb/trackDb/TOGAannotation.html
+++ src/hg/makeDb/trackDb/TOGAannotation.html
@@ -1,82 +1,85 @@
 <h2>Description</h2>
 <p>
 <b>TOGA</b>
 (<b>T</b>ool to infer <b>O</b>rthologs from <b>G</b>enome <b>A</b>lignments)
 is a homology-based method that integrates gene annotation, inferring
 orthologs and classifying genes as intact or lost.
 </p>
 
 <h2>Methods</h2>
 <p>
 As input, <b>TOGA</b> uses a gene annotation of a reference species
 (human/hg38 for mammals, chicken/galGal6 for birds) and
 a whole genome alignment between the reference and query genome.
 </p>
 <p>
 <b>TOGA</b> implements a novel paradigm that relies on alignments of intronic
 and intergenic regions and uses machine learning to accurately distinguish
 orthologs from paralogs or processed pseudogenes.
 </p>
 <p>
 To annotate genes,
 <a href="https://academic.oup.com/bioinformatics/article/33/24/3985/4095639"
 target="blank">CESAR 2.0</a>
 is used to determine the positions and boundaries of coding exons of a
 reference transcript in the orthologous genomic locus in the query species.
 </p>
 
 <h2>Display Conventions and Configuration</h2>
 <p>
 Each annotated transcript is shown in a color-coded classification as
 <ul>
 <li><span style='display:inline-block; width:40px; height:15px; background-color:blue;'>&nbsp;</span>
     <span style='color:blue'>"intact"</span>: middle 80% of the CDS
     (coding sequence) is present and exhibits no gene-inactivating mutation.
     These transcripts likely encode functional proteins.</li>
 <li><span style='display:inline-block; width:40px; height:15px; background-color:lightblue;'>&nbsp;</span>
     <span style='color:#7193a0'>"partially intact"</span>: 50% of the CDS
      is present in the query and the middle 80% of the CDS exhibits no
      inactivating mutation. These transcripts may also encode functional
      proteins, but the evidence is weaker as parts of the CDS are missing,
      often due to assembly gaps.</li>
 <li><span style='display:inline-block; width:40px; height:15px; background-color:grey;'>&nbsp;</span>
     <span style='color:grey'>"missing"</span>: &lt;50% of the CDS is present
      in the query and the middle 80% of the CDS exhibits no inactivating
      mutation.</li>
 <li><span style='display:inline-block; width:40px; height:15px; background-color:orange;'>&nbsp;</span>
     <span style='color:orange'>"uncertain loss"</span>: there is 1
      inactivating mutation in the middle 80% of the CDS, but evidence is not
      strong enough to classify the transcript as lost. These transcripts may
      or may not encode a functional protein.</li>
 <li><span style='display:inline-block; width:40px; height:15px; background-color:red;'>&nbsp;</span>
     <span style='color:red'>"lost"</span>: typically several inactivating
      mutations are present, thus there is strong evidence that the transcript
      is unlikely to encode a functional protein.</li>
 </ul>
 </p>
 <p>
 Clicking on a transcript provides additional information about the orthology
 classification, inactivating mutations, the protein sequence and protein/exon
 alignments.
 </p>
 
 <h2>Credits</h2>
 <p>
 This data was prepared by the <a href="https://tbg.senckenberg.de/hillerlab/"
 target="_blank">Michael Hiller Lab</a>
 </p>
 
 <h2>References</h2>
 <p>
 The <b>TOGA</b> software is available from
 <a href="https://github.com/hillerlab/TOGA"
 target="_blank">github.com/hillerlab/TOGA</a>
 </p>
 
 <p>
-Kirilenko BM, Munegowda C, Osipova E, Jebb D, Sharma V, Blumer M, Morales A,
-Ahmed AW, Kontopoulos DG, Hilgers L, Zoonomia Consortium, Hiller M.
-<a href="https://www.biorxiv.org/content/10.1101/2022.09.08.507143v1"
-target="_blank">TOGA integrates gene annotation with orthology inference
-at scale</a>. <em>bioRxiv preprint September 2022</em>
+Kirilenko BM, Munegowda C, Osipova E, Jebb D, Sharma V, Blumer M, Morales AE, Ahmed AW, Kontopoulos
+DG, Hilgers L <em>et al</em>.
+<a href="https://www.science.org/doi/abs/10.1126/science.abn3107?url_ver=Z39.88-2003&amp;rfr_id=ori:rid:crossref.org&amp;rfr_dat=cr_pub%20%200pubmed"
+target="_blank">
+Integrating gene annotation with orthology inference at scale</a>.
+<em>Science</em>. 2023 Apr 28;380(6643):eabn3107.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/37104600" target="_blank">37104600</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10193443/" target="_blank">PMC10193443</a>
 </p>