18327e879f9df894313dd3100084c94cc6dd4136 hiram Fri Dec 5 11:03:40 2025 -0800
track definitions for TOGA version 2 refs #35776

diff --git src/hg/makeDb/trackDb/TOGAv2.html src/hg/makeDb/trackDb/TOGAv2.html
new file mode 100644
index 00000000000..88e73527fc1
--- /dev/null
+++ src/hg/makeDb/trackDb/TOGAv2.html
@@ -0,0 +1,85 @@
+<h2>Description</h2>
+<p>
+<b>TOGA version 2.0</b>
+(<b>T</b>ool to infer <b>O</b>rthologs from <b>G</b>enome <b>A</b>lignments)
+is a homology-based method that integrates gene annotation, ortholog
+inference, and the classification of genes as intact or lost.
+</p>
+
+<h2>Methods</h2>
+<p>
+As input, <b>TOGA</b> uses a gene annotation of a reference species
+(human/hg38 for mammals, chicken/galGal6 for birds) and
+a whole-genome alignment between the reference and query genomes.
+</p>
+<p>
+<b>TOGA</b> implements a novel paradigm that relies on alignments of intronic
+and intergenic regions and uses machine learning to accurately distinguish
+orthologs from paralogs or processed pseudogenes.
+</p>
+<p>
+To annotate genes,
+<a href="https://academic.oup.com/bioinformatics/article/33/24/3985/4095639"
+target="_blank">CESAR 2.0</a>
+is used to determine the positions and boundaries of coding exons of a
+reference transcript in the orthologous genomic locus in the query species.
+</p>
+
+<h2>Display Conventions and Configuration</h2>
+<p>
+Each annotated transcript is color-coded by its classification:
+<ul>
+<li><span style='display:inline-block; width:40px; height:15px; background-color:blue;'> </span>
+    <span style='color:blue'>"intact"</span>: middle 80% of the CDS
+    (coding sequence) is present and exhibits no gene-inactivating mutation.
+    These transcripts likely encode functional proteins.</li>
+<li><span style='display:inline-block; width:40px; height:15px; background-color:lightblue;'> </span>
+    <span style='color:#7193a0'>"partially intact"</span>: &ge;50% of the CDS
+    is present in the query and the middle 80% of the CDS exhibits no
+    inactivating mutation. These transcripts may also encode functional
+    proteins, but the evidence is weaker because parts of the CDS are missing,
+    often due to assembly gaps.</li>
+<li><span style='display:inline-block; width:40px; height:15px; background-color:grey;'> </span>
+    <span style='color:grey'>"missing"</span>: &lt;50% of the CDS is present
+    in the query and the middle 80% of the CDS exhibits no inactivating
+    mutation.</li>
+<li><span style='display:inline-block; width:40px; height:15px; background-color:orange;'> </span>
+    <span style='color:orange'>"uncertain loss"</span>: there is 1
+    inactivating mutation in the middle 80% of the CDS, but the evidence is not
+    strong enough to classify the transcript as lost. These transcripts may
+    or may not encode a functional protein.</li>
+<li><span style='display:inline-block; width:40px; height:15px; background-color:red;'> </span>
+    <span style='color:red'>"lost"</span>: typically several inactivating
+    mutations are present, so there is strong evidence that the transcript
+    is unlikely to encode a functional protein.</li>
+</ul>
+</p>
+<p>
+Clicking on a transcript provides additional information about the orthology
+classification, inactivating mutations, the protein sequence, and protein/exon
+alignments.
+</p>
+
+<h2>Credits</h2>
+<p>
+This data was prepared by the <a href="https://tbg.senckenberg.de/hillerlab/"
+target="_blank">Michael Hiller Lab</a>.
+</p>
+
+<h2>References</h2>
+<p>
+The <b>TOGA</b> software is available from
+<a href="https://github.com/hillerlab/TOGA"
+target="_blank">github.com/hillerlab/TOGA</a>.
+</p>
+
+<p>
+Kirilenko BM, Munegowda C, Osipova E, Jebb D, Sharma V, Blumer M, Morales AE, Ahmed AW, Kontopoulos
+DG, Hilgers L <em>et al</em>.
+<a href="https://www.science.org/doi/abs/10.1126/science.abn3107?url_ver=Z39.88-2003&rfr_id=ori:
+rid:crossref.org&rfr_dat=cr_pub%20%200pubmed" target="_blank">
+Integrating gene annotation with orthology inference at scale</a>.
+<em>Science</em>. 2023 Apr 28;380(6643):eabn3107.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/37104600" target="_blank">37104600</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10193443/" target="_blank">PMC10193443</a>
+</p>