9a1f501b29a065739622c6c74e62e7d37a9dc4ba
max
  Thu Jul 9 06:15:32 2020 -0700
extending genes faq with mito description, refs #25845

diff --git src/hg/htdocs/FAQ/FAQgenes.html src/hg/htdocs/FAQ/FAQgenes.html
index 0b11753..01c7961 100755
--- src/hg/htdocs/FAQ/FAQgenes.html
+++ src/hg/htdocs/FAQ/FAQgenes.html
@@ -11,30 +11,31 @@
 
 <ul>
 <li><a href="#gene">What is a gene?</a></li>
 <li><a href="#genestrans">What is a transcript and how is it related to a gene?</a></li>
 <li><a href="#genename">What is a gene name?</a></li>
 <li><a href="#mostCommon">What are the most common gene transcript tracks?</a></li>
 <li><a href="#wrong">I think this transcript looks strange, what shall I do?</a></li>
 <li><a href="#ens">What are Ensembl and GENCODE and is there a difference?</a></li>
 <li><a href="#ensRefseq">What are the differences among GENCODE, Ensembl and RefSeq?</a></li>
 <li><a href="#hg19">For the human assembly hg19/GRCh37: What is the difference between "UCSC 
                     Genes" track, the "GENCODE" track and the "Ensembl Genes" track?</a></li>
 <li><a href="#hg38">For the human assembly hg38/GRCh38: What are the differences between the 
 		    "GENCODE" and "All GENCODE" tracks?</a></li>
 <li><a href="#gencode">What is the difference between GENCODE comprehensive and basic?</a></li>
 <li><a href="#ncbiRefseq">What is the difference between "NCBI RefSeq" and "UCSC RefSeq"?</a></li>
+<li><a href="#mito">What is the best gene track for mitochondrial gene annotations?</a></li>
 <li><a href="#report">How shall I report a gene transcript in a manuscript?</a></li>
 <li><a href="#ccds">What is CCDS?</a></li>
 <li><a href="#justsingle">How can I show a single transcript per gene?</a></li>
 <li><a href="#singledownload">How can I download a file with a single transcript per gene?</a></li>
 <li><a href="#whatdo">This is rather complicated. Can you tell me which gene transcript track
                       I should use?</a></li>
 <li><a href="#gtfDownload">Does UCSC provide GTF/GFF files for gene models?</a></li>
 </ul>
 <hr>
 <p>
 <a href="index.html">Return to FAQ Table of Contents</a></p>
 
 <a name="gene"></a>
 <h2>The basics</h2>
 
@@ -333,30 +334,51 @@
 
 <p>
 An anecdotal and rare example is SHANK2 and SHANK3 in hg19. It is impossible
 for either NCBI or BLAT to get the correct alignment and gene model because the genome sequence is
 missing for part of the gene.  NCBI and BLAT find slightly different exon
 boundaries at the edge of the problematic region. NCBI's aligner tries very hard
 to find exons that align to any transcript sequence,
 so it calls a few small dubious &quot;exons&quot; in the affected genomic region.
 GENCODE V19 also used an aligner that tried very hard to find exons, but it
 found small dubious &quot;exons&quot; in different places than NCBI.
 The <a target=_blank href="../cgi-bin/hgTrackUi?db=hg38&g=refSeqComposite">RefSeq Alignments</a> 
 subtrack makes the problematic region very clear with double lines
 indicating unalignable transcript sequence.
 </p>
 
+<a name="mito"></a>
+<h2>What is the best gene track for mitochondrial gene annotations?</h2>
+<p>
+The mitochondrial sequence included in assembly sequence files is a
+special genome, and most of what has been explained on this page does not apply
+to mitochondrial gene annotations. For most assemblies in the Genome
+Browser, the sequence name of the mitochondrial genome is "chrM".</p>
+<p>
+Sidenote: in hg19, the UCSC Genome Browser originally included a chrM sequence
+that was not the mitochondrial genome sequence NCBI later selected for GRCh37. This
+is why the current hg19 version of the Genome Browser contains two mitochondrial sequences:
+the old one, called "chrM", and the one that is part of the GRCh37 reference, called "chrMT". The issue
+is described in detail in our <a target=_blank href="https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/README.txt">
+hg19 Download instructions</a>. If you use hg19 today, chrMT should be
+considered the mitochondrial genome; chrM is only supported for backwards
+compatibility and legacy annotation files.</p>
+<p>
+For chrM (or chrMT on hg19), the NCBI RefSeq and Ensembl/GENCODE tracks contain the same gene annotations.
+Both databases import their mitochondrial gene annotation directly from the rCRS RefSeq record <a target=_blank href="https://www.ncbi.nlm.nih.gov/nuccore/251831106">NC_012920.1</a>. The annotation was provided by <a target=_blank href="https://www.mitomap.org/MITOMAP">Mitomap.org</a>, which provides detailed documentation about <a href="https://www.mitomap.org/foswiki/bin/view/MITOMAP/MitoSeqs" target=_blank>the history of this sequence</a>.</p>
+
+
 <a name="report"></a>
 <h2>How shall I report a gene transcript in a manuscript?</h2>
 
 <p>
 When reporting on GENCODE/Ensembl transcripts, please specify the ENST
 identifier. It is often helpful to also specify the Ensembl release, 
 which is shown on the details page, when you click onto a transcript.
 </p>
 
 <p>
 When reporting RefSeq transcripts, e.g. in HGVS, prefer the "NCBI RefSeq" track
 over the "UCSC RefSeq track".  Please specify the RefSeq transcript ID and
 also the RefSeq annotation release.
 </p>