src/hg/htdocs/FAQ/FAQgenes.html 59bde4b3cbc32dfab310226c474cf5dabc1dd383

59bde4b3cbc32dfab310226c474cf5dabc1dd383
max
  Wed Dec 4 05:30:07 2019 -0800
adding help on refseq annotation release, refs #24574

diff --git src/hg/htdocs/FAQ/FAQgenes.html src/hg/htdocs/FAQ/FAQgenes.html
index 72fbb88..6fbc91b 100755
--- src/hg/htdocs/FAQ/FAQgenes.html
+++ src/hg/htdocs/FAQ/FAQgenes.html
@@ -10,30 +10,31 @@
 <h2>Topics</h2>
 
 <ul>
 <li><a href="#gene">What is a gene?</a></li>
 <li><a href="#genestrans">What is a transcript and how is it related to a gene?</a></li>
 <li><a href="#genename">What is a gene name?</a></li>
 <li><a href="#mostCommon">What are the most common gene transcript tracks?</a></li>
 <li><a href="#ens">What are Ensembl and GENCODE and is there a difference?</a></li>
 <li><a href="#ensRefseq">What are the differences among GENCODE, Ensembl and RefSeq?</a></li>
 <li><a href="#hg19">For the human assembly hg19/GRCh37: What is the difference between "UCSC 
                     Genes" track, the "GENCODE" track and the "Ensembl Genes" track?</a></li>
 <li><a href="#hg38">For the human assembly hg38/GRCh38: What are the differences between the 
 		    "GENCODE" and "All GENCODE" tracks?</a></li>
 <li><a href="#gencode">What is the difference between GENCODE comprehensive and basic?</a></li>
 <li><a href="#ncbiRefseq">What is the difference between "NCBI RefSeq" and "UCSC RefSeq"?</a></li>
+<li><a href="#report">How shall I report a gene transcript in a manuscript?</a></li>
 <li><a href="#ccds">What is CCDS?</a></li>
 <li><a href="#justsingle">How can I show a single transcript per gene?</a></li>
 <li><a href="#singledownload">How can I download a file with a single transcript per gene?</a></li>
 <li><a href="#whatdo">This is rather complicated. Can you tell me which gene transcript track
                       I should use?</a></li>
 </ul>
 <hr>
 <p>
 <a href="index.html">Return to FAQ Table of Contents</a></p>
 
 <a name="gene"></a>
 <h2>The basics</h2>
 
 The genome browser contains many gene annotation tracks. Our users 
 often wonder what these contain and where the information that we present comes
@@ -319,39 +320,65 @@
 
 <p>
 An anecdotal and rare example is SHANK2 and SHANK3 in hg19. It is impossible
 for either NCBI or BLAT to get the correct alignment and gene model because the genome sequence is
 missing for part of the gene.  NCBI and BLAT find slightly different exon
 boundaries at the edge of the problematic region. NCBI's aligner tries very hard
 to find exons that align to any transcript sequence,
 so it calls a few small dubious &quot;exons&quot; in the affected genomic region.
 GENCODE V19 also used an aligner that tried very hard to find exons, but it
 found small dubious &quot;exons&quot; in different places than NCBI.
 The <a target=_blank href="../cgi-bin/hgTrackUi?db=hg38&g=refSeqComposite">RefSeq Alignments</a> 
 subtrack makes the problematic region very clear with double lines
 indicating unalignable transcript sequence.
 </p>
 
+<a name="report"></a>
+<h6>How shall I report a gene transcript in a manuscript?</h6>
+
+<p>
+When reporting GENCODE/Ensembl transcripts, please specify the ENST
+identifier. It is often helpful to also specify the Ensembl release,
+which is shown on the details page when you click a transcript.
+</p>
+
 <p>
-When reporting results as RefSeq coordinates, e.g. as HGVS, in research
-articles, please specify the RefSeq annotation release and also the 
-RefSeq transcript ID with version (e.g. NM_012309.4 not NM_012309).
-Different RefSeq transcript versions have different sequence (for example,
-more sequence may be added to the UTRs or even the CDS), and so the transcript coordinates
-often change from one version to the next.
+When reporting RefSeq transcripts, e.g. in HGVS, prefer the "NCBI RefSeq" track
+over the "UCSC RefSeq" track.  Please specify the RefSeq transcript ID with
+its version and also the RefSeq annotation release.
 </p>
 
+<ul>
+<li>The RefSeq transcript ID is the accession of the transcript sequence, in the
+form NM_xxxxx.y. The version is separated with a dot.  Different RefSeq transcript
+versions have different sequences (for example, more sequence may be added to
+the UTRs or even the CDS), so the transcript coordinates can change from
+one version to the next. Reporting the transcript version is therefore
+helpful for readers, e.g. report NM_012309.4, not NM_012309.</li>
+<li>The RefSeq annotation release captures the mapping of all transcript
+sequences to the genome.  It is shown on our transcript details page when you
+click a transcript. It looks like "Annotation Release 105 (2017-04-01)".  The
+most important part is the "Annotation Release" number, e.g. "105". The date is
+NCBI's release date. Shown below this line is the date when UCSC imported the
+data, which is not relevant for manuscripts. Note that an "Annotation Release"
+is not a "RefSeq release": a "RefSeq release" covers only sequences, not
+their mapping to the genome. NCBI provides a list of 
+<a href="https://www.ncbi.nlm.nih.gov/genome/annotation_euk/all/"
+    target=_blank>all current annotation releases</a>. The first annotation
+    release for every genome is usually "100".</li>
+</ul>
+
 <a name="ccds"></a>
 <h6>What is CCDS?</h6>
 <p>
 The <a target=_blank href="https://www.ncbi.nlm.nih.gov/projects/CCDS/CcdsBrowse.cgi">
 Consensus Coding Sequence Project</a> is a list of transcript coding sequence (CDS) genomic regions
 that are identically annotated by RefSeq and Ensembl/GENCODE.   CCDS undergoes extensive manual
 review and you can consider these a subset of either gene track, filtered for high quality.
 The CCDS identifiers  are very stable and allow you to link easily between the different databases.
 As  the name implies, it does not cover UTR regions or non-coding transcripts.
 </p>
 
 <a name="justsingle"></a>
 <h6>How can I show a single transcript per gene?</h6>
 
 <p>