src/hg/htdocs/FAQ/FAQgenes.html ed1738b3b8296a6b7e7376336f054eae0b2c88c6

ed1738b3b8296a6b7e7376336f054eae0b2c88c6
max
  Wed Jul 27 03:09:02 2022 -0700
adding clearer section header to genes faq, refs #29778

diff --git src/hg/htdocs/FAQ/FAQgenes.html src/hg/htdocs/FAQ/FAQgenes.html
index 52d3c97..b0d238a 100755
--- src/hg/htdocs/FAQ/FAQgenes.html
+++ src/hg/htdocs/FAQ/FAQgenes.html
@@ -323,47 +323,47 @@
 <a name="ncbiRefseq"></a>
 <h6>What is the difference between "NCBI RefSeq" and "UCSC RefSeq"?</h6>
 <p>
 RefSeq gene transcripts, unlike GENCODE/Ensembl/UCSC Genes, are sequences that can differ from 
 the genome. They need to be aligned to the genome to create annotations and UCSC
 and NCBI create alignments with different software (BLAT and splign, respectively).
 The advantage of the UCSC alignments is that
 they are updated constantly even for older assemblies, such as GRCh37/hg19.
 The advantage of the NCBI alignments is that they are placed manually 
 at a chromosome location and are the official alignments, e.g. for databases and manuscripts.
 Therefore, we recommend working with the NCBI annotations and when an assembly has an &quot;NCBI RefSeq&quot; track, we show it by default and hide the
 &quot;UCSC RefSeq&quot; track. The only exception may be hg19 (see the note at the end of this section).
 </p>
 <p>The UCSC alignments can differ from the NCBI alignments for two reasons:</p>
 
-<p><b>Very similar transcripts:</b> 
+<p><b>Very similar transcripts resulting in transcript location swaps or duplicated transcripts:</b> 
 Let's take the case of two almost-identical transcript sequences in RefSeq,
 with two genes in the genome where they could be placed.
 NCBI has a rule to place every transcript only once, and transcripts
 are manually tied to a chromosome band or location by NCBI, so each gene will get one
 and only one transcript of the two. NCBI RefSeq will have two genes with one transcript each.
 UCSC RefSeq though places all
 transcripts where they align at very high identity, so both genes will get
-annotated with both transcripts. For example, 
+annotated with both transcripts, creating duplicates. For example, 
 the transcript NM_001012276 has three almost-identical possible
-placements to the genome in the UCSC RefSeq track, as it is entirely alignment-based without any manual filtering,
-but NM_001012276.3 is shown at a single location in the NCBI RefSeq track, as the NCBI
-software will only retain the alignment at the manually annotated location. It
-may be good to know about almost-identical alignments when doing genomic
-analysis or manual inspection of NGS read alignments, but for clinical
-reporting purposes or other automated analyses, we strongly recommend to use
-the NCBI RefSeq track.
+placements to the genome in the UCSC RefSeq track, as it is entirely alignment-based without any manual filtering.
+The same transcript NM_001012276.3 is shown at a single location in the NCBI
+RefSeq track, as the NCBI software will only retain the alignment at the
+manually annotated location. It may be good to know about almost-identical
+alignments when doing genomic analysis or manual inspection of NGS read
+alignments, but for clinical reporting purposes or other automated analyses, we
+strongly recommend using the NCBI RefSeq track.
 </p>
 
 <p>
 <b>Unclear exon boundaries:</b> In some rare cases, the NCBI and UCSC exon boundaries differ.
 This happens especially when sequence deletions in the genome make the placement very difficult.
 Activating both RefSeq and UCSC RefSeq tracks helps you investigate the differences.
 Activating the RefSeq Alignments track shows NCBI's splign alignments in more detail,
 including double lines where both transcript and genomic sequence are skipped in the alignment.
 When available, the RefSeq Diffs subtrack may be helpful too. The upcoming <a target=_blank 
 href=https://ncbiinsights.ncbi.nlm.nih.gov/2018/10/11/matched-annotation-by-ncbi-and-embl-ebi-mane-a-new-joint-venture-to-define-a-set-of-representative-transcripts-for-human-protein-coding-genes/>MANE gene set</a> 
 will contain a set of high-quality transcripts that are 100%
 alignable to the genome and are part of both RefSeq and Ensembl/GENCODE, but
 at the time of writing, this project is at an early stage.
 </p>