4bd316f5f1ca47328bd3f9a181214b788055f0bc
lrnassar
  Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737

Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".

Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts whose CDS
lies in a single exon (introns only in the UTRs) as "intronless", and the
if/else short-circuit then silently suppressed their Rule 1/3/4 assignments.
3,253 RefSeq curated and ~2,000 Gencode transcripts were reassigned from
Rule 2 to Rules 1/3. Rebuilt both tracks.
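
Illustration of the Rule 2 change (a minimal sketch, not the actual
genePredNmdEsc code). The record layout, the helper names cds_exons,
is_intronless_old and is_intronless_new, and the coordinates are all
hypothetical; only the two compared tests come from the fix above.

```python
def cds_exons(rec):
    """Clip each exon to the CDS and keep only exons with coding bases.

    Hypothetical helper: assumes rec holds genePred-style fields
    (exonStarts, exonEnds, cdsStart, cdsEnd) as Python ints/lists.
    """
    out = []
    for start, end in zip(rec["exonStarts"], rec["exonEnds"]):
        s, e = max(start, rec["cdsStart"]), min(end, rec["cdsEnd"])
        if s < e:
            out.append((s, e))
    return out


def is_intronless_old(rec):
    # Old test: counts coding exons only, so UTR introns are invisible.
    return len(cds_exons(rec)) == 1


def is_intronless_new(rec):
    # New test: counts every exon in the transcript.
    return rec["exonCount"] == 1


# Made-up transcript: three exons, CDS entirely inside the middle exon,
# so both introns fall in the UTRs.
utr_intron_tx = {
    "exonCount": 3,
    "exonStarts": [100, 300, 500],
    "exonEnds": [200, 400, 600],
    "cdsStart": 320,
    "cdsEnd": 380,
}

# Made-up transcript with a single exon (truly intronless).
single_exon_tx = {
    "exonCount": 1,
    "exonStarts": [100],
    "exonEnds": [400],
    "cdsStart": 150,
    "cdsEnd": 350,
}

for name, tx in [("utr_intron_tx", utr_intron_tx),
                 ("single_exon_tx", single_exon_tx)]:
    print(name, "old:", is_intronless_old(tx), "new:", is_intronless_new(tx))
```

In this sketch the two tests agree on the single-exon transcript and differ
only on the transcript with UTR introns, which is the case the old test
routed to Rule 2.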

Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.

Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.

QA cleanups: non-ASCII prime char replaced with &#8242;, mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.

Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.

diff --git src/hg/makeDb/trackDb/human/hg38/nmd.ra src/hg/makeDb/trackDb/human/hg38/nmd.ra
index 3c92e8c8ec8..7d89a5ede01 100644
--- src/hg/makeDb/trackDb/human/hg38/nmd.ra
+++ src/hg/makeDb/trackDb/human/hg38/nmd.ra
@@ -1,52 +1,40 @@
 track nmd
 shortLabel NMD Escape
 longLabel NMD Escape: Predicted regions where premature termination codons escape NMD
 group genes
 type bed 4
 visibility hide
 superTrack on
 pennantIcon New red ../goldenPath/newsarch.html#042226 "Released Apr. 22, 2026"
 
-        #track nmdGencode
-        #shortLabel NMD escape 50bp-100bp rule: Gencode transcripts
-        #longLabel NMD escape 50bp-100bp rule: Gencode transcripts
-        #parent nmd on
-        #bigDataUrl /gbdb/hg38/nmd/knownGeneNmdProt.bb
-        #visibility pack
-        #type bigBed
-        #dataVersion Gencode V49
-        #skipFields decoratedItem,style,fillColor,glyph,mouseover
-        #mouseOverField mouseover
-        #priority 1
-
         track nmdEscGencode
         shortLabel NMD Escape Gencode
         longLabel NMD escape 50bp/100bp/intronless/400nt ruleset: Gencode transcripts
         parent nmd on
         bigDataUrl /gbdb/hg38/nmd/nmdEscRegions.bb
         visibility dense
         type bigBed 9 +
         mouseOverField mouseover
         html nmdEscTranscripts
         # this could use a text file one day with the version
         dataVersion Gencode V49
         priority 1.5
 
         track nmdEscNcbiRefSeq
         shortLabel NMD Escape RefSeq
-        longLabel NMD escape 50bp/100bp/intronless/400nt ruleset: NCBI RefSeq transcripts
+        longLabel NMD escape 50bp/100bp/intronless/400nt ruleset: NCBI RefSeq Curated transcripts
         parent nmd on
         bigDataUrl /gbdb/hg38/nmd/nmdEscNcbiRefSeq.bb
         visibility pack
         type bigBed 9 +
         mouseOverField mouseover
         html nmdEscTranscripts
         # this could use a text file one day with the version
         dataVersion GCF_000001405.40-RS_2025_08
         priority 1.6
 
         track nmdDetectiveA
         shortLabel NMDetective-A
         longLabel NMDetective-A: Random forest prediction of NMD efficiency (Lindeboom 2016)
         parent nmd off
         bigDataUrl /gbdb/hg38/nmd/NMDetectiveA.bw