4bd316f5f1ca47328bd3f9a181214b788055f0bc lrnassar Tue Apr 21 13:29:26 2026 -0700

NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737

Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".

Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and
~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both
tracks.

Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.

Added reciprocal relatedTracks.ra entries for nmd <-> mane and
nmd <-> ncbiRefSeq.

QA cleanups: non-ASCII prime char replaced with &prime;, mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color
and the gene-symbol-to-transcript-ID fallback.

Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.

diff --git src/hg/makeDb/trackDb/human/hg38/nmd.ra src/hg/makeDb/trackDb/human/hg38/nmd.ra
index 3c92e8c8ec8..7d89a5ede01 100644
--- src/hg/makeDb/trackDb/human/hg38/nmd.ra
+++ src/hg/makeDb/trackDb/human/hg38/nmd.ra
@@ -1,52 +1,40 @@
 track nmd
 shortLabel NMD Escape
 longLabel NMD Escape: Predicted regions where premature termination codons escape NMD
 group genes
 type bed 4
 visibility hide
 superTrack on
 pennantIcon New red ../goldenPath/newsarch.html#042226 "Released Apr. 22, 2026"
 
-    #track nmdGencode
-    #shortLabel NMD escape 50bp-100bp rule: Gencode transcripts
-    #longLabel NMD escape 50bp-100bp rule: Gencode transcripts
-    #parent nmd on
-    #bigDataUrl /gbdb/hg38/nmd/knownGeneNmdProt.bb
-    #visibility pack
-    #type bigBed
-    #dataVersion Gencode V49
-    #skipFields decoratedItem,style,fillColor,glyph,mouseover
-    #mouseOverField mouseover
-    #priority 1
-
     track nmdEscGencode
     shortLabel NMD Escape Gencode
     longLabel NMD escape 50bp/100bp/intronless/400nt ruleset: Gencode transcripts
     parent nmd on
     bigDataUrl /gbdb/hg38/nmd/nmdEscRegions.bb
     visibility dense
     type bigBed 9 +
     mouseOverField mouseover
     html nmdEscTranscripts
     # this could use a text file one day with the version
     dataVersion Gencode V49
     priority 1.5
 
     track nmdEscNcbiRefSeq
     shortLabel NMD Escape RefSeq
-    longLabel NMD escape 50bp/100bp/intronless/400nt ruleset: NCBI RefSeq transcripts
+    longLabel NMD escape 50bp/100bp/intronless/400nt ruleset: NCBI RefSeq Curated transcripts
     parent nmd on
     bigDataUrl /gbdb/hg38/nmd/nmdEscNcbiRefSeq.bb
     visibility pack
     type bigBed 9 +
     mouseOverField mouseover
     html nmdEscTranscripts
     # this could use a text file one day with the version
     dataVersion GCF_000001405.40-RS_2025_08
     priority 1.6
 
     track nmdDetectiveA
     shortLabel NMDetective-A
     longLabel NMDetective-A: Random forest prediction of NMD efficiency (Lindeboom 2016)
     parent nmd off
     bigDataUrl /gbdb/hg38/nmd/NMDetectiveA.bw