4bd316f5f1ca47328bd3f9a181214b788055f0bc lrnassar Tue Apr 21 13:29:26 2026 -0700

NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737

Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".

Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks.

Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp.

Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.

QA cleanups: non-ASCII prime char replaced with ′, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback.

Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.

diff --git src/hg/makeDb/trackDb/human/hg38/nmdDetective.html src/hg/makeDb/trackDb/human/hg38/nmdDetective.html
index 571b8133cca..10bcc67b28a 100644
--- src/hg/makeDb/trackDb/human/hg38/nmdDetective.html
+++ src/hg/makeDb/trackDb/human/hg38/nmdDetective.html
@@ -1,124 +1,124 @@
The NMDetective tracks display genome-wide predictions of nonsense-mediated mRNA decay (NMD) efficiency from Lindeboom et al. 2016. NMDetective scores predict whether a premature termination codon (PTC) at a given position will trigger NMD and mRNA degradation, or whether the transcript will escape NMD and potentially produce a truncated protein.
Scores range from approximately −1 to +1. Positive values indicate that a PTC at that position is predicted to trigger NMD (the mRNA is degraded). Negative values indicate that the PTC is predicted to escape NMD (the truncated mRNA may be translated into an aberrant protein). Values near zero indicate intermediate or uncertain NMD efficiency.
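The sign convention above can be sketched in a few lines of Python. This is only an illustration: the function name and the `neutral_band` cutoff for "near zero" are assumptions made for the example, since the track description does not define a numeric threshold.

```python
def interpret_score(score: float, neutral_band: float = 0.1) -> str:
    """Map an NMDetective score to a coarse label.

    neutral_band is a hypothetical cutoff for "near zero"; it is not
    taken from the NMDetective publications or the track description.
    """
    if score > neutral_band:
        # Positive score: the PTC is predicted to trigger NMD.
        return "NMD-triggering"
    if score < -neutral_band:
        # Negative score: the PTC is predicted to escape NMD.
        return "NMD-escaping"
    # Scores close to zero: intermediate or uncertain NMD efficiency.
    return "intermediate"
```

For example, `interpret_score(0.6)` returns `"NMD-triggering"` and `interpret_score(-0.6)` returns `"NMD-escaping"`. A different `neutral_band` may suit a given analysis better.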
| Track | Description |
|---|---|
| NMDetective-A | Random forest model predicting NMD efficiency for all possible PTCs introduced by single-nucleotide variants. Explains ~71% of systematic variance in NMD efficiency. |
| NMDetective-B | Simplified decision tree model for all possible PTCs. Slightly lower accuracy (~68% variance explained) but more interpretable, making it suitable for clinical applications. |
| NMDetective-A PTC | Random forest model predicting NMD efficiency specifically for the first out-of-frame PTC introduced by frameshifting indel mutations. |
| NMDetective-B PTC | Decision tree model for the first out-of-frame PTC from frameshifting indels. |
Each subtrack is displayed as a signal (bigWig) track. By default, the vertical axis ranges from −1 to +1. Regions with positive values (predicted NMD-triggering) are shown above the baseline; regions with negative values (predicted NMD escape) are shown below.
The NMDetective models were trained on somatic nonsense mutation data from 9,769 cancer patients and validated with frameshift mutations and germline variants (Lindeboom et al. 2019). The models incorporate the following features to predict NMD efficiency:
NMDetective-A (random forest regression) captures non-linear interactions among these features and achieves the highest predictive accuracy. NMDetective-B (decision tree) applies a simpler rule-based classification that is more transparent, with a modest reduction in accuracy.
The predictions were generated for every possible PTC-introducing single-nucleotide variant and for the first out-of-frame PTC from every possible single-nucleotide frameshifting indel across all human protein-coding transcripts. The original bedGraph custom track files were downloaded from the NMDetective Figshare resource and converted to bigWig format at UCSC.
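The exact conversion steps used at UCSC are not given here. As a rough sketch, a bedGraph custom track usually needs its `track`/`browser` header lines removed and its records sorted by chromosome and start position before it can be passed to the UCSC `bedGraphToBigWig` utility. The helper names and the sample records below are made up for illustration.

```python
def prepare_bedgraph(lines):
    """Drop header lines and sort records by chromosome, then start."""
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and custom-track header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        records.append((chrom, int(start), int(end), float(value)))
    # Plain string order on chrom, numeric order on start.
    records.sort(key=lambda r: (r[0], r[1]))
    return records


def format_bedgraph(records):
    """Render records back into tab-separated bedGraph lines."""
    return [f"{c}\t{s}\t{e}\t{v:g}" for c, s, e, v in records]


# Illustrative input only; these values are not real NMDetective scores.
example = [
    "track type=bedGraph name=example",
    "chr2\t10\t11\t-0.5",
    "chr1\t20\t21\t0.7",
    "chr1\t5\t6\t0.1",
]
cleaned = format_bedgraph(prepare_bedgraph(example))
```

The cleaned lines could then be written to a file and converted with a command of the form `bedGraphToBigWig in.bedGraph chrom.sizes out.bw`, where `chrom.sizes` lists the chromosome lengths for the assembly.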
The data underlying these tracks can be explored interactively with the Table Browser or the Data Integrator. For automated analysis, the data may be queried from our REST API. Please refer to our mailing list archives for questions, or our Data Access FAQ for more information.
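As a minimal sketch of a REST API query, the helper below builds a URL for the `/getData/track` endpoint of the UCSC REST API. The function name is hypothetical, and the track identifier must be supplied by the caller: the internal names of the NMDetective subtracks are not stated on this page, so the example deliberately uses an obvious placeholder.

```python
from urllib.parse import quote

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(track, chrom, start, end, genome="hg38"):
    """Build a /getData/track URL for one region of one track."""
    params = [
        ("genome", genome),
        ("track", track),
        ("chrom", chrom),
        ("start", str(start)),
        ("end", str(end)),
    ]
    # Join the parameters with semicolons as the query separator.
    query = ";".join(f"{key}={quote(value)}" for key, value in params)
    return f"{API_BASE}/getData/track?{query}"


# "exampleTrack" is a placeholder, not a real NMDetective track name.
url = track_data_url("exampleTrack", "chr1", 1000, 2000)
```

The resulting URL can be fetched with `urllib.request.urlopen` and the JSON response decoded with `json.load`. The actual track names can be looked up through the API's `/list/tracks` endpoint.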
Thanks to Rik Lindeboom for providing custom tracks and the original NMDetective data on Figshare.
Lindeboom RG, Supek F, Lehner B. The rules and impact of nonsense-mediated mRNA decay in human cancers. Nat Genet. 2016 Oct;48(10):1112-8. PMID: 27618451; PMC: PMC5045715
Lindeboom RGH, Vermeulen M, Lehner B, Supek F. The impact of nonsense-mediated mRNA decay on genetic disease, gene editing and cancer immunotherapy. Nat Genet. 2019 Nov;51(11):1645-1651. PMID: 31659324; PMC: PMC6858879