4bd316f5f1ca47328bd3f9a181214b788055f0bc lrnassar Tue Apr 21 13:29:26 2026 -0700

NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737

Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".

Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks.

Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp.

Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.

QA cleanups: non-ASCII prime char replaced with &prime;, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback.

Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.

diff --git src/hg/makeDb/trackDb/human/hg38/nmd.html src/hg/makeDb/trackDb/human/hg38/nmd.html
index 53369d39588..8cf085ee766 100644
--- src/hg/makeDb/trackDb/human/hg38/nmd.html
+++ src/hg/makeDb/trackDb/human/hg38/nmd.html
@@ -1,119 +1,120 @@
 <h2>Description</h2>
 <p>
 NMD is a cellular quality control mechanism that detects and degrades mRNAs containing premature termination codons (PTCs), preventing the accumulation of truncated, potentially harmful proteins. However, not all PTCs trigger NMD.
 PTCs in certain regions of a transcript are predicted to escape NMD, meaning the truncated mRNA may be translated into a protein with unpredictable functional consequences.
 The <b>NMD Escape</b> container includes several tracks that display putative regions where PTC variants are assumed to escape the NMD mechanism. These are typically located close to the first or last splice junction, within unusually long coding exons, or in transcripts without any junction.
 </p>
 <h2>Subtracks</h2>
 <h3>NMD escape regions</h3>
 <p>
 Rule-based predictions of NMD escape regions, computed from transcript annotations. Two transcript sets are provided:
 </p>
 <ul>
 <li><b><a href="hgTrackUi?g=nmdEscGencode">NMD escape Gencode</a></b>: NMD escape regions derived from GENCODE V49 transcripts.</li>
 <li><b><a href="hgTrackUi?g=nmdEscNcbiRefSeq">NMD escape NCBI RefSeq</a></b>:
- NMD escape regions derived from NCBI RefSeq transcripts.</li>
+ NMD escape regions derived from NCBI RefSeq Curated transcripts
+ (NM_ and NR_ accessions only).</li>
 </ul>
 <p>
 Click either of the links to the track details here or above to show the four rules that were used (50bp, intronless, 100bp, long exon >400nt).
 </p>
 <h3>NMDetective scores</h3>
 <p>
 Machine-learning predictions of NMD efficiency from <a href="https://www.ncbi.nlm.nih.gov/pubmed/27618451" target="_blank">Lindeboom et al. 2016</a>.
 Two models (A = random forest, B = decision tree) predict whether a PTC at each position will trigger NMD or allow escape. Positive scores indicate predicted NMD triggering; negative scores indicate predicted escape.
 </p>
 <ul>
 <li><b><a href="hgTrackUi?g=nmdDetectiveA">NMDetective-A</a></b>: Random forest model for all possible PTCs from nonsense variants.</li>
 <li><b><a href="hgTrackUi?g=nmdDetectiveB">NMDetective-B</a></b>: Decision tree model for all possible PTCs from nonsense variants.</li>
 <li><b><a href="hgTrackUi?g=nmdDetectiveA_ptc">NMDetective-A PTC</a></b>: Random forest model for the first out-of-frame PTC from frameshifting indels.</li>
 <li><b><a href="hgTrackUi?g=nmdDetectiveB_ptc">NMDetective-B PTC</a></b>: Decision tree model for the first out-of-frame PTC from frameshifting indels.</li>
 </ul>
 <h2>Background</h2>
 <p>
 The ACMG guidelines say under PVS1:
 </p>
 <p>
 <i>
-(ii) One must also be cautious when interpreting truncating variants downstream of the most 3′ truncating variant established as pathogenic in the literature. This is especially true if the predicted stop codon occurs in the last exon or in the last 50 base pairs of the penultimate exon, such that nonsense-mediated decay would not be predicted, and there is a higher likelihood of an expressed protein.
+(ii) One must also be cautious when interpreting truncating variants downstream of the most 3&prime; truncating variant established as pathogenic in the literature. This is especially true if the predicted stop codon occurs in the last exon or in the last 50 base pairs of the penultimate exon, such that nonsense-mediated decay would not be predicted, and there is a higher likelihood of an expressed protein.
 </i>
 </p>
 <h2>Data Access</h2>
 <p>
 The data underlying these tracks can be explored interactively with the <a href="../cgi-bin/hgTables">Table Browser</a> or the <a href="../cgi-bin/hgIntegrator">Data Integrator</a>.
 For automated analysis, the data may be queried from our <a href="/goldenPath/help/api.html">REST API</a>.
 Please refer to our
-<a href="https://groups.google.com/a/soe.ucsc.edu/forum/#!forum/genome">mailing
-list archives</a> for questions, or our
+<a href="https://groups.google.com/a/soe.ucsc.edu/forum/#!forum/genome"
+target="_blank">mailing list archives</a> for questions, or our
 <a href="../FAQ/FAQdownloads.html#download36">Data Access FAQ</a> for more information.
 </p>
 <h2>Credits</h2>
 <p>
 Thanks to Guido Neidhardt for suggesting this track at HUGO VEPTC 2025 and Andreas Lahner for feedback. Thanks to the Decipher Genome Browser team for introducing the idea of a track. Thanks to Rik Lindeboom for providing custom tracks.
 </p>
 <h2>References</h2>
 <p>
 Kurosaki T, Popp MW, Maquat LE.
 <a href="https://doi.org/10.1038/s41580-019-0126-2" target="_blank">
 Quality and quantity control of gene expression by nonsense-mediated mRNA decay</a>.
 <em>Nat Rev Mol Cell Biol</em>. 2019 Jul;20(7):406-420.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/30992545" target="_blank">30992545</a>; PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6855384/" target="_blank">PMC6855384</a>
 </p>
 <p>
 Lindeboom RGH, Supek F, Lehner B.
 <a href="https://doi.org/10.1038/ng.3664" target="_blank">
 The rules and impact of nonsense-mediated mRNA decay in human cancers</a>.
 <em>Nat Genet</em>. 2016 Oct;48(10):1112-8.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/27618451" target="_blank">27618451</a>; PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5045715/" target="_blank">PMC5045715</a>
 </p>
 <p>
 Lindeboom RGH, Vermeulen M, Lehner B, Supek F.
 <a href="https://doi.org/10.1038/s41588-019-0517-5" target="_blank">
 The impact of nonsense-mediated mRNA decay on genetic disease, gene editing and cancer immunotherapy</a>.
 <em>Nat Genet</em>. 2019 Nov;51(11):1645-1651.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/31659324" target="_blank">31659324</a>; PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6858879/" target="_blank">PMC6858879</a>
 </p>
 <p>
 Nagy E, Maquat LE.
 <a href="https://linkinghub.elsevier.com/retrieve/pii/S0968-0004(98)01208-0" target="_blank">
 A rule for termination-codon position within intron-containing genes: when nonsense affects RNA abundance</a>.
 <em>Trends Biochem Sci</em>. 1998 Jun;23(6):198-9.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/9644970" target="_blank">9644970</a>
 </p>
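
Reviewer note on the Rule 2 fix described in the commit message: the sketch below illustrates why testing the number of CDS-overlapping exons differs from testing rec["exonCount"]. It is not the genePredNmdEsc code. Only rec["exonCount"] and the name cdsExons come from the commit message; the helper functions, the dict layout, and the example coordinates are assumptions, with field names borrowed from the genePred format.

```python
# Illustrative sketch only. The record layout and helper names are hypothetical;
# only rec["exonCount"] and the cdsExons idea appear in the commit message.

def cds_exons(rec):
    """Return the portion of each exon that overlaps the CDS (half-open coords)."""
    parts = []
    for start, end in zip(rec["exonStarts"], rec["exonEnds"]):
        s, e = max(start, rec["cdsStart"]), min(end, rec["cdsEnd"])
        if s < e:
            parts.append((s, e))
    return parts


def is_intronless_old(rec):
    # Old test: a single CDS exon was treated as "intronless".
    return len(cds_exons(rec)) == 1


def is_intronless_new(rec):
    # New test: intronless means the transcript has exactly one exon.
    return rec["exonCount"] == 1


# Two exons, with the whole CDS inside the second one, so the only intron
# lies in the 5' UTR. Coordinates are made up for the example.
utr_intron = {
    "exonCount": 2,
    "exonStarts": [100, 500],
    "exonEnds": [200, 900],
    "cdsStart": 550,
    "cdsEnd": 850,
}

# A genuinely single-exon transcript.
single_exon = {
    "exonCount": 1,
    "exonStarts": [100],
    "exonEnds": [900],
    "cdsStart": 200,
    "cdsEnd": 800,
}

for name, rec in [("utr_intron", utr_intron), ("single_exon", single_exon)]:
    print(name, "old:", is_intronless_old(rec), "new:", is_intronless_new(rec))
```

Under these assumptions the old test labels the UTR-intron transcript as intronless while the new test does not, and both tests agree on the single-exon transcript.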