4bd316f5f1ca47328bd3f9a181214b788055f0bc lrnassar Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737

Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz
(all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted
models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated
transcripts".

Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated
and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt
both tracks.

Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.

Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <->
ncbiRefSeq.

QA cleanups: non-ASCII prime char replaced with &prime;, mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.

Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.

diff --git src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html
index 8398645cc67..2a7bc848ee2 100644
--- src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html
+++ src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html
@@ -1,160 +1,164 @@
 <h2>Description</h2>
 <p>
 The <b>NMD escape ruleset</b> tracks show predicted regions where a premature
 termination codon (PTC) or frameshift variant is likely to cause the transcript
 to <em>escape</em> nonsense-mediated decay (NMD), leading to the production of
 an aberrant truncated protein rather than degradation of the mRNA.
 </p>
 <p>
 The following rules were applied to transcript annotations to define predicted
 NMD escape regions (Nagy et al, Trends Biochem Sci 1998 and Lindeboom et al,
 Nat Genet 2016):
 </p>
 <ol>
 <li><b>50 bp rule</b>: The entire last coding exon plus the last 50 bp of the
 penultimate coding exon. A PTC here has no downstream exon-exon junction (or is
 too close to the last one) for NMD to be triggered. Non-protein-coding 3' exons
 are not counted when identifying the last
- coding junction.</li>
+ coding junction. Note: when the penultimate coding exon is shorter than
+ 50 bp, the annotated region extends only to the upstream junction of
+ that exon and does not walk further upstream. A small number of
+ transcripts with unusually short penultimate coding exons are affected.</li>
 <li><b>Intronless transcripts</b>: Transcripts with a single exon. Since no
 EJCs are deposited on single-exon transcripts, all PTCs are predicted to escape
 NMD.</li>
 <li><b>Start-proximal region</b>: The first 100 bp of coding nucleotides. PTCs
 in this region do not lead to NMD, a phenomenon known as start-proximal NMD
 insensitivity. One proposed mechanism, supported by experimental evidence, is
 re-initiation of translation at a downstream AUG codon.</li>
 <li><b>Long exon rule</b>: Coding exons longer than 400 bp (excluding the last
 coding exon, which is already covered by the 50 bp rule). Lindeboom et al. 2016
 showed a marked drop in NMD efficiency (61% vs. 98%) for PTCs in exons longer
 than 400 nt, likely because the large distance between the stalled ribosome and
 the downstream EJC reduces UPF1-EJC contact.</li>
 </ol>
 <p>
 Non-coding transcripts (where CDS start equals CDS end) are excluded.
 Overlapping regions from multiple transcripts with identical coordinates and
 the same rule are collapsed into a single item, with the contributing
 transcript IDs stored as a comma-separated list.
 </p>
 <p>
 Two versions of this track are available, based on different transcript
 annotation sets:
 </p>
 <ul>
 <li><b><a href="hgTrackUi?g=nmdEscGencode">NMD escape Gencode</a></b>:
 Derived from GENCODE V49 transcript annotations.</li>
 <li><b><a href="hgTrackUi?g=nmdEscNcbiRefSeq">NMD escape NCBI RefSeq</a></b>:
- Derived from NCBI RefSeq transcript annotations.</li>
+ Derived from NCBI RefSeq Curated transcript annotations (NM_ and NR_
+ accessions; predicted XM_/XR_ models are excluded).</li>
 </ul>
 <h2>Background</h2>
 <p>
 NMD escape regions were predicted based on the Exon Junction Complex
 (EJC)-dependent model of NMD. During normal translation, EJCs are deposited at
 exon-exon junctions after splicing. As the ribosome translates the mRNA, it
 displaces each EJC it encounters. When a PTC causes the ribosome to stall
 prematurely, any remaining downstream EJCs recruit surveillance factors
 (notably UPF1) that trigger mRNA degradation via NMD.
 </p>
 <p>
 However, PTCs located in the last coding exon or within approximately 50 bp
 upstream of the last exon-exon junction are too close to the final EJC (or have
 no downstream EJC at all) for NMD to be triggered—the transcript escapes
 degradation. Conversely, PTCs located more than 50–55 bp upstream of the last
 exon-exon junction are predicted to elicit NMD.
 </p>
 <p>
 Additional escape mechanisms, supported by Lindeboom et al. 2016 and other
 studies, are captured by three further rules:
 </p>
 <ul>
 <li><b>Intronless transcripts</b> deposit no EJCs during splicing, so any PTC
 escapes NMD.</li>
 <li><b>Start-proximal PTCs</b> (within the first 100 bp of coding sequence)
 escape NMD, likely through translation re-initiation at a downstream AUG
 codon.</li>
 <li><b>PTCs in long coding exons</b> (>400 bp) show reduced NMD efficiency
 (61% vs. 98% for shorter exons in Lindeboom et al.
 2016), likely because the large distance between the stalled ribosome and the
 downstream EJC reduces UPF1-EJC contact.</li>
 </ul>
 <h2>Display Conventions and Configuration</h2>
 <p>
 Regions from overlapping transcripts with the same coordinates are collapsed
 into a single item. The gene symbol is shown as the item name. Mouseover
 displays the NMD escape rule and the number of transcripts. The details page
 lists all contributing transcript IDs.
 </p>
 <p>
 Items are colored by the NMD escape rule that applies:
 </p>
 <ul>
 <li><font color="#FF0000"><b>Red</b></font> – Rule 1: Last 50 bp of the last
 coding exon-exon junction. A PTC here is too close to the last exon junction
 complex (EJC) for NMD to be triggered.</li>
 <li><font color="#FF8C00"><b>Orange</b></font> – Rule 2: Intronless
 (single-exon) transcript. No EJCs are deposited, so all PTCs escape NMD.</li>
 <li><font color="#8B0000"><b>Dark red</b></font> – Rule 3: First 100 bp of
 coding nucleotides. PTCs in this start-proximal region are insensitive to NMD,
 possibly due to translation re-initiation at a downstream AUG codon.</li>
 <li><font color="#FFD700"><b>Gold</b></font> – Rule 4: Coding exons longer
 than 400 bp (excluding the last coding exon). NMD efficiency is reduced in
 these long exons because the PTC is far from the downstream exon-exon
 junction.</li>
 </ul>
 <h2>Data Access</h2>
 <p>
 The data underlying this track can be explored interactively with the
 <a href="../cgi-bin/hgTables">Table Browser</a> or the
 <a href="../cgi-bin/hgIntegrator">Data Integrator</a>.
 For automated analysis, the data may be queried from our
 <a href="/goldenPath/help/api.html">REST API</a>.
 Please refer to our
-<a href="https://groups.google.com/a/soe.ucsc.edu/forum/#!forum/genome">mailing
-list archives</a> for questions, or our
+<a href="https://groups.google.com/a/soe.ucsc.edu/forum/#!forum/genome"
+target="_blank">mailing list archives</a> for questions, or our
 <a href="../FAQ/FAQdownloads.html#download36">Data Access FAQ</a> for more
 information.
 </p>
 <h2>Credits</h2>
 <p>
 Thanks to Guido Neidhardt for suggesting this track at HUGO VEPTC 2025 and
 Andreas Lahner for feedback. Thanks to the Decipher Genome Browser team for
 introducing the idea of a track.
 </p>
 <h2>References</h2>
 <p>
 Kurosaki T, Popp MW, Maquat LE.
 <a href="https://doi.org/10.1038/s41580-019-0126-2" target="_blank">
 Quality and quantity control of gene expression by nonsense-mediated mRNA
 decay</a>.
 <em>Nat Rev Mol Cell Biol</em>. 2019 Jul;20(7):406-420.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/30992545" target="_blank">30992545</a>;
 PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6855384/" target="_blank">PMC6855384</a>
 </p>
 <p>
 Lindeboom RGH, Supek F, Lehner B.
 <a href="https://doi.org/10.1038/ng.3664" target="_blank">
 The rules and impact of nonsense-mediated mRNA decay in human cancers</a>.
 <em>Nat Genet</em>. 2016 Oct;48(10):1112-8.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/27618451" target="_blank">27618451</a>;
 PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5045715/" target="_blank">PMC5045715</a>
 </p>
 <p>
 Nagy E, Maquat LE.
 <a href="https://linkinghub.elsevier.com/retrieve/pii/S0968-0004(98)01208-0" target="_blank">
 A rule for termination-codon position within intron-containing genes: when
 nonsense affects RNA abundance</a>.
 <em>Trends Biochem Sci</em>. 1998 Jun;23(6):198-9.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/9644970" target="_blank">9644970</a>
 </p>
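
Reviewer note: the Rule 2 fix and the Rule 1 caveat described in the commit message can be illustrated with a small sketch. This is not the genePredNmdEsc source. Only `rec["exonCount"]` and the idea of a list of CDS exons come from the commit message; the function names, the plus-strand assumption, and the half-open `(start, end)` tuples are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only; not the actual genePredNmdEsc code.
# Assumptions: plus-strand transcript, exons as half-open (start, end)
# tuples sorted by position, cds_exons already clipped to the CDS.

def is_intronless_old(rec, cds_exons):
    """Pre-fix test: counts coding exons only, so UTR introns are missed."""
    return len(cds_exons) == 1


def is_intronless_new(rec, cds_exons):
    """Post-fix test: counts every exon of the transcript."""
    return rec["exonCount"] == 1


def rule1_regions(cds_exons, window=50):
    """Rule 1: the whole last coding exon plus up to `window` bp at the
    3' end of the penultimate coding exon. When that exon is shorter than
    `window`, the region stops at its upstream boundary (the caveat added
    to the track description) instead of walking further upstream."""
    if not cds_exons:
        return []
    regions = [cds_exons[-1]]
    if len(cds_exons) >= 2:
        pen_start, pen_end = cds_exons[-2]
        regions.insert(0, (max(pen_start, pen_end - window), pen_end))
    return regions


# A three-exon transcript whose CDS sits entirely in one exon (UTR introns).
rec = {"exonCount": 3}
cds_exons = [(5000, 5600)]
print(is_intronless_old(rec, cds_exons))  # True: misclassified as Rule 2
print(is_intronless_new(rec, cds_exons))  # False: falls through to Rules 1/3/4

# Penultimate coding exon of 30 bp: the region is clamped to that exon.
print(rule1_regions([(100, 400), (1000, 1030), (2000, 2300)]))
```

Under these assumptions the last call yields `[(1000, 1030), (2000, 2300)]`, that is, all 30 bp of the short penultimate exon rather than 50 bp reaching into the exon before it. A minus-strand transcript would need the exon order and the clamp direction mirrored.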