34d2eee845f5f45e571d1e153c632683b8a93f75 lrnassar Tue Apr 21 16:17:53 2026 -0700

Refine NMD Escape Rule 2 gate to "single coding exon and no 3'UTR intron". refs #33737

Previously Rule 2 required exonCount==1 (truly intronless). This overcorrected for single-CDS-exon transcripts whose only introns are in the 5'UTR: biologically these have no EJC downstream of the stop codon (5'UTR EJCs are cleared by the scanning 40S or sit upstream of the terminating ribosome) and are NMD-immune, but the code pushed them to Rules 1/3 under a less accurate "last coding exon" label.

New gate: len(cdsExons) == 1 AND no exon-exon junction strictly downstream of the stop codon (strand-aware). Transcripts with a single coding exon but a 3'UTR intron correctly stay in Rules 1/3 because that intron deposits an EJC that can trigger NMD.

3,113 RefSeq Curated and 10,790 Gencode V49 transcripts move into Rule 2. 140 RefSeq and 1,135 Gencode single-CDS-exon transcripts with 3'UTR introns correctly remain in Rules 1/3.

Description page and makedoc updated.

diff --git src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html
index 2a7bc848ee2..b9207fc0d6f 100644
--- src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html
+++ src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html
@@ -8,33 +8,39 @@
 <p>
 The following rules were applied to transcript annotations to define predicted NMD escape regions (Nagy et al, Trends Biochem Sci 1998 and Lindeboom et al, Nat Genet 2016):
 </p>
 <ol>
 <li><b>50 bp rule</b>: The entire last coding exon plus the last 50 bp of the penultimate coding exon. A PTC here has no downstream exon-exon junction (or is too close to the last one) for NMD to be triggered. Non-protein-coding 3' exons are not counted when identifying the last coding junction. Note: when the penultimate coding exon is shorter than 50 bp, the annotated region extends only to the upstream junction of that exon and does not walk further upstream.
 A small number of transcripts with unusually short penultimate coding exons are affected.</li>
- <li><b>Intronless transcripts</b>: Transcripts with a single exon. Since no
- EJCs are deposited on single-exon transcripts, all PTCs are predicted to
- escape NMD.</li>
+ <li><b>No downstream EJC rule</b>: Transcripts with a single coding exon and
+ no 3&prime;UTR intron. No exon-exon junction exists downstream of the stop
+ codon, so no EJC is deposited that could trigger NMD at a PTC. This
+ covers truly intronless transcripts as well as transcripts whose only
+ introns are in the 5&prime;UTR (where EJCs are cleared by the scanning 40S
+ ribosomal subunit or sit upstream of the stop and are never encountered by
+ the terminating ribosome). Transcripts with a single coding exon but a
+ 3&prime;UTR intron are excluded, because that intron deposits an EJC
+ downstream of the stop codon that can trigger NMD.</li>
 <li><b>Start-proximal region</b>: The first 100 bp of coding nucleotides. PTCs in this region do not lead to NMD, a phenomenon known as start-proximal NMD insensitivity. One proposed mechanism, supported by experimental evidence, is re-initiation of translation at a downstream AUG codon.</li>
 <li><b>Long exon rule</b>: Coding exons longer than 400 bp (excluding the last coding exon, which is already covered by the 50 bp rule). Lindeboom et al. 2016 showed a marked drop in NMD efficiency (61% vs. 98%) for PTCs in exons longer than 400 nt, likely because the large distance between the stalled ribosome and the downstream EJC reduces UPF1-EJC contact.</li>
 </ol>
 <p>
 Non-coding transcripts (where CDS start equals CDS end) are excluded.
 Overlapping regions from multiple transcripts with identical coordinates and the same rule are collapsed into a single item, with the contributing
@@ -63,58 +69,61 @@
 </p>
 <p>
 However, PTCs located in the last coding exon or within approximately 50 bp upstream of the last exon-exon junction are too close to the final EJC (or have no downstream EJC at all) for NMD to be triggered&mdash;the transcript escapes degradation. Conversely, PTCs located more than 50&ndash;55 bp upstream of the last exon-exon junction are predicted to elicit NMD.
 </p>
 <p>
 Additional escape mechanisms, supported by Lindeboom et al. 2016 and other studies, are captured by three further rules:
 </p>
 <ul>
- <li><b>Intronless transcripts</b> deposit no EJCs during splicing, so any
- PTC escapes NMD.</li>
+ <li><b>Transcripts with no EJC downstream of the stop codon</b> (single coding
+ exon and no 3&prime;UTR intron) cannot trigger NMD, so any PTC in the coding
+ sequence escapes. 5&prime;UTR introns are tolerated because their EJCs are
+ upstream of the stop.</li>
 <li><b>Start-proximal PTCs</b> (within the first 100 bp of coding sequence) escape NMD, likely through translation re-initiation at a downstream AUG codon.</li>
 <li><b>PTCs in long coding exons</b> (>400 bp) show reduced NMD efficiency (61% vs. 98% for shorter exons in Lindeboom et al. 2016), likely because the large distance between the stalled ribosome and the downstream EJC reduces UPF1-EJC contact.</li>
 </ul>
 <h2>Display Conventions and Configuration</h2>
 <p>
 Regions from overlapping transcripts with the same coordinates are collapsed into a single item. The gene symbol is shown as the item name. Mouseover displays the NMD escape rule and the number of transcripts. The details page lists all contributing transcript IDs.
 </p>
 <p>
 Items are colored by the NMD escape rule that applies:
 </p>
 <ul>
 <li><font color="#FF0000"><b>Red</b></font> &ndash; Rule 1: Last 50 bp of the last coding exon-exon junction.
 A PTC here is too close to the last exon junction complex (EJC) for NMD to be triggered.</li>
- <li><font color="#FF8C00"><b>Orange</b></font> &ndash; Rule 2: Intronless
- (single-exon) transcript. No EJCs are deposited, so all PTCs escape NMD.</li>
+ <li><font color="#FF8C00"><b>Orange</b></font> &ndash; Rule 2: Single coding
+ exon and no 3&prime;UTR intron. No EJC is deposited downstream of the stop
+ codon, so all PTCs in the coding sequence escape NMD.</li>
 <li><font color="#8B0000"><b>Dark red</b></font> &ndash; Rule 3: First 100 bp of coding nucleotides. PTCs in this start-proximal region are insensitive to NMD, possibly due to translation re-initiation at a downstream AUG codon.</li>
 <li><font color="#FFD700"><b>Gold</b></font> &ndash; Rule 4: Coding exons longer than 400 bp (excluding the last coding exon). NMD efficiency is reduced in these long exons because the PTC is far from the downstream exon-exon junction.</li>
 </ul>
 <h2>Data Access</h2>
 <p>
 The data underlying this track can be explored interactively with the <a href="../cgi-bin/hgTables">Table Browser</a> or the <a href="../cgi-bin/hgIntegrator">Data Integrator</a>. For automated analysis, the data may be queried from our