3a62ea7e9a8cb3503586a0a78570331308c9bc58 max Mon Apr 27 02:23:00 2026 -0700
NMD Escape MANE: expose NM_ accession via labelFields. refs #33737

Per QA, the MANE subtrack now shows the NCBI RefSeq accession by default
instead of the HGNC gene symbol, with the ENST and gene symbol still
selectable via labelFields.

- genePredNmdEsc: new --ncbi-id-field N option (default -1 = unused).
  When set, the named bigGenePred column is captured per-transcript and
  written into a new ncbiIds output column. For MANE, pass 21.
- genePredNmdEsc: new --no-collapse option. By default, regions with
  identical (chrom, start, end, rule) from multiple transcripts collapse
  into one row with comma-separated lists. With --no-collapse the script
  emits one row per (transcript, region). Used for MANE so each
  label-field column holds a single value: the 74 MANE Plus Clinical
  genes (e.g. LMNA) get two rows per region instead of one row with a
  two-element list.
- nmdEscCollapsed.as: add lstring ncbiIds column. Schema is now bed9+3.
- nmd.ra (nmdEscMane only): labelFields ncbiIds,name,transcripts;
  defaultLabelFields ncbiIds; labelSeparator " / ". Gencode and RefSeq
  subtracks unchanged - they default to the gene symbol (name column)
  and have an empty ncbiIds column.
- doc/hg38/nmd.txt: bump all three bedToBigBed invocations to bed9+3 and
  document the --ncbi-id-field 21 + --no-collapse invocation for MANE.

Counts: MANE 68,028 (--no-collapse); Gencode 233,375; RefSeq 112,356.
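The difference between the default collapsing and --no-collapse can be sketched roughly as follows. This is an illustration only, not the genePredNmdEsc code: the function name `emit_rows`, the dict keys, the coordinates, and the placeholder IDs (`ENST_A`, `NM_A`, ...) are all hypothetical.

```python
def emit_rows(regions, collapse=True):
    """Return output rows for a list of per-transcript region dicts.

    Hypothetical sketch: each region dict carries chrom, start, end, rule,
    transcript and ncbi_id. The real script's data model may differ.
    """
    if not collapse:
        # --no-collapse: one row per (transcript, region), single-valued fields.
        return [
            (r["chrom"], r["start"], r["end"], r["rule"],
             r["transcript"], r["ncbi_id"])
            for r in regions
        ]

    # Default: group regions sharing (chrom, start, end, rule) into one row.
    grouped = {}
    for r in regions:
        key = (r["chrom"], r["start"], r["end"], r["rule"])
        grouped.setdefault(key, []).append(r)

    rows = []
    for key, members in grouped.items():
        transcripts = ",".join(m["transcript"] for m in members)
        ncbi_ids = ",".join(m["ncbi_id"] for m in members)
        rows.append(key + (transcripts, ncbi_ids))
    return rows


# Two placeholder transcripts of one gene sharing an identical region,
# as would happen for a MANE Plus Clinical gene.
regions = [
    {"chrom": "chr1", "start": 100, "end": 200, "rule": "rule1",
     "transcript": "ENST_A", "ncbi_id": "NM_A"},
    {"chrom": "chr1", "start": 100, "end": 200, "rule": "rule1",
     "transcript": "ENST_B", "ncbi_id": "NM_B"},
]

collapsed = emit_rows(regions, collapse=True)   # one row, two-element lists
split = emit_rows(regions, collapse=False)      # two rows, one ID each
```

With collapsing, the label field would hold a two-element list; without it, each row carries exactly one accession, which is what the MANE labelFields setup relies on.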
diff --git src/hg/makeDb/scripts/nmd/nmdEscCollapsed.as src/hg/makeDb/scripts/nmd/nmdEscCollapsed.as
index 54c95a12c08..53837ff1e3e 100644
--- src/hg/makeDb/scripts/nmd/nmdEscCollapsed.as
+++ src/hg/makeDb/scripts/nmd/nmdEscCollapsed.as
@@ -1,15 +1,16 @@
 table nmdEscCollapsed
 "NMD escape regions collapsed across overlapping transcripts"
     (
     string chrom;        "Chromosome (or contig, scaffold, etc.)"
     uint chromStart;     "Start position in chromosome"
     uint chromEnd;       "End position in chromosome"
     string name;         "Gene symbol (falls back to transcript ID if no gene symbol is available)"
     uint score;          "Score from 0-1000"
     char[1] strand;      "+ or -"
     uint thickStart;     "Start of where display should be thick"
     uint thickEnd;       "End of where display should be thick"
     uint color;          "RGB color: red=rule 1, orange=rule 2, dark red=rule 3, gold=rule 4"
     string mouseover;    "Rule description and transcript count"
     lstring transcripts; "Comma-separated list of transcript IDs from which this region was derived"
+    lstring ncbiIds;     "Comma-separated list of NCBI RefSeq accessions (NM_/NR_); populated for MANE only"
     )
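To illustrate the resulting bed9+3 layout (nine standard BED columns plus the three extra columns mouseover, transcripts, and ncbiIds), here is a minimal sketch of reading one tab-separated row. The field names come from the .as file above; the helper name `parse_row`, the sample values, and the assumption that the color column is written as an "r,g,b" string in the BED text are illustrative, not taken from the commit.

```python
# Column order as declared in nmdEscCollapsed.as (bed9 + 3 extra fields).
FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "color",
    "mouseover", "transcripts", "ncbiIds",
]
INT_FIELDS = ("chromStart", "chromEnd", "score", "thickStart", "thickEnd")
LIST_FIELDS = ("transcripts", "ncbiIds")


def parse_row(line):
    """Parse one tab-separated bed9+3 line into a dict (hypothetical helper)."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    row = dict(zip(FIELDS, values))
    for key in INT_FIELDS:
        row[key] = int(row[key])
    for key in LIST_FIELDS:
        # An empty column (e.g. ncbiIds for Gencode/RefSeq) becomes [].
        row[key] = [v for v in row[key].split(",") if v]
    return row


# Placeholder rows; coordinates and IDs are made up for illustration.
mane_line = "\t".join([
    "chr1", "100", "200", "GENE_X", "0", "+", "100", "200", "255,0,0",
    "hypothetical mouseover text", "ENST_A", "NM_A",
])
gencode_line = "\t".join([
    "chr1", "100", "200", "GENE_X", "0", "+", "100", "200", "255,0,0",
    "hypothetical mouseover text", "ENST_A,ENST_B", "",
])

mane_row = parse_row(mane_line)
gencode_row = parse_row(gencode_line)
```

A MANE row produced with --no-collapse would carry a single accession in ncbiIds, while Gencode and RefSeq rows leave that column empty.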