File Changes for lrnassar

v497_base to v498_preview (2026-04-20 to 2026-04-27) v498

src/hg/htdocs/goldenPath/newsarch.html
- lines changed 35, context: html, text, full: html, text
  3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
  Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737 Script: added a fourth rule to genePredNmdEsc. Coding exons longer than 400 bp (excluding the last coding exon, which is already covered by the 50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and NCBI RefSeq bigBed files. trackDb: - nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode default visibility to dense so the track is visible in cart-reset views, changed all four NMDetective subtracks from "visibility full" to "visibility hide", updated pennantIcon to the Apr. 22, 2026 release date and anchor. - nmd.html: mention long internal exons in the overview description, update the rule count from three to four. - nmdEscTranscripts.html: add the long-exon rule to the rule list and color legend (gold, #FFD700), expand the Background section with mechanisms for the intronless, start-proximal, and long-exon rules, correct the 50 bp rule description to include the entire last coding exon, fix Lindeboom 2016 author initials (RG -> RGH). News: - newsarch.html: add the 2026-04-22 NMD Escape news entry covering all four rules, with acknowledgements to Guido Neidhardt and Andreas Lahner for suggesting the track and the Decipher Genome Browser team for inspiring the visualization. - indexNews.html: add the front-page news link. makedoc: - nmd.txt: dated note for the Rule 4 rebuild.
- lines changed 2, context: html, text, full: html, text
  d23d0116ff17b126a498c8d02bdef578d0ab1b53 Wed Apr 22 12:51:20 2026 -0700
  Update NMD Escape newsarch entry to match shipped Rule 2 definition. refs #33737 Rule 2 is no longer the 'intronless transcript rule' after the round 4 gate refinement (single coding exon AND no 3'UTR intron). Updated the newsarch entry to match.
- lines changed 16, context: html, text, full: html, text
  395a8efc6994c18a3b0bdfcee82217ff9d78b739 Wed Apr 22 12:54:59 2026 -0700
  Expand NMD Escape newsarch rules into sub-bullets. refs #33737 Break the four-rule summary into individual sub-bullets under the ruleset line so each rule is visible at a glance.
src/hg/htdocs/indexNews.html
- lines changed 12, context: html, text, full: html, text
  3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
  Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737 Script: added a fourth rule to genePredNmdEsc. Coding exons longer than 400 bp (excluding the last coding exon, which is already covered by the 50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and NCBI RefSeq bigBed files. trackDb: - nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode default visibility to dense so the track is visible in cart-reset views, changed all four NMDetective subtracks from "visibility full" to "visibility hide", updated pennantIcon to the Apr. 22, 2026 release date and anchor. - nmd.html: mention long internal exons in the overview description, update the rule count from three to four. - nmdEscTranscripts.html: add the long-exon rule to the rule list and color legend (gold, #FFD700), expand the Background section with mechanisms for the intronless, start-proximal, and long-exon rules, correct the 50 bp rule description to include the entire last coding exon, fix Lindeboom 2016 author initials (RG -> RGH). News: - newsarch.html: add the 2026-04-22 NMD Escape news entry covering all four rules, with acknowledgements to Guido Neidhardt and Andreas Lahner for suggesting the track and the Decipher Genome Browser team for inspiring the visualization. - indexNews.html: add the front-page news link. makedoc: - nmd.txt: dated note for the Rule 4 rebuild.
src/hg/makeDb/doc/hg38/mpra.txt
- lines changed 93, context: html, text, full: html, text
  888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
  QA fixes for MPRA superTrack. refs #37359 Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack. Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows; and change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views). trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw. Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html. relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs. Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
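The UTF-8 cleanup described in the commit above can be sketched as follows. This is a minimal illustration, not the code in mpravardbToBed.py: the replacement table and the function name `sanitize` are assumptions, and dropping any remaining non-ASCII character is a deliberately lossy choice made for this sketch.

```python
# Hypothetical mapping; the actual table used by mpravardbToBed.py is not shown in the log.
ASCII_MAP = {
    "\u2018": "'",   # left single curly quote
    "\u2019": "'",   # right single curly quote
    "\u201c": '"',   # left double curly quote
    "\u201d": '"',   # right double curly quote
    "\u2032": "'",   # prime
    "\u2033": '"',   # double prime
    "\u00a0": " ",   # no-break space
}
_TABLE = str.maketrans(ASCII_MAP)


def sanitize(text):
    """Map common typographic characters to ASCII, then drop whatever is still non-ASCII.

    The final step also removes the stray accented letter left behind when a
    no-break space has been decoded with the wrong charset (mojibake).
    """
    cleaned = text.translate(_TABLE)
    return cleaned.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(sanitize("\u201cSNP\u201d in the 5\u2032 UTR"))
    print(sanitize("5\u00c2\u00a0bp"))
```

Counting the rows where `sanitize(field) != field` before and after a rebuild is one way to check figures like the non-ASCII occurrence counts quoted in the commit.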
src/hg/makeDb/doc/hg38/nmd.txt
- lines changed 3, context: html, text, full: html, text
  3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
  Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737 Script: added a fourth rule to genePredNmdEsc. Coding exons longer than 400 bp (excluding the last coding exon, which is already covered by the 50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and NCBI RefSeq bigBed files. trackDb: - nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode default visibility to dense so the track is visible in cart-reset views, changed all four NMDetective subtracks from "visibility full" to "visibility hide", updated pennantIcon to the Apr. 22, 2026 release date and anchor. - nmd.html: mention long internal exons in the overview description, update the rule count from three to four. - nmdEscTranscripts.html: add the long-exon rule to the rule list and color legend (gold, #FFD700), expand the Background section with mechanisms for the intronless, start-proximal, and long-exon rules, correct the 50 bp rule description to include the entire last coding exon, fix Lindeboom 2016 author initials (RG -> RGH). News: - newsarch.html: add the 2026-04-22 NMD Escape news entry covering all four rules, with acknowledgements to Guido Neidhardt and Andreas Lahner for suggesting the track and the Decipher Genome Browser team for inspiring the visualization. - indexNews.html: add the front-page news link. makedoc: - nmd.txt: dated note for the Rule 4 rebuild.
- lines changed 22, context: html, text, full: html, text
  4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
  NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737 Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts". Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks. Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp. Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq. QA cleanups: non-ASCII prime char replaced with &prime;, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback. Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 14, context: html, text, full: html, text
  34d2eee845f5f45e571d1e153c632683b8a93f75 Tue Apr 21 16:17:53 2026 -0700
  Refine NMD Escape Rule 2 gate to "single coding exon and no 3'UTR intron". refs #33737 Previously Rule 2 required exonCount==1 (truly intronless). This overcorrected for single-CDS-exon transcripts whose only introns are in the 5'UTR: biologically these have no EJC downstream of the stop codon (5'UTR EJCs are cleared by the scanning 40S or sit upstream of the terminating ribosome) and are NMD-immune, but the code pushed them to Rules 1/3 under a less accurate "last coding exon" label. New gate: len(cdsExons) == 1 AND no exon-exon junction strictly downstream of the stop codon (strand-aware). Transcripts with a single coding exon but a 3'UTR intron correctly stay in Rules 1/3 because that intron deposits an EJC that can trigger NMD. 3,113 RefSeq Curated and 10,790 Gencode V49 transcripts move into Rule 2. 140 RefSeq and 1,135 Gencode single-CDS-exon transcripts with 3'UTR introns correctly remain in Rules 1/3. Description page and makedoc updated.
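The refined Rule 2 gate ("single coding exon and no 3'UTR intron") can be sketched as below. This is an illustration under assumptions, not the genePredNmdEsc code: the function names are hypothetical, exons are taken as 0-based half-open (start, end) pairs in genomic order as in genePred, and a junction that sits exactly at the end of the stop codon is counted as downstream.

```python
def coding_exons(exons, cds_start, cds_end):
    """Clip each exon to the CDS and keep only the parts that overlap it."""
    return [
        (max(start, cds_start), min(end, cds_end))
        for start, end in exons
        if start < cds_end and end > cds_start
    ]


def has_junction_after_stop(exons, cds_start, cds_end, strand):
    """True if any intron lies 3' of the stop codon (strand-aware)."""
    introns = [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]
    if strand == "+":
        # Stop codon ends at cds_end; a 3'UTR intron starts at or after it.
        return any(intron_start >= cds_end for intron_start, _ in introns)
    # On the minus strand the stop codon is at cds_start; 3' is genomic-left.
    return any(intron_end <= cds_start for _, intron_end in introns)


def is_rule2(exons, cds_start, cds_end, strand):
    """Single coding exon AND no exon-exon junction downstream of the stop."""
    single = len(coding_exons(exons, cds_start, cds_end)) == 1
    return single and not has_junction_after_stop(exons, cds_start, cds_end, strand)


if __name__ == "__main__":
    # Same exon layout, opposite strands: the intron is 5'UTR on "+", 3'UTR on "-".
    exons = [(100, 200), (300, 900)]
    print(is_rule2(exons, 350, 800, "+"), is_rule2(exons, 350, 800, "-"))
```

The strand check is the part that matters: the same genomic intron qualifies a plus-strand transcript for Rule 2 and disqualifies a minus-strand one.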
src/hg/makeDb/doc/hg38/promoterAi.txt
- lines changed 19, context: html, text, full: html, text
  f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
  QA fixes for PromoterAI track. refs #37278 Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID 40440429), corrected the score-direction wording (negative = under-expression, positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover blurb to match mouseOverFunction noAverage behavior. Converter and AS: the overlap bigBed now carries the real per-transcript strand from the source TSV (was hardcoded '+'), with a new strands column in the AS, and the name field concatenates unique gene symbols so bidirectional-promoter items read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is meaningful. Rewrote the converter to stream (sorted input), which drops peak memory from ~40 GB to a few MB. trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw red (over-expression) above zero and blue (under-expression) below, matching the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap subtrack. Makedoc updated to describe the streaming pipeline, the new strands column, and the rebuild workflow.
- lines changed 11, context: html, text, full: html, text
  6c567fd9a03e87610681a43d2183ebb43547d1ad Fri Apr 24 17:58:57 2026 -0700
  PromoterAI: review followups. refs #37278 Move /gbdb/hg38/promoterAi/ to /gbdb/hg38/_promoterAi/ to match the underscore-prefix exclusion rule for hgdownload sync (same pattern as PrimateAI-3D under refs #37274). bigDataUrls and the makedoc updated. Bump bigWig maxHeightPixels from 128:20:8 to 128:40:8 -- the peer-track default of 20 is too cramped for a signed -1..+1 score. Description page: drop the wrong primateai3d.basespace.illumina.com link in Data Access; PromoterAI is not on BaseSpace, it's distributed via the license agreement on the GitHub page (a download link is emailed after submission). Reword Data Access and Methods accordingly. Description page: add Illumina's recommended interpretation thresholds (|score| >= 0.1, >= 0.2, >= 0.5) from the PromoterAI GitHub README, with a note that higher cutoffs select smaller, higher-confidence sets.
src/hg/makeDb/scripts/mpravardb/mpravardbToBed.py
- lines changed 53, context: html, text, full: html, text
  888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
  QA fixes for MPRA superTrack. refs #37359 Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack. Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows; and change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views). trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw. Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html. relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs. Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
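The change to safe_float / pval_to_score described above can be sketched as follows. Only the function names and the "NaN and score 0 for NA or out-of-range p-values" behaviour come from the commit message; the scaling of valid p-values (100 points per order of magnitude, capped at 1000) and the treatment of p = 0 as the maximum score are assumptions made for illustration.

```python
import math


def safe_float(value):
    """Parse a float; return NaN for missing or unparseable input such as 'NA'."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan


def pval_to_score(pval, cap=1000):
    """Map a p-value to a BED score in 0..cap; untested variants get 0."""
    if math.isnan(pval) or not (0.0 <= pval <= 1.0):
        return 0  # NA or out of range: never rank above tested variants
    if pval == 0.0:
        return cap  # assumption: an exact zero is treated as maximally significant
    # Assumed scaling: 100 points per order of magnitude, capped.
    return min(cap, int(round(-math.log10(pval) * 100)))


if __name__ == "__main__":
    for raw in ("NA", "1.5", "1e-5", "0.05"):
        print(raw, pval_to_score(safe_float(raw)))
```

With the old behaviour (0.0 and score 1000 for NA), the first two inputs would have sorted to the top of a score-ordered view; here they sort to the bottom.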
src/hg/makeDb/scripts/nmd/genePredNmdEsc
- lines changed 16, context: html, text, full: html, text
  3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
  Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737 Script: added a fourth rule to genePredNmdEsc. Coding exons longer than 400 bp (excluding the last coding exon, which is already covered by the 50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and NCBI RefSeq bigBed files. trackDb: - nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode default visibility to dense so the track is visible in cart-reset views, changed all four NMDetective subtracks from "visibility full" to "visibility hide", updated pennantIcon to the Apr. 22, 2026 release date and anchor. - nmd.html: mention long internal exons in the overview description, update the rule count from three to four. - nmdEscTranscripts.html: add the long-exon rule to the rule list and color legend (gold, #FFD700), expand the Background section with mechanisms for the intronless, start-proximal, and long-exon rules, correct the 50 bp rule description to include the entire last coding exon, fix Lindeboom 2016 author initials (RG -> RGH). News: - newsarch.html: add the 2026-04-22 NMD Escape news entry covering all four rules, with acknowledgements to Guido Neidhardt and Andreas Lahner for suggesting the track and the Decipher Genome Browser team for inspiring the visualization. - indexNews.html: add the front-page news link. makedoc: - nmd.txt: dated note for the Rule 4 rebuild.
- lines changed 2, context: html, text, full: html, text
  4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
  NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737 Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts". Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks. Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp. Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq. QA cleanups: non-ASCII prime char replaced with &prime;, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback. Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 13, context: html, text, full: html, text
  34d2eee845f5f45e571d1e153c632683b8a93f75 Tue Apr 21 16:17:53 2026 -0700
  Refine NMD Escape Rule 2 gate to "single coding exon and no 3'UTR intron". refs #33737 Previously Rule 2 required exonCount==1 (truly intronless). This overcorrected for single-CDS-exon transcripts whose only introns are in the 5'UTR: biologically these have no EJC downstream of the stop codon (5'UTR EJCs are cleared by the scanning 40S or sit upstream of the terminating ribosome) and are NMD-immune, but the code pushed them to Rules 1/3 under a less accurate "last coding exon" label. New gate: len(cdsExons) == 1 AND no exon-exon junction strictly downstream of the stop codon (strand-aware). Transcripts with a single coding exon but a 3'UTR intron correctly stay in Rules 1/3 because that intron deposits an EJC that can trigger NMD. 3,113 RefSeq Curated and 10,790 Gencode V49 transcripts move into Rule 2. 140 RefSeq and 1,135 Gencode single-CDS-exon transcripts with 3'UTR introns correctly remain in Rules 1/3. Description page and makedoc updated.
- lines changed 4, context: html, text, full: html, text
  fe73446acf43f70e385dadbbb281634adf3cac9e Tue Apr 21 16:44:16 2026 -0700
  NMD Escape QA tweaks: hide Gencode subtrack by default, bold rule numbers in mouseovers. refs #33737 - nmdEscGencode default visibility changed from on/dense to off/hide so only the RefSeq Curated subtrack is on by default. Per Lou's request. - RULE_DESCRIPTIONS mouseover strings wrap the rule number in <b>...</b> so the rule shows bold in the tooltip. Both bigBeds rebuilt.
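The long-exon rule (Rule 4) added to genePredNmdEsc in the first commit of this list can be sketched as below. The function name and constant are hypothetical, exons are assumed to be 0-based half-open (start, end) pairs in genomic order, and exon length is measured on the coding portion only, which the commit message does not state explicitly.

```python
LONG_EXON_BP = 400  # threshold from the commit message ("longer than 400 bp")


def long_exon_regions(exons, cds_start, cds_end, strand):
    """Return coding exons longer than 400 bp, excluding the last coding exon."""
    cds = [
        (max(start, cds_start), min(end, cds_end))
        for start, end in exons
        if start < cds_end and end > cds_start
    ]
    if strand == "-":
        cds.reverse()  # put exons in transcript (5' to 3') order
    internal = cds[:-1]  # the last coding exon is already covered by the 50 bp rule
    return [(start, end) for start, end in internal if end - start > LONG_EXON_BP]


if __name__ == "__main__":
    exons = [(0, 600), (700, 800)]
    print(long_exon_regions(exons, 100, 780, "+"))
    print(long_exon_regions(exons, 100, 780, "-"))
```

On the plus strand the 500 bp coding exon is internal and gets flagged; on the minus strand the same exon is the last coding exon and is left to the 50 bp rule.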
src/hg/makeDb/scripts/nmd/nmdEscCollapsed.as
- lines changed 2, context: html, text, full: html, text
  4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
  NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737 Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts". Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks. Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp. Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq. QA cleanups: non-ASCII prime char replaced with &prime;, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback. Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
src/hg/makeDb/scripts/primateai/primateAi.as
- lines changed 2, context: html, text, full: html, text
  de2ccf6d827865f11d3c8edd9ceeb1b6394a7380 Tue Apr 21 18:22:59 2026 -0700
  PrimateAI-3D: label items by nucleotide change, add aaChange field and HTML mouseover. Variant analysts typically work at the nucleotide level, and the current item label (amino acid change) collapses distinguishable variants: ~17% of items share their (chrom, pos, AA-change) tuple with another item because of codon degeneracy (e.g. three C>A, C>G, C>T at the same position can all appear as "M>I"). Labeling by nucleotide change makes every item uniquely distinguishable (0.0% collisions on hg38, 0.1% on hg19 from overlapping transcripts). - primateAi.as: field 4 (name) is now "Nucleotide change (e.g. T>C)"; new field aaChange (placed before ref/alt) holds the amino acid change. - primateAiToBigBed.py: write name = "{ref}>{alt}", new aaChange column, and an HTML mouseover with terse labels (Var/AA/Score/Perc/Pred) and a colored prediction string. - primateAi.ra: add labelFields name,aaChange and defaultLabelFields name so users can toggle the on-feature label between nt change (default) and AA change. - primateAi.html: expand Display Conventions with the label-convention rationale and a legend for each mouseover field. refs #37274
src/hg/makeDb/scripts/primateai/primateAiToBigBed.py
- lines changed 13, context: html, text, full: html, text
  50466766840ded6cb8bd5cb868bdf2ff3f613bc0 Tue Apr 21 11:17:15 2026 -0700
  QA fixes for PrimateAI-3D track. Config (primateAi.ra): - Fix broken Ensembl transcript linkout: urls $S expanded to chromosome name; switch to the Ensembl transcript page with $$ - Add numeric filters on percentile and raw score (label notes the paper's 0.821 clinical threshold) - Add maxWindowToDraw 2000000 Data (primateAiToBigBed.py): - Change hardcoded strand '+' to '.': the source file has no strand column - Accept input/output paths as CLI args (previously hardcoded the hg38 input path) - Handle variable field count: ~2.4M rows in the hg19 source are missing the refseq column Description (primateAi.html): - Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way track - Regenerate the first reference via getTrackReferences (wrong article number and wrong PMC ID in the previous text) - Fix the GitHub URL for the conversion script in Methods - Move the Zoonomia 447-way mention out of Description; rephrase the license note to describe precisely what is disabled relatedTracks.ra: - Add reciprocal cross-links for primateAi <-> alphaMissense (hg38), primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi (hg38). Also includes promoterAi <-> alphaMissense cross-links. refs #37274 #37279
- lines changed 10, context: html, text, full: html, text
  de2ccf6d827865f11d3c8edd9ceeb1b6394a7380 Tue Apr 21 18:22:59 2026 -0700
  PrimateAI-3D: label items by nucleotide change, add aaChange field and HTML mouseover. Variant analysts typically work at the nucleotide level, and the current item label (amino acid change) collapses distinguishable variants: ~17% of items share their (chrom, pos, AA-change) tuple with another item because of codon degeneracy (e.g. three C>A, C>G, C>T at the same position can all appear as "M>I"). Labeling by nucleotide change makes every item uniquely distinguishable (0.0% collisions on hg38, 0.1% on hg19 from overlapping transcripts). - primateAi.as: field 4 (name) is now "Nucleotide change (e.g. T>C)"; new field aaChange (placed before ref/alt) holds the amino acid change. - primateAiToBigBed.py: write name = "{ref}>{alt}", new aaChange column, and an HTML mouseover with terse labels (Var/AA/Score/Perc/Pred) and a colored prediction string. - primateAi.ra: add labelFields name,aaChange and defaultLabelFields name so users can toggle the on-feature label between nt change (default) and AA change. - primateAi.html: expand Display Conventions with the label-convention rationale and a legend for each mouseover field. refs #37274
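The labeling argument in the commit above (amino acid labels collide, nucleotide labels do not) can be checked with a small sketch like the one below. The row layout and the helper names are hypothetical; only the `"{ref}>{alt}"` name format and the C>A / C>G / C>T to "M>I" example come from the commit message, and the coordinates are made up.

```python
from collections import Counter


def item_name(ref, alt):
    """Nucleotide-change label, in the "{ref}>{alt}" form described in the commit."""
    return f"{ref}>{alt}"


def collision_fraction(rows, key):
    """Fraction of rows whose key is shared with at least one other row."""
    counts = Counter(key(row) for row in rows)
    return sum(1 for row in rows if counts[key(row)] > 1) / len(rows)


# Hypothetical rows: (chrom, pos, ref, alt, aaChange)
ROWS = [
    ("chr1", 1000, "C", "A", "M>I"),
    ("chr1", 1000, "C", "G", "M>I"),
    ("chr1", 1000, "C", "T", "M>I"),
    ("chr1", 2000, "G", "A", "R>H"),
]

if __name__ == "__main__":
    by_aa = collision_fraction(ROWS, lambda r: (r[0], r[1], r[4]))
    by_nt = collision_fraction(ROWS, lambda r: (r[0], r[1], item_name(r[2], r[3])))
    print(f"AA label collisions: {by_aa:.2f}, nt label collisions: {by_nt:.2f}")
```

Running the same two keys over the real bigBed input would be one way to reproduce the collision percentages quoted in the commit message.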
src/hg/makeDb/scripts/promoterAiOverlaps.as
- lines changed 6, context: html, text, full: html, text
  f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
  QA fixes for PromoterAI track. refs #37278 Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID 40440429), corrected the score-direction wording (negative = under-expression, positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover blurb to match mouseOverFunction noAverage behavior. Converter and AS: the overlap bigBed now carries the real per-transcript strand from the source TSV (was hardcoded '+'), with a new strands column in the AS, and the name field concatenates unique gene symbols so bidirectional-promoter items read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is meaningful. Rewrote the converter to stream (sorted input), which drops peak memory from ~40 GB to a few MB. trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw red (over-expression) above zero and blue (under-expression) below, matching the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap subtrack. Makedoc updated to describe the streaming pipeline, the new strands column, and the rebuild workflow.
src/hg/makeDb/scripts/promoterAiToBigWig.py
- lines changed 111, context: html, text, full: html, text
  f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
  QA fixes for PromoterAI track. refs #37278 Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID 40440429), corrected the score-direction wording (negative = under-expression, positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover blurb to match mouseOverFunction noAverage behavior. Converter and AS: the overlap bigBed now carries the real per-transcript strand from the source TSV (was hardcoded '+'), with a new strands column in the AS, and the name field concatenates unique gene symbols so bidirectional-promoter items read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is meaningful. Rewrote the converter to stream (sorted input), which drops peak memory from ~40 GB to a few MB. trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw red (over-expression) above zero and blue (under-expression) below, matching the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap subtrack. Makedoc updated to describe the streaming pipeline, the new strands column, and the rebuild workflow.
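The converter changes described above (streaming over sorted input, concatenated gene symbols, a strands column, and a BED score of |PromoterAI|*1000) can be sketched as follows. This is not the code in promoterAiToBigWig.py: the tuple layout and function names are hypothetical, and using the largest-magnitude score of a group for the BED score is an assumption made for this illustration.

```python
from itertools import groupby


def bed_score(promoter_ai_score):
    """BED score as |score| * 1000, clamped to the 0..1000 range BED allows."""
    return min(1000, int(round(abs(promoter_ai_score) * 1000)))


def merge_overlaps(rows):
    """Stream over rows sorted by (chrom, pos, ref, alt), one variant at a time.

    rows: iterable of (chrom, pos, ref, alt, gene, strand, score).
    Only the rows of the current variant are held in memory, which is what
    keeps peak memory small compared with loading the whole table.
    """
    for variant, group in groupby(rows, key=lambda r: r[:4]):
        strand_by_gene = {}
        best = 0.0
        for _, _, _, _, gene, strand, score in group:
            strand_by_gene.setdefault(gene, strand)  # keep first-seen order, unique genes
            if abs(score) > abs(best):
                best = score  # assumption: report the largest-magnitude score
        name = ",".join(strand_by_gene)
        strands = ",".join(strand_by_gene.values())
        yield variant, name, strands, bed_score(best)


if __name__ == "__main__":
    rows = [
        ("chr1", 100, "A", "G", "HES4", "-", -0.25),
        ("chr1", 100, "A", "G", "ISG15", "+", 0.10),
        ("chr1", 200, "C", "T", "ISG15", "+", 0.5),
    ]
    for item in merge_overlaps(rows):
        print(item)
```

`itertools.groupby` only groups adjacent rows, so this approach depends on the input being sorted, as the commit message notes for the rewritten converter.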
src/hg/makeDb/trackDb/human/hg38/mpra.html
- lines changed 13, context: html, text, full: html, text
  888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
  QA fixes for MPRA superTrack. refs #37359 Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack. Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows; and change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views). trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw. Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html. relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs. Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
src/hg/makeDb/trackDb/human/hg38/mpra.ra
- lines changed 27, context: html, text, full: html, text
  888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
  QA fixes for MPRA superTrack. refs #37359 Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack. Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows; and change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views). trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw. Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html. relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs. Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
src/hg/makeDb/trackDb/human/hg38/mpraVarDb.html
- lines changed 14, context: html, text, full: html, text
  888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
  QA fixes for MPRA superTrack. refs #37359 Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack. Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows; and change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views). trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw. Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html. relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs. Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
src/hg/makeDb/trackDb/human/hg38/mprabase.html
- lines changed 9, context: html, text, full: html, text
  888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
  QA fixes for MPRA superTrack. refs #37359 Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack. Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows; and change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views). trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw. Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html. relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs. Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
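The safe_float / pval_to_score change described in this commit can be sketched as follows. This is a minimal illustration, not the code in mpravardbToBed.py: the function signatures, the 100-points-per-log10-unit scaling, and the choice to treat p = 0 as out of range are all assumptions. Only the behavior named in the commit message (NaN instead of 0.0, score 0 instead of 1000 for NA or out-of-range p-values) comes from the source.

```python
import math


def safe_float(text):
    """Parse a numeric field; return NaN (not 0.0) for NA, empty, or unparseable input."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan


def pval_to_score(pval, cap=1000):
    """Map a p-value to a 0..cap BED score; untested or out-of-range values score 0."""
    # Assumption: valid p-values lie in (0, 1]; anything else is "out of range".
    if math.isnan(pval) or not (0.0 < pval <= 1.0):
        return 0
    # Assumed scaling: 100 points per -log10 unit, capped at the BED maximum.
    return min(cap, int(round(-math.log10(pval) * 100)))
```

With this shape, an "NA" field parses to NaN and scores 0, so untested variants sort to the bottom of score-sorted views instead of the top.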
src/hg/makeDb/trackDb/human/hg38/nmd.html
- lines changed 4, context: html, text, full: html, text
  3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
  Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737 Script: added a fourth rule to genePredNmdEsc. Coding exons longer than 400 bp (excluding the last coding exon, which is already covered by the 50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and NCBI RefSeq bigBed files. trackDb: - nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode default visibility to dense so the track is visible in cart-reset views, changed all four NMDetective subtracks from "visibility full" to "visibility hide", updated pennantIcon to the Apr. 22, 2026 release date and anchor. - nmd.html: mention long internal exons in the overview description, update the rule count from three to four. - nmdEscTranscripts.html: add the long-exon rule to the rule list and color legend (gold, #FFD700), expand the Background section with mechanisms for the intronless, start-proximal, and long-exon rules, correct the 50 bp rule description to include the entire last coding exon, fix Lindeboom 2016 author initials (RG -> RGH). News: - newsarch.html: add the 2026-04-22 NMD Escape news entry covering all four rules, with acknowledgements to Guido Neidhardt and Andreas Lahner for suggesting the track and the Decipher Genome Browser team for inspiring the visualization. - indexNews.html: add the front-page news link. makedoc: - nmd.txt: dated note for the Rule 4 rebuild.
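A minimal sketch of the long-exon rule as worded in this commit: coding exons longer than 400 bp are flagged, except the last coding exon in transcript order. The function name, the (start, end) tuple representation, and the decision to flag the whole exon are assumptions for illustration; genePredNmdEsc may differ in detail.

```python
LONG_EXON_BP = 400  # threshold from the commit message


def long_exon_regions(coding_exons, strand):
    """Return coding exons flagged by the long-exon rule.

    coding_exons: CDS-clipped (start, end) pairs, 0-based half-open,
    sorted by genomic start (hypothetical input shape).
    """
    # Put exons in transcript order so "last" means the 3'-most coding exon.
    ordered = coding_exons if strand == "+" else coding_exons[::-1]
    internal = ordered[:-1]  # the last coding exon is left to the 50 bp rule
    return [(s, e) for s, e in internal if e - s > LONG_EXON_BP]
```

On the minus strand the last coding exon is the one with the lowest genomic coordinates, which is why the list is reversed before the final element is dropped.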
- lines changed 5, context: html, text, full: html, text
  4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
  NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737 Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts". Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks. Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp. Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq. QA cleanups: non-ASCII prime char replaced with ′, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback. Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
src/hg/makeDb/trackDb/human/hg38/nmd.ra
- lines changed 8, context: html, text, full: html, text
  3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
  Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737 Script: added a fourth rule to genePredNmdEsc. Coding exons longer than 400 bp (excluding the last coding exon, which is already covered by the 50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and NCBI RefSeq bigBed files. trackDb: - nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode default visibility to dense so the track is visible in cart-reset views, changed all four NMDetective subtracks from "visibility full" to "visibility hide", updated pennantIcon to the Apr. 22, 2026 release date and anchor. - nmd.html: mention long internal exons in the overview description, update the rule count from three to four. - nmdEscTranscripts.html: add the long-exon rule to the rule list and color legend (gold, #FFD700), expand the Background section with mechanisms for the intronless, start-proximal, and long-exon rules, correct the 50 bp rule description to include the entire last coding exon, fix Lindeboom 2016 author initials (RG -> RGH). News: - newsarch.html: add the 2026-04-22 NMD Escape news entry covering all four rules, with acknowledgements to Guido Neidhardt and Andreas Lahner for suggesting the track and the Decipher Genome Browser team for inspiring the visualization. - indexNews.html: add the front-page news link. makedoc: - nmd.txt: dated note for the Rule 4 rebuild.
- lines changed 13, context: html, text, full: html, text
  4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
  NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737 Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts". Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks. Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp. Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq. QA cleanups: non-ASCII prime char replaced with ′, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback. Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 2, context: html, text, full: html, text
  fe73446acf43f70e385dadbbb281634adf3cac9e Tue Apr 21 16:44:16 2026 -0700
  NMD Escape QA tweaks: hide Gencode subtrack by default, bold rule numbers in mouseovers. refs #33737 - nmdEscGencode default visibility changed from on/dense to off/hide so only the RefSeq Curated subtrack is on by default. Per Lou's request. - RULE_DESCRIPTIONS mouseover strings wrap the rule number in <b>...</b> so the rule shows bold in the tooltip. Both bigBeds rebuilt.
- lines changed 2, context: html, text, full: html, text
  a86b49667ad82b0f6c3745379f186f4d5753e368 Wed Apr 22 13:52:14 2026 -0700
  Simplify NMD Escape subtrack longLabels. refs #33737 The '50bp/100bp/intronless/400nt' rule-list became inaccurate after the Rule 2 refinement (Rule 2 now covers single coding exon + no 3'UTR intron, not just intronless). Drop the enumerated rules from the longLabel and defer to the track description page for rule detail.
src/hg/makeDb/trackDb/human/hg38/nmdDetective.html
- lines changed 2, context: html, text, full: html, text
  4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
  NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737 Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts". Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks. Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp. Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq. QA cleanups: non-ASCII prime char replaced with ′, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback. Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html
- lines changed 36, context: html, text, full: html, text
  3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
  Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737 Script: added a fourth rule to genePredNmdEsc. Coding exons longer than 400 bp (excluding the last coding exon, which is already covered by the 50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and NCBI RefSeq bigBed files. trackDb: - nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode default visibility to dense so the track is visible in cart-reset views, changed all four NMDetective subtracks from "visibility full" to "visibility hide", updated pennantIcon to the Apr. 22, 2026 release date and anchor. - nmd.html: mention long internal exons in the overview description, update the rule count from three to four. - nmdEscTranscripts.html: add the long-exon rule to the rule list and color legend (gold, #FFD700), expand the Background section with mechanisms for the intronless, start-proximal, and long-exon rules, correct the 50 bp rule description to include the entire last coding exon, fix Lindeboom 2016 author initials (RG -> RGH). News: - newsarch.html: add the 2026-04-22 NMD Escape news entry covering all four rules, with acknowledgements to Guido Neidhardt and Andreas Lahner for suggesting the track and the Decipher Genome Browser team for inspiring the visualization. - indexNews.html: add the front-page news link. makedoc: - nmd.txt: dated note for the Rule 4 rebuild.
- lines changed 8, context: html, text, full: html, text
  4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
  NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737 Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts". Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks. Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp. Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq. QA cleanups: non-ASCII prime char replaced with ′, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback. Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 16, context: html, text, full: html, text
  34d2eee845f5f45e571d1e153c632683b8a93f75 Tue Apr 21 16:17:53 2026 -0700
  Refine NMD Escape Rule 2 gate to "single coding exon and no 3'UTR intron". refs #33737 Previously Rule 2 required exonCount==1 (truly intronless). This overcorrected for single-CDS-exon transcripts whose only introns are in the 5'UTR: biologically these have no EJC downstream of the stop codon (5'UTR EJCs are cleared by the scanning 40S or sit upstream of the terminating ribosome) and are NMD-immune, but the code pushed them to Rules 1/3 under a less accurate "last coding exon" label. New gate: len(cdsExons) == 1 AND no exon-exon junction strictly downstream of the stop codon (strand-aware). Transcripts with a single coding exon but a 3'UTR intron correctly stay in Rules 1/3 because that intron deposits an EJC that can trigger NMD. 3,113 RefSeq Curated and 10,790 Gencode V49 transcripts move into Rule 2. 140 RefSeq and 1,135 Gencode single-CDS-exon transcripts with 3'UTR introns correctly remain in Rules 1/3. Description page and makedoc updated.
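The refined Rule 2 gate can be sketched as below. This is an illustration under stated assumptions, not the code in genePredNmdEsc: it assumes genePred-style 0-based half-open coordinates with exons sorted by genomic start, a CDS that includes the stop codon, and that a junction sitting immediately after the stop codon counts as downstream. The helper names are hypothetical.

```python
def cds_exons(exons, cds_start, cds_end):
    """Clip exons to the CDS and keep only the pieces that overlap it."""
    return [(max(s, cds_start), min(e, cds_end))
            for s, e in exons if s < cds_end and e > cds_start]


def has_junction_downstream_of_stop(exons, cds_start, cds_end, strand):
    """True if an exon-exon junction lies 3' of the stop codon (a 3'UTR intron)."""
    if strand == "+":
        # Stop codon ends at cds_end; a junction follows every exon but the last.
        return any(e >= cds_end for _, e in exons[:-1])
    # Minus strand: stop codon starts at cds_start and the 3'UTR lies at
    # lower genomic coordinates; a junction precedes every exon but the first.
    return any(s <= cds_start for s, _ in exons[1:])


def rule2_applies(exons, cds_start, cds_end, strand):
    """Single coding exon AND no exon-exon junction downstream of the stop codon."""
    if cds_start >= cds_end:  # non-coding transcript
        return False
    return (len(cds_exons(exons, cds_start, cds_end)) == 1
            and not has_junction_downstream_of_stop(exons, cds_start, cds_end, strand))
```

A single-CDS-exon transcript whose only intron is in the 5'UTR passes the gate, while one with a 3'UTR intron does not and so stays in Rules 1/3, matching the behavior described in the commit message.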
src/hg/makeDb/trackDb/human/hg38/trackDb.ra
- lines changed 1, context: html, text, full: html, text
  33e9019ef1b239ca1ab8114818f09ad65f58f2d0 Wed Apr 22 13:10:23 2026 -0700
  Release NMD Escape supertrack to beta + public. refs #33737 Drop the 'alpha' gate on include nmd.ra in hg38 trackDb.ra so the supertrack flows through the trackDb push pipeline to hgwbeta and the RR. /gbdb/hg38/nmd/*.bb files are already on the RR.
src/hg/makeDb/trackDb/human/primateAi.html
- lines changed 16, context: html, text, full: html, text
  50466766840ded6cb8bd5cb868bdf2ff3f613bc0 Tue Apr 21 11:17:15 2026 -0700
  QA fixes for PrimateAI-3D track. Config (primateAi.ra): - Fix broken Ensembl transcript linkout: urls $S expanded to chromosome name; switch to the Ensembl transcript page with $$ - Add numeric filters on percentile and raw score (label notes the paper's 0.821 clinical threshold) - Add maxWindowToDraw 2000000 Data (primateAiToBigBed.py): - Change hardcoded strand '+' to '.': the source file has no strand column - Accept input/output paths as CLI args (previously hardcoded the hg38 input path) - Handle variable field count: ~2.4M rows in the hg19 source are missing the refseq column Description (primateAi.html): - Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way track - Regenerate the first reference via getTrackReferences (wrong article number and wrong PMC ID in the previous text) - Fix the GitHub URL for the conversion script in Methods - Move the Zoonomia 447-way mention out of Description; rephrase the license note to describe precisely what is disabled relatedTracks.ra: - Add reciprocal cross-links for primateAi <-> alphaMissense (hg38), primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi (hg38). Also includes promoterAi <-> alphaMissense cross-links. refs #37274 #37279
- lines changed 28, context: html, text, full: html, text
  de2ccf6d827865f11d3c8edd9ceeb1b6394a7380 Tue Apr 21 18:22:59 2026 -0700
  PrimateAI-3D: label items by nucleotide change, add aaChange field and HTML mouseover. Variant analysts typically work at the nucleotide level, and the current item label (amino acid change) collapses distinguishable variants: ~17% of items share their (chrom, pos, AA-change) tuple with another item because of codon degeneracy (e.g. three C>A, C>G, C>T at the same position can all appear as "M>I"). Labeling by nucleotide change makes every item uniquely distinguishable (0.0% collisions on hg38, 0.1% on hg19 from overlapping transcripts). - primateAi.as: field 4 (name) is now "Nucleotide change (e.g. T>C)"; new field aaChange (placed before ref/alt) holds the amino acid change. - primateAiToBigBed.py: write name = "{ref}>{alt}", new aaChange column, and an HTML mouseover with terse labels (Var/AA/Score/Perc/Pred) and a colored prediction string. - primateAi.ra: add labelFields name,aaChange and defaultLabelFields name so users can toggle the on-feature label between nt change (default) and AA change. - primateAi.html: expand Display Conventions with the label-convention rationale and a legend for each mouseover field. refs #37274
- lines changed 11, context: html, text, full: html, text
  30374e3fc3390902c35bb463510567f1b6f7a96e Wed Apr 22 13:44:44 2026 -0700
  PrimateAI-3D: clarify origin of the 0.821 threshold per Max. refs #37274 Description previously juxtaposed the paper's 0.821 clinical threshold with the 75/25 benign/pathogenic split in a way that implied the two were related. Per Max on the ticket: the 0.821 threshold comes from Gao et al. 2023 Fig. 5A (calibrated against de novo missense excess in a clinical cohort, n=7,238 pathogenic calls), and the "prediction" column values are Illumina's own calls — not a simple application of the 0.821 threshold (some variants below it are labeled pathogenic and vice versa).
- lines changed 7, context: html, text, full: html, text
  6e61d3349b36cbcc01500c1483cc7bfbc141d9ea Wed Apr 22 13:47:33 2026 -0700
  PrimateAI-3D: tighten 0.821 threshold wording per the paper. refs #37274 Confirmed against Gao 2023 (PMC10713091): the calibration cohort is the Deciphering Developmental Disorders (DDD) neurodevelopmental cohort, not ClinVar. The cutoff was chosen so that the count of pathogenic calls (n=7,238) matched the excess of de novo missense mutations above the trinucleotide background expectation in that cohort.
src/hg/makeDb/trackDb/human/primateAi.ra
- lines changed 10, context: html, text, full: html, text
  50466766840ded6cb8bd5cb868bdf2ff3f613bc0 Tue Apr 21 11:17:15 2026 -0700
  QA fixes for PrimateAI-3D track. Config (primateAi.ra): - Fix broken Ensembl transcript linkout: urls $S expanded to chromosome name; switch to the Ensembl transcript page with $$ - Add numeric filters on percentile and raw score (label notes the paper's 0.821 clinical threshold) - Add maxWindowToDraw 2000000 Data (primateAiToBigBed.py): - Change hardcoded strand '+' to '.': the source file has no strand column - Accept input/output paths as CLI args (previously hardcoded the hg38 input path) - Handle variable field count: ~2.4M rows in the hg19 source are missing the refseq column Description (primateAi.html): - Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way track - Regenerate the first reference via getTrackReferences (wrong article number and wrong PMC ID in the previous text) - Fix the GitHub URL for the conversion script in Methods - Move the Zoonomia 447-way mention out of Description; rephrase the license note to describe precisely what is disabled relatedTracks.ra: - Add reciprocal cross-links for primateAi <-> alphaMissense (hg38), primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi (hg38). Also includes promoterAi <-> alphaMissense cross-links. refs #37274 #37279
- lines changed 2, context: html, text, full: html, text
  de2ccf6d827865f11d3c8edd9ceeb1b6394a7380 Tue Apr 21 18:22:59 2026 -0700
  PrimateAI-3D: label items by nucleotide change, add aaChange field and HTML mouseover. Variant analysts typically work at the nucleotide level, and the current item label (amino acid change) collapses distinguishable variants: ~17% of items share their (chrom, pos, AA-change) tuple with another item because of codon degeneracy (e.g. three C>A, C>G, C>T at the same position can all appear as "M>I"). Labeling by nucleotide change makes every item uniquely distinguishable (0.0% collisions on hg38, 0.1% on hg19 from overlapping transcripts). - primateAi.as: field 4 (name) is now "Nucleotide change (e.g. T>C)"; new field aaChange (placed before ref/alt) holds the amino acid change. - primateAiToBigBed.py: write name = "{ref}>{alt}", new aaChange column, and an HTML mouseover with terse labels (Var/AA/Score/Perc/Pred) and a colored prediction string. - primateAi.ra: add labelFields name,aaChange and defaultLabelFields name so users can toggle the on-feature label between nt change (default) and AA change. - primateAi.html: expand Display Conventions with the label-convention rationale and a legend for each mouseover field. refs #37274
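The label-collision argument in this commit can be illustrated with a small sketch. The row layout and both helper functions are hypothetical; only the name = "{ref}>{alt}" convention, the aaChange field, and the M>I example come from the commit message.

```python
from collections import Counter


def add_name(item):
    """Return a copy of the row with name set to the nucleotide change, e.g. 'C>T'."""
    return {**item, "name": f"{item['ref']}>{item['alt']}"}


def shared_label_fraction(items, label_field):
    """Fraction of items whose (chrom, pos, label) tuple is shared with another item."""
    keys = [(it["chrom"], it["pos"], it[label_field]) for it in items]
    if not keys:
        return 0.0
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] > 1) / len(keys)


# Three substitutions at one position that all read as "M>I" at the protein level.
rows = [
    {"chrom": "chr1", "pos": 100, "ref": "C", "alt": "A", "aaChange": "M>I"},
    {"chrom": "chr1", "pos": 100, "ref": "C", "alt": "G", "aaChange": "M>I"},
    {"chrom": "chr1", "pos": 100, "ref": "C", "alt": "T", "aaChange": "M>I"},
]
items = [add_name(r) for r in rows]
```

Calling shared_label_fraction(items, "aaChange") on these rows counts all three as colliding, while shared_label_fraction(items, "name") counts none, which is the distinction the commit relies on.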
- lines changed 1, context: html, text, full: html, text
  d07e0de4fba2fc825dd1fdaa37a7cf1f66e4721d Fri Apr 24 17:36:42 2026 -0700
  PrimateAI-3D: move /gbdb dir to _primateAi/ to match the underscore-prefix exclusion rule for hgdownload sync. refs #37274
src/hg/makeDb/trackDb/human/promoterAi.html
- lines changed 51, context: html, text, full: html, text
  f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
  QA fixes for PromoterAI track. refs #37278 Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID 40440429), corrected the score-direction wording (negative = under-expression, positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover blurb to match mouseOverFunction noAverage behavior. Converter and AS: the overlap bigBed now carries the real per-transcript strand from the source TSV (was hardcoded '+'), with a new strands column in the AS, and the name field concatenates unique gene symbols so bidirectional-promoter items read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is meaningful. Rewrote the converter to stream (sorted input), which drops peak memory from ~40 GB to a few MB. trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw red (over-expression) above zero and blue (under-expression) below, matching the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap subtrack. Makedoc updated to describe the streaming pipeline, the new strands column, and the rebuild workflow.
- lines changed 16, context: html, text, full: html, text
  6c567fd9a03e87610681a43d2183ebb43547d1ad Fri Apr 24 17:58:57 2026 -0700
  PromoterAI: review followups. refs #37278 Move /gbdb/hg38/promoterAi/ to /gbdb/hg38/_promoterAi/ to match the underscore-prefix exclusion rule for hgdownload sync (same pattern as PrimateAI-3D under refs #37274). bigDataUrls and the makedoc updated. Bump bigWig maxHeightPixels from 128:20:8 to 128:40:8 -- the peer-track default of 20 is too cramped for a signed -1..+1 score. Description page: drop the wrong primateai3d.basespace.illumina.com link in Data Access; PromoterAI is not on BaseSpace, it's distributed via the license agreement on the GitHub page (a download link is emailed after submission). Reword Data Access and Methods accordingly. Description page: add Illumina's recommended interpretation thresholds (|score| >= 0.1, >= 0.2, >= 0.5) from the PromoterAI GitHub README, with a note that higher cutoffs select smaller, higher-confidence sets.
src/hg/makeDb/trackDb/human/promoterAi.ra
- lines changed 24, context: html, text, full: html, text
  f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
  QA fixes for PromoterAI track. refs #37278 Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID 40440429), corrected the score-direction wording (negative = under-expression, positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover blurb to match mouseOverFunction noAverage behavior. Converter and AS: the overlap bigBed now carries the real per-transcript strand from the source TSV (was hardcoded '+'), with a new strands column in the AS, and the name field concatenates unique gene symbols so bidirectional-promoter items read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is meaningful. Rewrote the converter to stream (sorted input), which drops peak memory from ~40 GB to a few MB. trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw red (over-expression) above zero and blue (under-expression) below, matching the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap subtrack. Makedoc updated to describe the streaming pipeline, the new strands column, and the rebuild workflow.
- lines changed 9, context: html, text, full: html, text
  6c567fd9a03e87610681a43d2183ebb43547d1ad Fri Apr 24 17:58:57 2026 -0700
  PromoterAI: review followups. refs #37278 Move /gbdb/hg38/promoterAi/ to /gbdb/hg38/_promoterAi/ to match the underscore-prefix exclusion rule for hgdownload sync (same pattern as PrimateAI-3D under refs #37274). bigDataUrls and the makedoc updated. Bump bigWig maxHeightPixels from 128:20:8 to 128:40:8 -- the peer-track default of 20 is too cramped for a signed -1..+1 score. Description page: drop the wrong primateai3d.basespace.illumina.com link in Data Access; PromoterAI is not on BaseSpace, it's distributed via the license agreement on the GitHub page (a download link is emailed after submission). Reword Data Access and Methods accordingly. Description page: add Illumina's recommended interpretation thresholds (|score| >= 0.1, >= 0.2, >= 0.5) from the PromoterAI GitHub README, with a note that higher cutoffs select smaller, higher-confidence sets.
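For reference, the BED score convention (|PromoterAI|*1000) and the score-direction wording from the PromoterAI commits can be sketched as below. The function names, the rounding, and the clamp to 1000 are assumptions; the converter itself may compute the score differently.

```python
def bed_score(promoter_ai):
    """BED score as |PromoterAI| * 1000, rounded and clamped to the 0..1000 BED range."""
    return min(1000, int(round(abs(promoter_ai) * 1000)))


def direction(promoter_ai):
    """Sign convention from the description page: negative = under-expression."""
    if promoter_ai < 0:
        return "under-expression"
    if promoter_ai > 0:
        return "over-expression"
    return "no predicted change"
```

Because the BED score drops the sign, a scoreFilter of 200 would keep items with |score| >= 0.2 in either direction, in line with the second of the recommended thresholds quoted in the commit.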
src/hg/makeDb/trackDb/relatedTracks.ra
- lines changed 15, context: html, text, full: html, text
  50466766840ded6cb8bd5cb868bdf2ff3f613bc0 Tue Apr 21 11:17:15 2026 -0700
  QA fixes for PrimateAI-3D track. Config (primateAi.ra): - Fix broken Ensembl transcript linkout: urls $S expanded to chromosome name; switch to the Ensembl transcript page with $$ - Add numeric filters on percentile and raw score (label notes the paper's 0.821 clinical threshold) - Add maxWindowToDraw 2000000 Data (primateAiToBigBed.py): - Change hardcoded strand '+' to '.': the source file has no strand column - Accept input/output paths as CLI args (previously hardcoded the hg38 input path) - Handle variable field count: ~2.4M rows in the hg19 source are missing the refseq column Description (primateAi.html): - Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way track - Regenerate the first reference via getTrackReferences (wrong article number and wrong PMC ID in the previous text) - Fix the GitHub URL for the conversion script in Methods - Move the Zoonomia 447-way mention out of Description; rephrase the license note to describe precisely what is disabled relatedTracks.ra: - Add reciprocal cross-links for primateAi <-> alphaMissense (hg38), primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi (hg38). Also includes promoterAi <-> alphaMissense cross-links. refs #37274 #37279
- lines changed 6, context: html, text, full: html, text
  4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
  NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737 Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all) to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models) per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts". Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of len(cdsExons)==1. The old test misclassified multi-exon transcripts with a single CDS exon (UTR introns) as "intronless" and silently suppressed their Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt both tracks. Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a penultimate coding exon shorter than 50 bp. Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq. QA cleanups: non-ASCII prime char replaced with ′, mailing list links given target="_blank" across all three HTML pages, dead commented nmdGencode block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4 color and the gene-symbol-to-transcript-ID fallback. Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 6, context: html, text, full: html, text
  888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
  QA fixes for MPRA superTrack. refs #37359 Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack. Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows; and change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views). trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw. Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html. relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs. Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
src/utils/redmineCli
- lines changed 8, context: html, text, full: html, text
  993da626132958795cab63a9b26d64ce2052f40d Tue Apr 21 16:51:13 2026 -0700
  Make redmineCli prepend_attribution idempotent. refs #37339 Skip adding the '**From Claude:**' header if the body already begins with a From Claude attribution line (any bold/italic asterisk variant, case-insensitive). Fixes the periodic doubled header when Claude models mimic prior journal entries that already carried the prefix.
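A minimal sketch of an idempotent prepend_attribution, assuming the header is joined to the body with a blank line and that an existing attribution always ends in a colon. The regular expression and the separator are illustrative guesses, not the code in redmineCli.

```python
import re

ATTRIBUTION = "**From Claude:**"

# Matches "From Claude:" at the start of the body with zero to three asterisks
# on either side (plain, italic, or bold), ignoring case. Requiring the colon
# keeps ordinary sentences such as "From Claude's log ..." from matching.
_ALREADY_ATTRIBUTED = re.compile(r"^\s*\*{0,3}from\s+claude\*{0,3}\s*:", re.IGNORECASE)


def prepend_attribution(body):
    """Add the attribution header unless the body already starts with one."""
    if _ALREADY_ATTRIBUTED.match(body):
        return body
    return f"{ATTRIBUTION}\n\n{body}"
```

Applying the function twice gives the same result as applying it once, which is the property the commit is after.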
