888e7470c14eeecdca310ed36bb45c3c00ae8052 lrnassar Tue Apr 21 15:14:04 2026 -0700
QA fixes for MPRA superTrack. refs #37359

Fix broken mpraVarDb bigDataUrl: it pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack.

Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows; and change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views).

trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw.

Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html.

relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.

Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
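
For readers of the log, the safe_float / pval_to_score behavior change described above can be illustrated with a minimal sketch. This is not the code in mpravardbToBed.py, which this commit does not show. The NA spellings, the valid p-value range (0, 1], and the -log10-based score scaling are all assumptions made for illustration. Only the outcome (NaN and score 0 for NA / out-of-range values, rather than 0.0 and score 1000) comes from the commit message.

```python
import math

# Assumed spellings of "missing" in the upstream table (hypothetical).
NA_TOKENS = {"", "na", "nan", "none", "null"}


def safe_float(raw):
    """Parse a numeric field, returning NaN (not 0.0) when it is missing or unparseable."""
    text = str(raw).strip()
    if text.lower() in NA_TOKENS:
        return math.nan
    try:
        return float(text)
    except ValueError:
        return math.nan


def pval_to_score(pval):
    """Map a p-value to a 0-1000 score; NaN or out-of-range values score 0.

    The scaling (100 * -log10(p), capped at 1000) is an assumed stand-in
    for whatever mapping the real script uses.
    """
    if math.isnan(pval) or not (0.0 < pval <= 1.0):
        return 0  # untested or invalid: never rank at the top
    return min(1000, int(round(-math.log10(pval) * 100)))
```

Under these assumptions an "NA" p-value parses to NaN and scores 0, so an untested variant sorts to the bottom of a score-sorted view instead of the top.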
diff --git src/hg/makeDb/trackDb/relatedTracks.ra src/hg/makeDb/trackDb/relatedTracks.ra
index 754ef48affe..0c90ca47313 100644
--- src/hg/makeDb/trackDb/relatedTracks.ra
+++ src/hg/makeDb/trackDb/relatedTracks.ra
@@ -112,15 +112,21 @@
 hg19 primateAi revel REVEL, an ensemble missense pathogenicity score built from multiple predictors
 hg19 revel primateAi PrimateAI-3D, a missense pathogenicity predictor using primate variation and 3D protein structure
 
 # PromoterAI cross-links:
 hg38 promoterAi primateAi PrimateAI-3D, a companion deep-learning model from Illumina for coding (missense) variants
 hg38 primateAi promoterAi PromoterAI, a companion deep-learning model from Illumina for non-coding promoter variants
 hg38 promoterAi alphaMissense AlphaMissense, a deep-learning predictor of missense (coding) variant pathogenicity
 hg38 alphaMissense promoterAi PromoterAI, a deep-learning predictor of expression-altering variants in promoter regions
 
 # NMD Escape cross-links:
 hg38 nmd mane MANE Select transcripts from NCBI/EBI, a curated subset of RefSeq/Ensembl transcripts used as clinical reference
 hg38 mane nmd NMD Escape: predicted regions where premature termination codons escape nonsense-mediated decay
 hg38 nmd ncbiRefSeq NCBI RefSeq transcripts, the source annotation set for the NMD Escape RefSeq subtrack
 hg38 ncbiRefSeq nmd NMD Escape: predicted regions where premature termination codons escape nonsense-mediated decay
+
+# MPRA cross-links:
+hg38 mpra wgEncodeReg4 ENCODE regulatory region annotations, many of which are tested by MPRA assays
+hg38 wgEncodeReg4 mpra Experimental MPRA measurements of regulatory activity for candidate elements
+hg38 mpra cCREs Candidate cis-regulatory elements; many overlap MPRA-tested fragments
+hg38 cCREs mpra Experimentally validated regulatory activity from MPRA assays for overlapping elements
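
The reciprocity of the cross-links added above (each `db trackA trackB` entry paired with a `db trackB trackA` entry) could be checked with a short script along these lines. This is a hypothetical helper, not part of the kent tree. It assumes each entry is a single line of whitespace-separated fields (database, track, related track, free-text description) and that `#` lines are comments, as the hunk suggests.

```python
def missing_reciprocals(lines):
    """Return (db, track, related) links whose reverse entry is absent."""
    links = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 3)  # db, track, related, description
        if len(fields) < 3:
            continue  # not a link entry under the assumed layout
        db, src, dst = fields[:3]
        links.add((db, src, dst))
    # For every link, report the reverse link if it was never declared.
    return sorted(
        (db, dst, src) for db, src, dst in links if (db, dst, src) not in links
    )
```

Reading the file with `open("relatedTracks.ra")` and passing the file object to `missing_reciprocals` would return an empty list when every link has its counterpart.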