888e7470c14eeecdca310ed36bb45c3c00ae8052 lrnassar Tue Apr 21 15:14:04 2026 -0700

QA fixes for MPRA superTrack. refs #37359

Fix broken mpraVarDb bigDataUrl: it pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack.

Rebuild mpravardb.bb after two fixes in mpravardbToBed.py:
- sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows;
- change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views).

trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw.

Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html.

relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.

Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
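The scoring change described above can be illustrated with a minimal sketch. This is not the code in mpravardbToBed.py: the function bodies, the `cap` parameter, the -log10 scaling, and the choice to treat p <= 0 as out of range are all assumptions made for illustration. Only the names safe_float / pval_to_score and the intended behavior (NaN for unparseable input, score 0 for NA or out-of-range p-values) come from the commit message.

```python
import math


def safe_float(text):
    """Parse a numeric field; return NaN (not 0.0) when missing or unparseable."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan


def pval_to_score(p, cap=10.0):
    """Map a p-value to a 0-1000 BED score (hypothetical -log10 scaling).

    NaN and out-of-range values score 0, so untested variants sort to the
    bottom of score-sorted views rather than the top.
    """
    if math.isnan(p) or p <= 0.0 or p > 1.0:
        return 0
    return min(1000, int(round(-math.log10(p) / cap * 1000)))


if __name__ == "__main__":
    for raw in ("NA", "0.05", "1e-20", "1.5"):
        print(raw, pval_to_score(safe_float(raw)))
```

With the old behavior, "NA" would have parsed to 0.0 and then mapped to the maximum score; returning NaN from the parser keeps "not tested" distinguishable from "p = 0" all the way to the score column.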
diff --git src/hg/makeDb/trackDb/human/hg38/mpra.ra src/hg/makeDb/trackDb/human/hg38/mpra.ra
index e50fbd4ae2f..fb5086693a9 100644
--- src/hg/makeDb/trackDb/human/hg38/mpra.ra
+++ src/hg/makeDb/trackDb/human/hg38/mpra.ra
@@ -1,32 +1,50 @@
 track mpra
 shortLabel MPRAs
 longLabel Massively Parallel Reporter Assays
 group regulation
-type bed 4
 visibility hide
 superTrack on
 
     track mprabase
     shortLabel MPRA Base
     longLabel MPRAs: MPRA Base Enhancer Elements
     type bigBed 9 + 9
     bigDataUrl /gbdb/hg38/mpra/mprabase/mprabase.bb
-    parent mpra
+    parent mpra on
     visibility pack
     priority 1
     itemRgb on
-    mouseOver "$cell_line | $assay | Score: $raw_score | %tile: $percentile_rank | $citation"
+    mouseOver $cell_line | $assay | Score: $raw_score | %tile: $percentile_rank | $citation
     urls PMID="https://www.ncbi.nlm.nih.gov/pubmed/$$"
+    labelFields name,cell_line,assay,author_lab
+    defaultLabelFields name
+    filterValues.cell_line HepG2,HUES64,mESC,NPC,HEK293FT,UACC903,Hela
+    filterValues.assay lentiMPRA (LM),plasmidMPRA (PM),STARR-seq (ST)
+    filter.percentile_rank 0:100
+    filterByRange.percentile_rank on
+    filterLimits.percentile_rank 0:100
+    filterLabel.percentile_rank Filter by activity percentile rank (within experiment)
 
     track mpraVarDb
     shortLabel MPRAVarDB
-    longLabel MPRAs: MPRAVarDB - MPRA-tested Regulatory Variant Effects - grey is non-significant
+    longLabel MPRAs: MPRAVarDB - MPRA-tested Regulatory Variant Effects
     parent mpra on
-    bigDataUrl /gbdb/hg38/mpra/mpravardb.bb
-    type bigBed 9 +
+    bigDataUrl /gbdb/hg38/mpra/mpravardb/mpravardb.bb
+    type bigBed 9 + 13
     itemRgb on
     visibility dense
-    mouseOverField _mouseOverLog2FC,_mouseOverPvalue,_mouseOverFdr
+    priority 2
+    maxWindowToDraw 10000000
     mouseOver Ref: $ref Alt: $alt Cell: $cellLine log2FC: $_mouseOverLog2FC p: $_mouseOverPvalue FDR: $_mouseOverFdr
     skipFields _mouseOverLog2FC,_mouseOverPvalue,_mouseOverFdr
-    priority 2
+    labelFields name,cellLine,disease
+    defaultLabelFields name
+    filterValues.cellLine GM12878,PC3,HepG2,K562,Jurkat,HEK293FT,HEK293T,SKNSH,HMEC,HNPS,HEK293s,HEL92.1.7,Neuro-2a,MIN6,NIH/3T3,N2A,SH-SY5Y,SF7996,HaCaT,HeLa,LNCaP,MOLP8,L363,Saos-2,C283T,UACC903,BLA,CE,NAC,SFC
+    filter.fdr 0:1
+    filterByRange.fdr on
+    filterLimits.fdr 0:1
+    filterLabel.fdr Filter by false discovery rate
+    filter.log2FC -5:5
+    filterByRange.log2FC on
+    filterLimits.log2FC -5:5
+    filterLabel.log2FC Filter by log2 fold change (alt vs ref)
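
A stale bigDataUrl like the one corrected in this diff can be caught before hgTrackDb runs. The following is a rough sketch of such a pre-check, not an existing kent tool: parse_ra and missing_big_data are hypothetical helper names, and the parser is deliberately simplified (blank-line-separated stanzas, no backslash continuations or include directives).

```python
import os


def parse_ra(text):
    """Split trackDb .ra text into stanzas (dicts of setting -> value).

    Simplified: stanzas are separated by blank lines; '#' comment lines
    are skipped; line continuations and 'include' are not handled.
    """
    stanzas, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            if current:
                stanzas.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        current[key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def missing_big_data(stanzas, exists=os.path.exists):
    """Return (track, path) pairs whose local bigDataUrl file is absent."""
    missing = []
    for stanza in stanzas:
        url = stanza.get("bigDataUrl")
        # Only local absolute paths are checked; remote URLs are ignored.
        if url and url.startswith("/") and not exists(url):
            missing.append((stanza.get("track"), url))
    return missing


if __name__ == "__main__":
    with open("mpra.ra") as handle:  # path is an assumption
        for track, path in missing_big_data(parse_ra(handle.read())):
            print(f"{track}: bigDataUrl not found: {path}")
```

The `exists` parameter is injectable so the check can be exercised against a known set of paths without touching /gbdb.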