888e7470c14eeecdca310ed36bb45c3c00ae8052 lrnassar Tue Apr 21 15:14:04 2026 -0700

QA fixes for MPRA superTrack. refs #37359

Fix broken mpraVarDb bigDataUrl: it pointed at /gbdb/hg38/mpra/mpravardb.bb but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing hgTrackDb -strict to silently drop the subtrack.

Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8 in user-visible string fields (curly quotes, primes, NBSP mojibake) that the browser does not transcode, eliminating ~246k non-ASCII occurrences across 42% of rows; and change safe_float / pval_to_score to write NaN and return score 0 for NA / out-of-range p-values instead of 0.0 and score 1000 (previously inflated untested variants to the top of score-sorted views).

trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant mouseOverField, align parent mpra on, add filterValues for cell_line/assay/cellLine and filterByRange sliders for percentile_rank / fdr / log2FC, add labelFields and maxWindowToDraw.

Description pages: add cross-species disclosure (mouse reporter cells used to assay human sequences), update mpraVarDb header to post-liftOver count 239,028 with Studies-table footnote, fix mpraVarDb.html download-server paths, soften imprecise "51 MPRA experiments" claim in mpra.html and mprabase.html.

relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.

Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.

diff --git src/hg/makeDb/trackDb/human/hg38/mpra.html src/hg/makeDb/trackDb/human/hg38/mpra.html
index 2d9d9aefb98..fef166bcacf 100644
--- src/hg/makeDb/trackDb/human/hg38/mpra.html
+++ src/hg/makeDb/trackDb/human/hg38/mpra.html
@@ -1,36 +1,45 @@
Massively Parallel Reporter Assays (MPRAs) are high-throughput experimental methods that use sequencing to measure the transcriptional output of thousands of short DNA sequences. If a mutated version of a sequence is tested as well, the impact of the genetic variant can be quantified.
This track collection brings together results from two MPRA databases, one for the complete sequence fragments, one for the impact of variants in selected fragments:
+
+Note on cell lines: The cell line shown for each element or variant is the
+reporter cell line in which the human sequence was assayed. Several studies used
+mouse cell lines (e.g. Neuro-2a, N2A, NIH/3T3, MIN6, mESC) as reporter systems
+for human regulatory sequences; all items retain human (hg38) coordinates.
+
+See the individual subtrack documentation pages linked above for detailed information on how to download and intersect the annotations.
Thanks to Tao Wang and colleagues at the University of Florida for MPRAVarDB, and to Varda Singhal and the