9de039a7dceb056ccfa604e0ac38e0bb901ef1ec max Mon Mar 30 17:11:20 2026 -0700 MPRA track updates, #34284 diff --git src/hg/makeDb/trackDb/human/hg38/mprabase.html src/hg/makeDb/trackDb/human/hg38/mprabase.html new file mode 100644 index 00000000000..837554a8c0c --- /dev/null +++ src/hg/makeDb/trackDb/human/hg38/mprabase.html @@ -0,0 +1,129 @@ +
+Massively Parallel Reporter Assays (MPRAs) and related methods such as STARR-seq +enable quantitative testing of thousands of candidate regulatory DNA sequences in +parallel by linking each sequence to a reporter gene and measuring transcriptional +output using sequencing. +
+ ++The MPRA Base track shows 41,275 experimentally tested cis-regulatory elements +from the MPRA Base +database +(Zhao et al., 2023). +The database integrates data from multiple studies, assay platforms (lentiMPRA, +plasmidMPRA, STARR-seq, CRE-seq, and others), and cell types while preserving +experiment-level resolution. Only elements derived from genomic fragments that can +be mapped to the reference genome are included; synthetic or designed oligonucleotide +libraries without genomic coordinates are excluded. +
+ ++Each item represents a genomic fragment tested within a specific experiment, defined +as a unique combination of cell line, assay type, and publication (PMID). The same +genomic region may appear multiple times if tested in different experiments. +
+ ++Items are colored by percentile rank of the mean raw activity score within each experiment: +
++The mouse-over shows the cell line, assay type, raw activity score, percentile rank, +and citation for each element. +
+ ++Within each experiment, replicate measurements for the same genomic fragment were +aggregated by computing the mean raw activity score. The original dataset contained +211,053 replicate-level measurements; after aggregation, the final track contains +41,275 unique experiment-level genomic elements. +
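The aggregation step described above can be sketched as follows. This is a minimal illustration, not the pipeline used to build the track: the record field names (`pmid`, `cell_line`, `assay`, `chrom`, `start`, `end`, `score`) and the function name are hypothetical, and the sketch assumes replicates are identified by identical coordinates within one experiment.

```python
from collections import defaultdict
from statistics import mean


def aggregate_replicates(rows):
    """Average replicate scores per genomic fragment within each experiment.

    `rows` is an iterable of dicts with the hypothetical keys
    pmid, cell_line, assay, chrom, start, end, score.
    Returns a dict mapping (experiment, fragment) -> mean raw activity score,
    where experiment = (pmid, cell_line, assay) and fragment = (chrom, start, end).
    """
    groups = defaultdict(list)
    for row in rows:
        experiment = (row["pmid"], row["cell_line"], row["assay"])
        fragment = (row["chrom"], row["start"], row["end"])
        groups[(experiment, fragment)].append(row["score"])
    # One mean per (experiment, fragment) pair; the same fragment tested in
    # another experiment stays a separate item.
    return {key: mean(scores) for key, scores in groups.items()}
```

Keying on both the experiment and the fragment is what lets the same genomic region appear more than once in the track when it was tested in different experiments.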
+ ++Elements are ranked by mean raw activity score independently within each experiment, +and a percentile rank (0–100) is computed per experiment to avoid cross-study +distortions caused by differing assay dynamic ranges. +
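A per-experiment percentile rank could be computed along these lines. The exact formula and tie handling used for the track are not stated here, so this sketch makes its own assumptions: the percentile is 100 times the number of strictly lower scores divided by (n - 1), and an experiment with a single element is arbitrarily given 100. All names are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def percentile_ranks(mean_scores):
    """Rank mean scores within each experiment on a 0-100 scale.

    `mean_scores` maps (experiment, element) -> mean raw activity score.
    Returns a dict with the same keys mapped to a percentile rank.
    Assumed formula: 100 * (number of strictly lower scores) / (n - 1).
    """
    by_experiment = defaultdict(list)
    for (experiment, _element), score in mean_scores.items():
        by_experiment[experiment].append(score)
    sorted_scores = {exp: sorted(scores) for exp, scores in by_experiment.items()}

    ranks = {}
    for (experiment, element), score in mean_scores.items():
        scores = sorted_scores[experiment]
        n = len(scores)
        if n == 1:
            # Assumption: a lone element gets the top rank.
            ranks[(experiment, element)] = 100.0
        else:
            lower = bisect_left(scores, score)  # count of strictly lower scores
            ranks[(experiment, element)] = 100.0 * lower / (n - 1)
    return ranks
```

Because ranking never crosses experiment boundaries, a score of 3 in an assay with a narrow dynamic range and a score of 1000 in an assay with a wide one can both map to the 100th percentile.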
+ ++The following table lists the experiments represented in this track. +
| PMID | Author | Year | Lab | Cell type | Assay | Elements |
|---|---|---|---|---|---|---|
| 27831498 | Inoue et al. | 2017 | Shendure Lab | HepG2 | lentiMPRA | 2,241 |
| 30045748 | Klein et al. | 2018 | Shendure Lab | HepG2 | STARR-seq | 7,064 |
| 32483191 | Choi et al. | 2020 | Brown Lab | HEK293FT | lentiMPRA | 840 |
| 32483191 | Choi et al. | 2020 | Brown Lab | UACC903 | lentiMPRA | 840 |
| 32819422 | Mattioli et al. | 2020 | Mele Lab | HUES64 | plasmidMPRA | 6,954 |
| 32819422 | Mattioli et al. | 2020 | Mele Lab | mESC | plasmidMPRA | 6,954 |
| 33046894 | Klein et al. | 2020 | Shendure Lab | HepG2 | lentiMPRA | 8,116 |
| 33046894 | Klein et al. | 2020 | Shendure Lab | HepG2 | plasmidMPRA | 2,228 |
| 33046894 | Klein et al. | 2020 | Shendure Lab | HepG2 | STARR-seq | 2,230 |
| 36834916 | Koesterich et al. | 2023 | Kreimer Lab | NPC | lentiMPRA | 3,807 |
+The data can be explored interactively in table format with the +Table Browser or the +Data Integrator +and exported from there as spreadsheet or tab-separated files. +From scripts, the data can be accessed through our +API using track=mprabase. +
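As an illustration of script access, the sketch below builds a request against the UCSC REST API `getData/track` endpoint with `track=mprabase`. The helper names are hypothetical, the semicolon-separated query string follows the style of the API's documented examples, and the layout of the returned JSON is not described on this page, so the response is returned undecoded beyond JSON parsing.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_ENDPOINT = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(chrom, start, end, genome="hg38", track="mprabase"):
    """Build a getData/track URL for one genomic range."""
    params = [
        ("genome", genome),
        ("track", track),
        ("chrom", chrom),
        ("start", start),
        ("end", end),
    ]
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params)
    return f"{API_ENDPOINT}?{query}"


def fetch_track(chrom, start, end):
    """Fetch the range and return the parsed JSON (requires network access)."""
    with urlopen(build_track_url(chrom, start, end)) as response:
        return json.load(response)
```

For example, `build_track_url("chr21", 0, 100000)` yields a URL that can also be pasted into a browser to inspect the response by hand.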
++For automated download and analysis, the genome annotation is stored in a bigBed +file that can be downloaded from +our download server. +The file for this track is called mprabase.bb. Individual +regions or the whole genome annotation can be obtained using our tool +bigBedToBed, which can be compiled from the source code or downloaded as a +precompiled binary for your system. Instructions for downloading source code and +binaries can be found +here. +The tool can also be used to obtain features within a given range, e.g. +bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/mpra/mprabase/mprabase.bb -chrom=chr21 -start=0 -end=100000000 stdout +
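The same range query can be wrapped in a script. The sketch below assumes `bigBedToBed` is installed and on the `PATH`; the function names are hypothetical. Only the three standard BED coordinate columns are interpreted, because the track's remaining fields are not described here and are kept as raw strings.

```python
import subprocess

BIGBED_URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/mpra/mprabase/mprabase.bb"


def build_command(chrom, start, end, url=BIGBED_URL):
    """Return the bigBedToBed argument list for one genomic range."""
    return [
        "bigBedToBed",
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",
    ]


def parse_bed_line(line):
    """Split one tab-separated BED line into coordinates plus raw extra fields."""
    fields = line.rstrip("\n").split("\t")
    return {
        "chrom": fields[0],
        "start": int(fields[1]),
        "end": int(fields[2]),
        "extra": fields[3:],  # uninterpreted track-specific columns
    }


def fetch_region(chrom, start, end):
    """Run bigBedToBed and parse its output (requires the binary and network)."""
    result = subprocess.run(
        build_command(chrom, start, end),
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_bed_line(line) for line in result.stdout.splitlines() if line.strip()]
```

Calling `fetch_region("chr21", 0, 100000000)` runs the same command as the example above and returns one dictionary per element.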
++The original data can be downloaded from the +MPRA Base web application. +
+ ++Thanks to Varda Singhal, Jianyu Zhao, and the +Ahituv Lab +at the University of California San Francisco for creating and curating MPRA Base and for creating this track. +
+ ++Zhao J, Baltoumas FA, Konnaris MA, Mouratidis I, Liu Z, Sims J, Agarwal V, Pavlopoulos GA, +Georgakopoulos-Soares I, Ahituv N. + +MPRAbase: A Massively Parallel Reporter Assay Database. +bioRxiv. 2023 Nov 22. +PMID: 38045264; PMC: PMC10690217 +
+