17b7d3c37be41135afaf8e91e365e3847af96ca5 lrnassar Mon Jun 22 10:56:56 2026 -0700

Add TAD (topologically associating domains) track set on hg19, hg38, mm10, mm39. refs #21599

New "tads" superTrack collecting published TAD calls, alpha-gated via include tad.ra alpha in each assembly's trackDb.ra.

hg38 (all five sources): Dixon 2012 domains, Schmitt 2016 boundaries, McArthur & Capra 2021 boundary stability, ENCODE contact domains (faceted composite over 117 biosamples), and 3D Genome Browser 2.0 domains (faceted composite over 464 datasets).

hg19: the three sources with hg19-compatible data (Dixon, Schmitt, McArthur).

mm10/mm39 (domains only; the boundary sources have no mouse data): Dixon, ENCODE (faceted, 16 biosamples), and 3D Genome Browser (faceted, 30 datasets); mm39 lifted from mm10, lift noted in the long labels.

Faceted composites are organ-colored from a TAD-owned organ_colors.json symlinked into /gbdb//bbi/tad/. Build scripts and autoSql are version-controlled under makeDb/scripts/tad/ and symlinked into the per-source build dirs. Provenance and fetch for every dataset are documented in the makedocs (doc/hg38/tad.txt, doc/mm10/tad.txt, doc/mm39/tad.txt, and the hg19 TAD section in doc/hg19.txt).

diff --git src/hg/makeDb/trackDb/human/hg38/tads.html src/hg/makeDb/trackDb/human/hg38/tads.html
new file mode 100644
index 00000000000..f171171d054
--- /dev/null
+++ src/hg/makeDb/trackDb/human/hg38/tads.html
@@ -0,0 +1,112 @@
+

Description

+

+This track set displays topologically associating domains (TADs) and TAD +boundaries in the human genome, assembled from several published Hi-C studies. +TADs are self-interacting regions of the genome, typically hundreds of kilobases +to about a megabase in size, and are themselves nested, with smaller contact domains contained within +larger top-level TADs. Their boundaries (frequently bound by CTCF and cohesin) insulate +neighboring regions and constrain enhancer-promoter contacts. Disruption of a TAD boundary +can rewire gene regulation and cause disease, and TADs are widely used to nominate candidate +target genes for non-coding variants. +

+

The set contains five complementary sources:

+ + +

How to Use These Tracks

+

+The domain tracks (Dixon, ENCODE, 3D Genome Browser) answer "are my variant +and a candidate gene in the same TAD?" and help prioritize target genes at +non-coding GWAS loci. The boundary tracks (Schmitt, stability) answer "does my +structural variant disrupt an insulating boundary?" and help interpret +the regulatory impact of deletions, duplications, and inversions. Because domains +are nested and the tracks call them at different scales (ENCODE calls smaller sub-TAD contact domains; Dixon and the 3D Genome Browser +call larger top-level TADs), "which TAD?" has different answers in +different tracks. +
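
The "same TAD?" question can be sketched as a simple interval lookup. This is a minimal illustration, not part of the track set: the function names and all coordinates are made up, and domains are assumed to be (chrom, start, end) tuples in 0-based, half-open BED coordinates.

```python
from typing import List, Tuple

# A domain is (chrom, start, end), 0-based half-open, as in BED.
Domain = Tuple[str, int, int]


def containing_domains(domains: List[Domain], chrom: str, pos: int) -> List[Domain]:
    """Return every domain that contains the 0-based position `pos`."""
    return [d for d in domains if d[0] == chrom and d[1] <= pos < d[2]]


def shared_domains(domains: List[Domain], chrom: str,
                   variant_pos: int, gene_tss: int) -> List[Domain]:
    """Return the domains that contain both the variant and the gene's TSS."""
    with_variant = set(containing_domains(domains, chrom, variant_pos))
    with_gene = set(containing_domains(domains, chrom, gene_tss))
    return sorted(with_variant & with_gene)


# Made-up coordinates, for illustration only.
example_domains = [
    ("chr1", 1_000_000, 2_000_000),  # larger top-level domain
    ("chr1", 1_200_000, 1_450_000),  # smaller nested contact domain
    ("chr1", 2_040_000, 2_800_000),
]

same = shared_domains(example_domains, "chr1", 1_300_000, 1_900_000)
print(same)
```

In this toy input the variant lies in both the large and the nested domain, while the gene lies only in the large one, so the pair shares a TAD at the top-level scale but not at the sub-TAD scale.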

+ +

Display Conventions and Configuration

+

+Each source is shown as a separate track because TAD calls are not directly +comparable across studies: different algorithms (directionality index/HMM, +insulation score, Arrowhead) and resolutions (5–100 kb) produce different calls +for the same underlying biology. Domains are drawn as boxes spanning each +self-interacting region; boundaries are drawn as the short bins that divide +adjacent domains. Because calls are made on binned data, domain edges are uncertain to +roughly the caller's bin size (from a few kilobases for the ENCODE 5 kb calls up to about +±50 kb for the 100 kb stability bins), and the bin width of a boundary feature +reflects this localization precision, not a measured physical width. Domains do not +tile the genome end to end; the gaps between domain boxes are inter-domain or unorganized +regions, not display artifacts. The ENCODE and 3D Genome Browser tracks each +contain many biosamples and are browsable with a faceted selector on their track +configuration pages; a small default set is shown and the rest can be enabled through the +facets. +
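
One way to account for this binning uncertainty when asking whether a structural variant touches a boundary is to widen the boundary by a padding before testing overlap. The sketch below assumes half-open coordinates and uses one bin width as the padding; that choice, the function name, and the coordinates are illustrative assumptions, not a rule defined by these tracks.

```python
def overlaps_boundary(sv_start: int, sv_end: int,
                      b_start: int, b_end: int, pad: int = 0) -> bool:
    """True if the half-open interval [sv_start, sv_end) overlaps the
    boundary [b_start, b_end) after widening it by `pad` bases per side."""
    padded_start = max(0, b_start - pad)
    padded_end = b_end + pad
    return sv_start < padded_end and sv_end > padded_start


# Made-up example: a 40 kb boundary bin and a 10 kb deletion just past it.
boundary = (5_000_000, 5_040_000)
deletion = (5_050_000, 5_060_000)

strict = overlaps_boundary(*deletion, *boundary)             # no padding
padded = overlaps_boundary(*deletion, *boundary, pad=40_000) # one bin of padding
print(strict, padded)
```

Here the deletion misses the drawn bin but falls within one bin width of it, so it may still be worth inspecting in the underlying contact map.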

+ +

Methods

+

+See the individual subtrack description pages for full methods, source publications, and +assembly/liftOver details for each dataset. In brief: Dixon domains were called with the +directionality-index HMM at 40 kb; Schmitt boundaries with the insulation-score method at +40 kb; ENCODE contact domains with Arrowhead (Juicer) on the ENCODE uniform Hi-C +pipeline; the 3D Genome Browser domains are that resource's own per-dataset TAD calls +(25 kb) across 464 human datasets, shown verbatim (format normalization only); and the +boundary-stability track counts, per 100 kb window, how many of 37 re-processed cell-type +maps share a boundary (McArthur & Capra 2021). +
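
The idea behind the boundary-stability count can be illustrated with a toy windowed tally. This is a sketch under stated assumptions and not the McArthur & Capra pipeline: the input layout (map name to a list of boundary positions), the function name, and the data are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

WINDOW = 100_000  # window size in bases, matching the 100 kb windows described above


def boundary_stability(
    maps: Dict[str, Iterable[Tuple[str, int]]]
) -> Counter:
    """Count, per (chrom, window_start), how many maps have a boundary there."""
    counts: Counter = Counter()
    for boundaries in maps.values():
        # Use a set so a map with several boundaries in one window counts once.
        windows = {(chrom, (pos // WINDOW) * WINDOW) for chrom, pos in boundaries}
        counts.update(windows)
    return counts


# Made-up boundary positions for three hypothetical cell-type maps.
toy_maps = {
    "mapA": [("chr1", 1_230_000), ("chr1", 1_260_000), ("chr1", 4_010_000)],
    "mapB": [("chr1", 1_290_000)],
    "mapC": [("chr1", 4_099_999), ("chr2", 1_250_000)],
}

stability = boundary_stability(toy_maps)
for window, n in sorted(stability.items()):
    print(window, n)
```

With 37 maps as input, the per-window count would range from 1 to 37, with higher values marking boundaries shared across more cell types.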

+ +

Data Access

+

+The raw data can be explored interactively with the +Table Browser or the +Data Integrator. For programmatic access, +these tracks can be queried through the Genome Browser's +REST API. +The underlying bigBed files can be downloaded from our +download server. +
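
A REST API query might look like the sketch below, which builds a request against the public `/getData/track` endpoint at `api.genome.ucsc.edu`. The track name `exampleTadTrack` is a placeholder, not a real track in this set; substitute the name shown on the relevant subtrack's configuration page. The fetch helper is defined but not called, since it needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome: str, track: str, chrom: str, start: int, end: int) -> str:
    """Build a /getData/track URL for one region of one track."""
    query = urlencode({
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    })
    return f"{API_BASE}/getData/track?{query}"


def fetch_track_data(url: str) -> dict:
    """Fetch the URL and decode the JSON response (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


# "exampleTadTrack" is a hypothetical placeholder track name.
url = track_data_url("hg38", "exampleTadTrack", "chr1", 1_000_000, 2_000_000)
print(url)
```

Calling `fetch_track_data(url)` with a real track name would return the items in the requested region as a JSON object.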

+ +

References

+

+Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, Ren B. +Topological domains in mammalian genomes identified by analysis of chromatin +interactions. Nature. 2012;485(7398):376-80. +doi:10.1038/nature11082 +

+

+McArthur E, Capra JA. Topologically associating domain boundaries that are stable across +diverse cell types are evolutionarily constrained and enriched for heritability. +Am J Hum Genet. 2021;108(2):269-283. +doi:10.1016/j.ajhg.2021.01.001 +

+

+Rao SS, Huntley MH, Durand NC, Stamenova EK, et al. +A 3D map of the human genome at kilobase resolution reveals principles of chromatin +looping. Cell. 2014;159(7):1665-80. +doi:10.1016/j.cell.2014.11.021 +

+

+Schmitt AD, Hu M, Jung I, Xu Z, et al. +A Compendium of Chromatin Contact Maps Reveals Spatially Active Regions in the Human +Genome. Cell Rep. 2016;17(8):2042-2059. +doi:10.1016/j.celrep.2016.10.061 +

+

+Yu S, Fu Y, Wong JH, Wang J, Zhao H, Zhao J, Yue F. +The 3D Genome Browser 2.0: an enhanced online platform for visualizing and analyzing 3D +genome architecture. Nucleic Acids Res. 2026;54(D1):D48-D54. +doi:10.1093/nar/gkaf1109 +