8a3f24f0e70e94dc4812283bbd60c715651d6c46 lrnassar Mon Nov 3 14:50:27 2025 -0800 Staging the ENCODE4 cCREs track. Next step is asking for feedback from the authors, refs #34923 diff --git src/hg/makeDb/trackDb/human/hg38/cCREregistry.html src/hg/makeDb/trackDb/human/hg38/cCREregistry.html new file mode 100644 index 00000000000..bc99e2638b8 --- /dev/null +++ src/hg/makeDb/trackDb/human/hg38/cCREregistry.html @@ -0,0 +1,121 @@ +

Description

+

+This track displays the ENCODE Registry of candidate cis-Regulatory Elements (cCREs) +in the human genome from ENCODE 4. A total of 2,348,854 elements were identified and classified by the +ENCODE Data Analysis Center according to biochemical signatures. Most cCREs are anchored on +DNase hypersensitive sites further annotated with histone modifications (H3K4me3 and H3K27ac) +or CTCF binding measured by ChIP-seq experiments. In this latest version of the Registry (V4), +the representative DNase hypersensitive sites (rDHSs) were supplemented +with 86,748 representative transcription factor ChIP-seq peaks (TF +rPeaks), which are peaks that represent binding sites for at least five TFs. The Registry of cCREs is +one of the core components of the integrative level of the ENCODE Encyclopedia of DNA Elements.

+ +

Additional exploration of the cCREs and underlying raw ENCODE signal data can be done with the +Core Collection track. The data is also available on the SCREEN (Search Candidate cis-Regulatory +Elements) web tool, designed specifically for the Registry, accessible by item mouseovers and linkouts from the +track details page.

+ +

Display Conventions and Configurations

+

+Each cCRE is displayed as a box colored by type. The type reflects the element's putative functional assignment +based on biochemical signatures and genomic context:

+

+Graphic of cCRE classifications

+

+Mousing over the data will display the accession ID, the assigned cCRE class type, and the Max-Z scores +for the various underlying biosignals (DNase, H3K4me3, H3K27ac, CTCF). A track filter is also available +to selectively show items based on their cCRE class type.

+ +

Methods

+

+Candidate cis-regulatory elements (cCREs) were first anchored on nucleosome-sized DNase +hypersensitive sites (rDHSs) identified from DNase-seq data. These rDHSs were then annotated +using ChIP-seq data for histone modifications—H3K4me3 and H3K27ac, marking promoters and +enhancers, respectively—and CTCF, marking insulators. To supplement rDHS-anchored cCRE +definitions, transcription factor ChIP-seq peaks were incorporated, enabling identification +of cCREs even in regions of low chromatin accessibility. Although not used for anchoring, +ATAC-seq data were used to assess chromatin accessibility in biosamples lacking DNase-seq.

+ +

+Classification of cCREs was performed based on the following criteria:

+
    +
  1. Promoter-like signatures (promoter) must fall within 200 bp of a TSS and have high chromatin accessibility and H3K4me3 signals.
  2. TSS-proximal enhancer-like signatures (proximal enhancer) have high chromatin accessibility and H3K27ac signals and are within 2 kb of an annotated TSS. If they are within 200 bp of a TSS, they must also have low H3K4me3 signal.
  3. TSS-distal enhancer-like signatures (distal enhancer) have high chromatin accessibility and H3K27ac signals and are farther than 2 kb from an annotated TSS.
  4. Chromatin accessibility + H3K4me3 (CA-H3K4me3) have high chromatin accessibility and H3K4me3 signals but low H3K27ac signals and do not fall within 200 bp of a TSS.
  5. Chromatin accessibility + CTCF (CA-CTCF) have high chromatin accessibility and CTCF signals but low H3K4me3 and H3K27ac signals.
  6. Chromatin accessibility + transcription factor (CA-TF) have high chromatin accessibility, low H3K4me3, H3K27ac, and CTCF signals, and are bound by a transcription factor.
  7. Chromatin accessibility (CA) have high chromatin accessibility and low H3K4me3, H3K27ac, and CTCF signals.
  8. Transcription factor (TF) have low chromatin accessibility, low H3K4me3, H3K27ac, and CTCF signals, and are bound by a transcription factor.
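The criteria above can be read as a decision procedure. The following is a minimal Python sketch of that reading, not the ENCODE Data Analysis Center's actual pipeline. The `Signals` type, the `classify_ccre` function, and the boolean "high" flags are hypothetical names introduced for illustration. The sketch also assumes that "within 200 bp" and "within 2 kb" are inclusive bounds, and that the numeric cutoffs separating high from low signal are applied upstream, since this page does not state them.

```python
from dataclasses import dataclass
from typing import Optional

PROMOTER_WINDOW_BP = 200    # "within 200 bp of a TSS" (inclusive bound is an assumption)
PROXIMAL_WINDOW_BP = 2000   # "within 2 kb of an annotated TSS" (inclusive bound is an assumption)


@dataclass(frozen=True)
class Signals:
    """Hypothetical per-element summary; thresholds for 'high' are decided upstream."""
    high_ca: bool        # high chromatin accessibility
    high_h3k4me3: bool
    high_h3k27ac: bool
    high_ctcf: bool
    tf_bound: bool       # bound by a transcription factor
    tss_distance_bp: int # distance to the nearest annotated TSS


def classify_ccre(s: Signals) -> Optional[str]:
    """Return the cCRE class implied by the listed criteria, or None if none applies."""
    near_tss = s.tss_distance_bp <= PROMOTER_WINDOW_BP
    proximal = s.tss_distance_bp <= PROXIMAL_WINDOW_BP

    if not s.high_ca:
        # Only the TF class covers low-accessibility elements.
        no_marks = not (s.high_h3k4me3 or s.high_h3k27ac or s.high_ctcf)
        return "TF" if (no_marks and s.tf_bound) else None

    if near_tss and s.high_h3k4me3:
        return "promoter"
    if s.high_h3k27ac:
        # Within 200 bp, H3K4me3 is necessarily low here (else it was a promoter).
        return "proximal enhancer" if proximal else "distal enhancer"
    if s.high_h3k4me3:
        # Not within 200 bp of a TSS at this point, and H3K27ac is low.
        return "CA-H3K4me3"
    if s.high_ctcf:
        return "CA-CTCF"
    return "CA-TF" if s.tf_bound else "CA"
```

The order of the checks matters: testing the promoter rule first is what lets the later branches rely on H3K4me3 being low near a TSS without re-checking it.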
+ +

Data Access

+

+The ENCODE accession numbers of the constituent datasets at the ENCODE Portal are available from the cCRE details page.

+

+The data in this track can be interactively explored with the Table Browser or the Data Integrator. The data can be accessed from +scripts through our API; +the track name is "cCREregistry".
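As a rough illustration of scripted access, the sketch below builds a request URL for the track using the public UCSC REST API's `getData/track` endpoint with semicolon-separated parameters. The helper names `track_url` and `fetch_track` are hypothetical, and whether this staged track is already served by the API is an assumption, so the network call is kept in a separate function.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu"


def track_url(chrom, start, end, genome="hg38", track="cCREregistry"):
    """Build a getData/track URL for one region (hypothetical helper)."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    # The UCSC API examples separate parameters with semicolons.
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params.items())
    return f"{API_BASE}/getData/track?{query}"


def fetch_track(chrom, start, end):
    """Fetch and decode the JSON response; requires network access."""
    with urlopen(track_url(chrom, start, end), timeout=30) as response:
        return json.load(response)
```

For example, `track_url("chr21", 0, 100000)` yields a URL that can be pasted into a browser to inspect the JSON before writing any parsing code.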

+

+For automated download and analysis, this annotation is stored in a bigBed file +that can be downloaded from our download server. +The file for this track is called cCREregistry.bb. Individual regions or the whole genome +annotation can be obtained using our tool bigBedToBed, which can be compiled from the source +code or downloaded as a precompiled binary for your system. Instructions for downloading +source code and binaries can be found here. +The tool can also be used to obtain only features within a given range, e.g.:

+bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/encode4/ccre/cCREregistry.bb -chrom=chr21 -start=0 -end=100000000 stdout
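The same command can be driven from a script. This is a minimal sketch, assuming `bigBedToBed` is installed and on the `PATH`; the helper names are hypothetical. Because this page does not describe the file's extra columns, the parser interprets only the three standard BED fields (chrom, start, end) and returns everything else as uninterpreted strings.

```python
import subprocess

BB_URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/encode4/ccre/cCREregistry.bb"


def bigbed_command(chrom, start, end, url=BB_URL):
    """Assemble the bigBedToBed argument list shown above for one region."""
    return [
        "bigBedToBed",
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",
    ]


def parse_bed(text):
    """Parse tab-separated BED text into (chrom, start, end, extra_fields) tuples."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        records.append((fields[0], int(fields[1]), int(fields[2]), fields[3:]))
    return records


def fetch_region(chrom, start, end):
    """Run bigBedToBed and parse its output; needs the binary and network access."""
    result = subprocess.run(
        bigbed_command(chrom, start, end),
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_bed(result.stdout)
```

Passing the arguments as a list rather than a shell string avoids quoting problems if the URL or chromosome name ever contains unusual characters.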

+ +

Credits

+

+Data were generated by the ENCODE Consortium. The data were further processed for +visualization through a collaborative effort between the Weng lab and the Moore lab at UMass Chan Medical +School (funded by NIH grant HG012343). Integration and visualization were developed +by Drs. Mingshi Gao, Jill Moore, and Zhiping Weng at UMass Chan Medical School, who were +part of the ENCODE Data Analysis Center. We thank the ENCODE production labs +for generating the data.

+ +

References

+

+ENCODE Project Consortium, Moore JE, Purcaro MJ, Pratt HE, Epstein CB, Shoresh N, Adrian J, Kawli T, +Davis CA, Dobin A et al. + +Expanded encyclopaedias of DNA elements in the human and mouse genomes. +Nature. 2020 Jul;583(7818):699-710. +PMID: 32728249; PMC: PMC7410828 +

+

+Moore JE, Pratt HE, Fan K, Phalke N, Fisher J, Elhajjajy SI, Andrews G, Gao M, Shedd N, Fu Y et +al. + +An Expanded Registry of Candidate cis-Regulatory Elements for Studying Transcriptional +Regulation. +bioRxiv. 2024 Dec 26;. +PMID: 39763870; PMC: PMC11703161 +