d0b655a92d2b8ebfd006699ecc0dc888ca2cc1e5 kate Wed May 20 16:52:35 2020 -0700 Polish schema, track description, and config based on comprehensive review and input by ENCODE DAC. refs #24668 diff --git src/hg/makeDb/trackDb/encodeCcreCombined.html src/hg/makeDb/trackDb/encodeCcreCombined.html deleted file mode 100644 index b359eef..0000000 --- src/hg/makeDb/trackDb/encodeCcreCombined.html +++ /dev/null @@ -1,164 +0,0 @@ -<h2>Description</h2> -<p> -This track displays the <em>ENCODE Registry of candidate cis-Regulatory Elements</em> (cCREs) -in the genome, identified and classified by the ENCODE Data Analysis Center according to -regulatory effect. -cCREs are the subset of DNase hypersensitivity sites clustered across all samples -that are supported by either histone modifications (H3K4me3 and H3K27ac) or CTCF-binding data. -The cCRE dataset is the core of the integrative level of epigenomic and transcriptomic -annotations produced by ENCODE. -The registry currently comprises a total of 926,535 elements in the human genome and 339,815 -in the mouse genome (less comprehensively assayed). -</p> -Additional exploration of the cCREs and underlying raw ENCODE data is provided by the -<a target="_blank" href="https://screen.wenglab.org/"> -SCREEN</a> -(Search Candidate cis-Regulatory Elements) web tool, -designed specifically for the registry, and accessible by linkouts from the track details page. - -<!-- -<p> -The related cCREs by Biosample composite track presents cCREs -and associated epigenetic signal in all individual biosamples in a large matrix. -Additional views of the data are provided by the <a>ENCODE Integrative Megahub</a>. 
-</p> ---> - -<h2>Display Conventions and Configuration</h2> -<p> -cCREs are colored and labeled according to classification by regulatory signature: -<p> -<table cellpadding='2'> - - <tr> - <th style="border-bottom: 2px solid;">Color</th> - <th style="border-bottom: 2px solid;"></th> - <th style="border-bottom: 2px solid;">Label</th> - <th style="border-bottom: 2px solid;">Classification</th> - <th style="border-bottom: 2px solid;"></th> - </tr> - </thead> - -<tr><td style='background-color: red;'></td><td>red</td> - <td>prom</td> - <td>promoter-like</td> - <td>PLS</td></tr> -<tr><td style='background-color: orange;'></td><td>orange</td> - <td>enhP</td> - <td>proximal enhancer-like</td> - <td>pELS</td></tr> -<tr><td style='background-color: yellow;'></td><td>yellow</td> - <td>enhD</td> - <td>distal enhancer-like</td> - <td>dELS</td></tr> -<tr><td style='background-color: blue;'></td><td>blue</td> - <td>CTCF</td> - <td>CTCF-only</td> - <td>CTCF-only</td></tr> -<tr><td style='background-color: #ffa0a0;'></td><td>pink</td> - <td>K4m3</td> - <td>DNase-H3K4me3</td> - <td>DNase-H3K4me3</td></tr> -</table> -</p> -<p> -The DNase-H3K4me3 elements are those with a promoter-like biochemical signature that -are not within 200 bp of an annotated TSS. -</p> - -<h2>Methods</h2> -<p> -All individual DNase hypersensitivity sites (DHSs) identified from DNase-seq experiments -(in human, a total of 93 million sites from 706 experiments) were iteratively clustered -and filtered for highest signal across all experiments, producing -representative DHSs (rDHSs), with a total of 2.2 million such sites in human. -The highest signal elements from this set that were also supported by high H3K4me3, H3K27ac -and/or CTCF ChIP-seq signals were designated cCREs (a total of 926,535 in human). -</p> -<p> -Classification of cCREs was performed based on the following criteria: -<p> -<p> -1. 
cCREs with promoter-like signatures (cCRE-PLS) fall within 200 bp of -an annotated GENCODE TSS and have high DNase and H3K4me3 signals. -</p> -<p> -2. cCREs with enhancer-like signatures (cCRE-ELS) have high DNase and H3K27ac signals, -and must also have a low H3K4me3 max-Z score if they are within 200 bp of an annotated TSS. -The subset of cCRE-ELS within 2 kb of a TSS is denoted proximal (cCRE-pELS), -while the remaining subset is denoted distal (cCRE-dELS). -</p> - -<p> -3. DNase-H3K4me3 cCREs have high H3K4me3 max-Z scores but low H3K27ac max-Z scores -and do not fall within 200 bp of a TSS. -</p> - -<p> -4. CTCF-only cCREs have high DNase and CTCF and low H3K4me3 and H3K27ac. -</p> - -<img style='margin-left: 40px;' height=229 width=371 src="../images/cCREgroups.png"> - -<p> -For further detail about the identification and classification of ENCODE cCREs see -the <i>About</i> page of the -<a target="_blank" href="https://screen.encodeproject.org">SCREEN</a> web tool. -</p> - -<h2>Data Access</h2> -<p> -The ENCODE accession numbers of the constituent datasets at the -<a target="_blank" href="https://encodeproject.org">ENCODE Portal</a> -are available from the cCRE details page. -</p> -<p> -The data in this track can be interactively explored with the -<a href="../cgi-bin/hgTables">Table Browser</a> or the -<a href="../cgi-bin/hgIntegrator">Data Integrator</a>. -The data can be accessed from scripts through our -<a href="https://api.genome.ucsc.edu">API</a>; the track name is "encodeCcreCombined". - -<p> -For automated download and analysis, this annotation is stored in a bigBed file that -can be downloaded from -<a href="http://hgdownload.soe.ucsc.edu/gbdb/$db/bbi/" target="_blank">our download server</a>. -The file for this track is called <tt>encodeCcreCombined.bb</tt>. -Individual regions or the whole genome annotation can be obtained using our tool -<tt>bigBedToBed</tt>, which can be compiled from the source code or downloaded as a precompiled -binary for your system. 
-Instructions for downloading source code and binaries can be found -<a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads">here</a>. -The tool can also be used to obtain only features within a given range, e.g. -<tt>bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg19/bbi/encodeCcreCombined.bb -chrom=chr21 -start=0 -end=100000000 stdout</tt></p> -</p> - -<h2>Credits</h2> -<p> -This dataset was produced by the ENCODE Data Analysis Center ( -<a target="_blank" href="https://www.umassmed.edu/zlab/">Zlab</a> - at UMass Medical Center). -Thanks to Henry Pratt, Jill Moore, Michael Purcaro, and Zhiping Weng (PI) for providing -this data. -Thanks also to the ENCODE Consortium, the ENCODE production laboratories, -and the ENCODE Data Coordination Center for generating and processing the datasets used here. -</p> - -<h2>References</h2> -<p> -ENCODE Project Consortium. -<a href="https://doi.org/10.1038/nature11247" target="_blank"> -An integrated encyclopedia of DNA elements in the human genome</a>. -<em>Nature</em>. 2012 Sep 6;489(7414):57-74. -PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/22955616" target="_blank">22955616</a>; PMC: <a -href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439153/" target="_blank">PMC3439153</a> -</p> - -ENCODE Project Consortium. -<a href="http://dx.plos.org/10.1371/journal.pbio.1001046" target="_blank"> -A user's guide to the encyclopedia of DNA elements (ENCODE)</a>. -<em>PLoS Biol</em>. 2011 Apr;9(4):e1001046. -PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/21526222" target="_blank">21526222</a>; PMC: <a -href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3079585/" target="_blank">PMC3079585</a> -</p> -
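The four classification criteria in the Methods section of the removed track description can be read as a small decision procedure. The following is a minimal sketch of that reading, not the ENCODE Data Analysis Center's pipeline. The names `Element`, `classify_ccre`, and `HIGH_Z`, the field names, and the numeric cutoff are all assumptions made for illustration; the description only says "high" and "low" and gives no threshold.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff separating "high" from "low" max-Z signal.
# The track description does not state a value; this is an assumption.
HIGH_Z = 1.64


@dataclass
class Element:
    """Illustrative record for one representative DHS (field names are assumptions)."""
    dnase_z: float         # DNase max-Z score
    h3k4me3_z: float       # H3K4me3 max-Z score
    h3k27ac_z: float       # H3K27ac max-Z score
    ctcf_z: float          # CTCF max-Z score
    tss_distance_bp: int   # distance to the nearest annotated TSS, in bp


def classify_ccre(e: Element, high_z: float = HIGH_Z) -> Optional[str]:
    """Return PLS, pELS, dELS, DNase-H3K4me3, CTCF-only, or None if no rule applies."""
    def high(z: float) -> bool:
        return z > high_z

    if not high(e.dnase_z):
        return None  # every class in the description requires high DNase

    near_tss = e.tss_distance_bp <= 200

    # 1. Promoter-like: within 200 bp of a TSS, high DNase and H3K4me3.
    if near_tss and high(e.h3k4me3_z):
        return "PLS"

    # 2. Enhancer-like: high H3K27ac. Within 200 bp of a TSS, H3K4me3 must be
    #    low, which already holds here because rule 1 did not match.
    if high(e.h3k27ac_z):
        return "pELS" if e.tss_distance_bp <= 2000 else "dELS"

    # 3. DNase-H3K4me3: high H3K4me3, low H3K27ac, not within 200 bp of a TSS
    #    (the near-TSS case was taken by rule 1).
    if high(e.h3k4me3_z):
        return "DNase-H3K4me3"

    # 4. CTCF-only: high CTCF with low H3K4me3 and H3K27ac.
    if high(e.ctcf_z):
        return "CTCF-only"

    return None


if __name__ == "__main__":
    example = Element(dnase_z=3.0, h3k4me3_z=0.2, h3k27ac_z=2.5,
                      ctcf_z=0.1, tss_distance_bp=1500)
    print(classify_ccre(example))
```

The rules are checked in the order listed in the description, so each later rule can rely on the earlier ones having failed; an element that satisfies none of them is returned as `None` rather than forced into a class.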