290ecad2fd65b22099441fdc740cf8f7ea163009
dschmelt
Fri Feb 19 14:48:22 2021 -0800
Adding data access refs #27021

diff --git src/hg/makeDb/trackDb/human/encRegTfbsClustered.html src/hg/makeDb/trackDb/human/encRegTfbsClustered.html
index 6767c1a..1790455 100644
--- src/hg/makeDb/trackDb/human/encRegTfbsClustered.html
+++ src/hg/makeDb/trackDb/human/encRegTfbsClustered.html
@@ -35,40 +35,52 @@
cell types contributing to the cluster and how many cell types were assayed for the factor (count where detected / count where assayed). For brevity in the display, each cell type is abbreviated to a single letter. The darkness of the letter is proportional to the signal strength observed in the cell line. Abbreviations starting with capital letters designate ENCODE cell types initially identified for intensive study, while those starting with lowercase letters designate cell lines added later in the project.

Click on a peak cluster to see more information about the TF/cell assays contributing to the cluster and the cell line abbreviation table.

Methods

Peaks of transcription factor occupancy ("optimal peak set") from ENCODE ChIP-seq datasets were clustered using the UCSC hgBedsToBedExps tool. Scores were assigned to peaks by multiplying the input signal values by a normalization factor calculated as the ratio of the maximum score value (1000) to the signal value at one standard deviation above the mean, with values exceeding 1000 capped at 1000. This has the effect of distributing scores up to the mean plus one standard deviation across the score range, while assigning the maximum score to all peaks above that value. The cluster score is the highest score for any peak contributing to the cluster.
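
The scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the hgBedsToBedExps implementation: the function names, the use of the population standard deviation, and the rounding to integers are all assumptions made for the example.

```python
from statistics import mean, pstdev

MAX_SCORE = 1000  # maximum score value stated in the Methods text


def normalize_scores(signals):
    """Scale signal values so that (mean + 1 SD) maps to MAX_SCORE.

    Assumptions (not confirmed by the source): population standard
    deviation, and rounding of scaled values to the nearest integer.
    """
    cutoff = mean(signals) + pstdev(signals)  # signal value at mean + 1 SD
    # Multiply by the normalization factor MAX_SCORE / cutoff, then cap.
    return [min(MAX_SCORE, round(s * MAX_SCORE / cutoff)) for s in signals]


def cluster_score(peak_scores):
    """The cluster score is the highest score of any contributing peak."""
    return max(peak_scores)


# Hypothetical signal values: mean is 5.0 and population SD is 2.0,
# so every signal at or above 7.0 receives the maximum score.
signals = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scores = normalize_scores(signals)
print(scores)
print(cluster_score(scores))
```

With these made-up inputs, signals below the cutoff are spread across the 0 to 1000 range in proportion to their value, and the two signals at or above the cutoff share the maximum score.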

Data Access

+The raw data for the ENCODE3 TF Clusters track can be accessed from the
+Table Browser or combined with other datasets through the
+Data Integrator. This data is stored internally as a BED5+1 MySQL table with additional
+metadata tables. For automated analysis and download, the track data file can be downloaded
+from our
+downloads server. It can also be queried using the
+JSON API or Public SQL commands.
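
As a rough illustration of the JSON API route, the sketch below builds a request against the UCSC REST API `getData/track` endpoint and, optionally, fetches it. The track name `encRegTfbsClustered` is taken from the file name in this diff; the assembly (`hg38`), the example region, and the helper names are assumptions for illustration, so check them against the assembly you are actually using.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(genome, track, chrom, start, end):
    """Build a getData/track URL for one genomic region."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}?{urlencode(params)}"


def fetch_track(genome, track, chrom, start, end, timeout=30):
    """Fetch the region and return the decoded JSON response.

    Requires network access; not called at import time.
    """
    url = build_track_url(genome, track, chrom, start, end)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Assumed assembly and an arbitrary example region.
url = build_track_url("hg38", "encRegTfbsClustered", "chr1", 1000000, 1010000)
print(url)
```

Calling `fetch_track("hg38", "encRegTfbsClustered", "chr1", 1000000, 1010000)` should return a dictionary of cluster records for the region. Inspect the keys of the response rather than relying on a fixed layout.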

Credits

Thanks to the ENCODE Consortium, the ENCODE ChIP-seq production laboratories, and the ENCODE Data Coordination Center for generating and processing the TF ChIP-seq datasets used here. The ENCODE accession numbers of the constituent datasets are available from the peak details page. Special thanks to Henry Pratt, Jill Moore, Michael Purcaro, and Zhiping Weng, PI, at the ENCODE Data Analysis Center (ZLab at UMass Medical Center) for providing the peak datasets, metadata, and guidance in developing this track.

The integrative view presented here was developed by Jim Kent at UCSC.

References

ENCODE Project Consortium.