63016a5603d71b06864a89f83b97921feef7f1e6
lrnassar
Mon Jul 18 15:08:49 2022 -0700
Removing whole genome queries for JASPAR and expanding the data access section, refs #29697

diff --git src/hg/makeDb/trackDb/human/jaspar.html src/hg/makeDb/trackDb/human/jaspar.html
index 5466718..0533f24 100644
--- src/hg/makeDb/trackDb/human/jaspar.html
+++ src/hg/makeDb/trackDb/human/jaspar.html
@@ -110,39 +110,64 @@
 (version 4.11.2) (Bailey <em>et al.</em> 2009). For scanning genomes with the BioPerl TFBS module,
 profiles were converted to PWMs and matches were kept with a relative score ≥ 0.8. For the FIMO
 scan, profiles were reformatted to MEME motifs and matches with a p-value < 0.05 were kept.
 TFBS predictions that were not consistent between the two methods (TFBS Perl module and FIMO)
 were removed. The remaining TFBS predictions were colored according to their FIMO p-value to
 allow for comparison of prediction confidence between different profiles.</p>
 <p>
 Please refer to the JASPAR 2022, 2020, and 2018 publications for more details (citation below).</p>
 <h2>Data Access</h2>
 <p>
-JASPAR Transcription Factor Binding data can be explored interactively with the
+JASPAR Transcription Factor Binding data includes billions of items. Limited regions can
+be explored interactively with the
 <a href="../cgi-bin/hgTables">Table Browser</a> and cross-referenced with
-<a href="../cgi-bin/hgIntegrator">Data Integrator</a>. For programmatic access,
+<a href="../cgi-bin/hgIntegrator">Data Integrator</a>, although positional
+queries that are too big can lead to timing out, resulting in a blank page
+or truncated output. In this case you may try reducing the chromosomal query to
+a smaller window.</p>
+<p>
+For programmatic access,
 the track can be accessed using the Genome Browser's
 <a href="../../goldenPath/help/api.html">REST API</a>.
 JASPAR annotations can be downloaded from the
 <a href="http://hgdownload.soe.ucsc.edu/gbdb/$db/jaspar">Genome Browser's download server</a>
 as a bigBed file.
 This compressed binary format can be remotely queried through
 command line utilities. Please note that some of the download files can be quite large.</p>
+<p>
+The utilities for working with bigBed-formatted binary files can be downloaded
+<a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads"
+ target=_blank>here</a>.
+Run a utility with no arguments to see a brief description of the utility and its options.
+<ul>
+ <li><b>bigBedInfo</b> provides summary statistics about a bigBed file including the number of
+ items in the file. With the <b>-as</b> option, the output includes an
+ autoSql
+ definition of data columns, useful for interpreting the column values.</li>
+ <li><b>bigBedToBed</b> converts the binary bigBed data to tab-separated text.
+ Output can be restricted to a particular region by using the -chrom, -start
+ and -end options.</li>
+</ul>
+</p>
+
+<h4>Example: retrieve all JASPAR items in chr1:200001-200400</h4>
+
+<pre><tt>bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/$db/jaspar/JASPAR2022.bb -chrom=chr1 -start=200000 -end=200400 stdout</tt></pre>
 <p>
 All data are freely available. Additional resources are available directly from the JASPAR group:</p>
 <ul>
 <li>Binding site predictions for all and individual TF profiles are available for download at
 <a href="http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/"
 target="_blank">http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/</a>.</li>
 <li>Code and data used to create the UCSC tracks are available at
 <a href="https://github.com/wassermanlab/JASPAR-UCSC-tracks" target="_blank">
 https://github.com/wassermanlab/JASPAR-UCSC-tracks</a>.</li>
 <li>The underlying JASPAR motif data is available through the JASPAR website at
 <a href="https://jaspar.genereg.net/" target="_blank">https://jaspar.genereg.net/</a>.</li>
 </ul>
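
Note on the example added in this diff: the heading names the region as chr1:200001-200400 (1-based, end-inclusive, as shown in the browser position box), while the command passes -start=200000 -end=200400 (0-based start, as in BED coordinates). A minimal Python sketch of that conversion follows. The helper names position_to_bigbed_args and build_command are hypothetical and not part of the UCSC utilities, and the hg38 URL is an assumed substitution for the $db placeholder used in the page.

```python
import re


def position_to_bigbed_args(position):
    """Convert a browser-style position such as 'chr1:200001-200400'
    (1-based, end-inclusive) into bigBedToBed region options.

    Hypothetical helper for illustration; not part of the UCSC tools.
    """
    match = re.fullmatch(r"([^:\s]+):([\d,]+)-([\d,]+)", position.strip())
    if match is None:
        raise ValueError(f"unrecognized position: {position!r}")
    chrom = match.group(1)
    # Browser positions may contain thousands separators, e.g. 200,001.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in position: {position!r}")
    # BED-style coordinates are 0-based and half-open:
    # subtract 1 from the start and leave the end unchanged.
    return [f"-chrom={chrom}", f"-start={start - 1}", f"-end={end}"]


def build_command(bigbed_url, position, output="stdout"):
    """Assemble a bigBedToBed argument list for a single region."""
    return ["bigBedToBed", bigbed_url, *position_to_bigbed_args(position), output]


if __name__ == "__main__":
    # Assumed URL: hg38 substituted for the $db placeholder.
    url = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/jaspar/JASPAR2022.bb"
    print(" ".join(build_command(url, "chr1:200001-200400")))
```

Keeping the region small, as the revised text recommends for interactive queries, also keeps the remote bigBed query and its output manageable.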