e00c4ac98aed323e5dba0241250e4ee16369b573 jnavarr5 Mon Jun 3 13:58:55 2024 -0700 Combining all JASPAR tracks to use the same trackDb statement so we can setup automatic updates, refs #32537 diff --git src/hg/makeDb/trackDb/jaspar.html src/hg/makeDb/trackDb/jaspar.html new file mode 100644 index 0000000..c430c75 --- /dev/null +++ src/hg/makeDb/trackDb/jaspar.html @@ -0,0 +1,312 @@ +<h2>Description</h2> +<p> +This track represents genome-wide predicted binding sites for TF +(transcription factor) binding profiles in the +<a href="https://jaspar.genereg.net/about/" target="_blank">JASPAR +CORE collection</a>. This open-source database contains a curated, non-redundant +set of binding profiles derived from published collections of experimentally +defined transcription factor binding sites for eukaryotes.</p> + +<h2>Display Conventions and Configuration</h2> +<p> +Shaded boxes represent predicted binding sites for each of the TF profiles +in the JASPAR CORE collection. The shading of the boxes indicates +the p-value of the profile's match to that position (scaled to scores between +0 and 1000, where 0 corresponds to a p-value of 1 and 1000 to a +p-value ≤ 10<sup>-10</sup>). Thus, the darker the shade, the +lower (better) the p-value.</p> + +<p> +The default view shows only predicted binding sites with scores of 400 or greater; this threshold +can be adjusted in the track settings. Multi-select filters allow viewing of +particular transcription factors. At window sizes greater than +10,000 base pairs, this track switches to density graph mode. 
+Zoom to a smaller region and click into an item to see more detail.</p> + +<p> +<em>From <a href="../../FAQ/FAQformat.html#format1">BED format documentation</a>: + </em> +<table style="box-sizing: border-box; border-collapse: collapse; border-spacing: 0px; border: 2px solid gray; margin-top: 10px; margin-left: 15px; font-size: 13px; color: rgb(0, 0, 0); font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: left; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration-style: initial; text-decoration-color: initial;"> + <tbody style="box-sizing: border-box;"> + <tr style="box-sizing: border-box;"> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">shade</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(226, 226, 226);"> </td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(198, 198, 198);"> </td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(170, 170, 170);"> </td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(141, 141, 141);"> </td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(113, 113, 113);"> </td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(85, 85, 85);"> </td> + 
<td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(56, 56, 56);"> </td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(28, 28, 28);"> </td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(0, 0, 0);"> </td> + </tr> + <tr style="box-sizing: border-box;"> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">score in range</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">≤ 166</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">167-277</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">278-388</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">389-499</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">500-611</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">612-722</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">723-833</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">834-944</td> + <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">≥ 945</td> + </tr> + </tbody> +</table> + +<p><em>Conversion table:</em></p> +<table border="2" style="padding: 10px; border: 1px solid black; 
border-collapse: collapse;"> + <tr> + <td style="padding:10px"><strong>Item score</strong></td> + <td style="padding:10px">0</td> + <td style="padding:10px">100</td> + <td style="padding:10px">131</td> + <td style="padding:10px">200</td> + <td style="padding:10px">300</td> + <td style="padding:10px">400</td> + <td style="padding:10px">500</td> + <td style="padding:10px">600</td> + <td style="padding:10px">700</td> + <td style="padding:10px">800</td> + <td style="padding:10px">900</td> + <td style="padding:10px">1000</td> + </tr> + <tr> + <td style="padding:10px"><strong>p-value</strong></td> + <td style="padding:10px">1</td> + <td style="padding:10px">0.1</td> + <td style="padding:10px">0.049</td> + <td style="padding:10px">10<sup>-2</sup></td> + <td style="padding:10px">10<sup>-3</sup></td> + <td style="padding:10px">10<sup>-4</sup></td> + <td style="padding:10px">10<sup>-5</sup></td> + <td style="padding:10px">10<sup>-6</sup></td> + <td style="padding:10px">10<sup>-7</sup></td> + <td style="padding:10px">10<sup>-8</sup></td> + <td style="padding:10px">10<sup>-9</sup></td> + <td style="padding:10px">≤ 10<sup>-10</sup></td> + </tr> +</table> + +<h2>Methods</h2> +<p> +The JASPAR 2024 update expanded the JASPAR CORE collection by 20% (329 added and 72 upgraded +profiles). The new profiles were introduced after manual curation, in which 26 629 TF binding +motifs were curated and obtained as PFMs or discovered from ChIP-seq/-exo or DAP-seq data. 2500 +profiles from JASPAR 2022 were revised to either promote them to the CORE collection, update the +associated metadata, or remove them because of validation inconsistencies or poor quality. The +JASPAR database stores and focuses mostly on PFMs as the model of choice for TF-DNA interactions. 
+More information on the methods can be found in the +<a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkad1059" target="_blank"> +JASPAR 2024 publication</a> or on the +<a href="https://jaspar.genereg.net/" target="_blank">JASPAR website</a>.</p> + +<p> +JASPAR 2022 contains updated transcription factor binding sites +with additional transcription factor profiles. More information on the methods can be found in the +<a href="https://www.ncbi.nlm.nih.gov/pubmed/34850907" target="_blank"> +JASPAR 2022 publication</a> +or on the +<a href="https://jaspar.genereg.net/" target="_blank">JASPAR website</a>.</p> + +<p> +JASPAR 2020 scanned DNA sequences with JASPAR CORE TF-binding profiles +for each taxon independently using PWMScan. TFBS predictions were selected with +a PWM relative score ≥ 0.8 and a p-value < 0.05. P-values were scaled +between 0 (corresponding to a p-value of 1) and 1000 (p-value ≤ 10<sup>-10</sup>) for +coloring of the genome tracks and to allow for comparison of prediction +confidence between different profiles.</p> + +<p> +JASPAR 2018 used the TFBS Perl module (Lenhard and Wasserman 2002) +and FIMO (Grant, Bailey, and Noble 2011), as distributed within the MEME suite +(version 4.11.2) (Bailey <em>et al.</em> 2009). For scanning genomes with the +BioPerl TFBS module, profiles were converted to PWMs and matches were kept with a +relative score ≥ 0.8. For the FIMO scan, profiles were reformatted to MEME motifs +and matches with a p-value < 0.05 were kept. TFBS predictions that were not +consistent between the two methods (TFBS Perl module and FIMO) were removed. 
The +remaining TFBS predictions were colored according +to their FIMO p-value to allow for comparison of prediction confidence between +different profiles.</p> + +<p> +Please refer to the JASPAR 2024, 2022, 2020, and 2018 publications for more +details (citations below).</p> + +<h2>Data Access</h2> +<p> +JASPAR Transcription Factor Binding data includes billions of items. Limited regions can +be explored interactively with the +<a href="../cgi-bin/hgTables">Table Browser</a> and cross-referenced with +<a href="../cgi-bin/hgIntegrator">Data Integrator</a>, although positional +queries that are too big can time out, resulting in a blank page +or truncated output. In this case, you may try reducing the chromosomal query to +a smaller window.</p> +<p> +For programmatic access, +the track is available through the Genome Browser's +<a href="../../goldenPath/help/api.html">REST API</a>. +JASPAR annotations can be downloaded from the +<a href="http://hgdownload.soe.ucsc.edu/gbdb/$db/jaspar">Genome Browser's download server</a> +as a bigBed file. This compressed binary format can be remotely queried through +command line utilities. Please note that some of the download files can be quite large.</p> +<p> +The utilities for working with bigBed-formatted binary files can be downloaded +<a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads" + target="_blank">here</a>. +Run a utility with no arguments to see a brief description of the utility and its options. +<ul> + <li><b>bigBedInfo</b> provides summary statistics about a bigBed file including the number of + items in the file. With the <b>-as</b> option, the output includes an + autoSql + definition of data columns, useful for interpreting the column values.</li> + <li><b>bigBedToBed</b> converts the binary bigBed data to tab-separated text. 
+ Output can be restricted to a particular region by using the -chrom, -start + and -end options.</li> +</ul> +</p> + +<h4>Example: retrieve all JASPAR items in chr1:200001-200400</h4> + +<pre><tt>bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/$db/jaspar/JASPAR2024.bb -chrom=chr1 -start=200000 -end=200400 stdout</tt></pre> + +<p> +All data are freely available. +Additional resources are available directly from the JASPAR group:</p> +<ul> +<li>Binding site predictions for all and individual TF profiles are available +for download at +<a href="http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/" +target="_blank">http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/</a>.</li> +<li>Code and data used to create the UCSC tracks are available at +<a href="https://github.com/wassermanlab/JASPAR-UCSC-tracks" target="_blank"> +https://github.com/wassermanlab/JASPAR-UCSC-tracks</a>.</li> +<li>The underlying JASPAR motif data is available through the JASPAR website at +<a href="https://jaspar.genereg.net/" target="_blank">https://jaspar.genereg.net/</a>.</li> +</ul> + +<h2>Other Genomes</h2> +<p>The JASPAR group provides TFBS predictions for many additional species and +genomes, accessible by connection to their +<a href="../cgi-bin/hgHubConnect?hubSearchTerms=jaspar&hgHub_do_search=on"> +Public Hub</a> or by clicking the assembly links below:</p> +<table width="458" border="1"> + <tbody> + <tr> + <td><strong>Species</strong></td> + <td><strong>Genome assembly versions</strong></td> + </tr> + <tr> + <td width="300">Human - <em>Homo sapiens</em></td> + <td width="200"><a target="_blank" href="../cgi-bin/hgTrackUi?db=hg19&g=jaspar">hg19</a>, +<a target="_blank" href="../cgi-bin/hgTrackUi?db=hg38&g=jaspar">hg38</a></td> + </tr> + <tr> + <td>Mouse - <em>Mus musculus</em></td> + <td><a target="_blank" href="../cgi-bin/hgTrackUi?db=mm10&g=jaspar">mm10</a>, +<a target="_blank" href="../cgi-bin/hgTrackUi?db=mm39&g=jaspar">mm39</a></td> + </tr> + <tr> + <td>Zebrafish - 
<em>Danio rerio</em></td> + <td><a target="_blank" +href="../cgi-bin/hgTracks?db=danRer11&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">danRer11</a></td> + </tr> + <tr> + <td>Fruitfly - <em>Drosophila melanogaster</em></td> + <td><a target="_blank" +href="../cgi-bin/hgTracks?db=dm6&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">dm6</a></td> + </tr> + <tr> + <td>Nematode - <em>Caenorhabditis elegans</em></td> + <td><a target="_blank" +href="../cgi-bin/hgTracks?db=ce10&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">ce10</a>, + <a target="_blank" +href="../cgi-bin/hgTracks?db=ce11&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">ce11</a></td> + </tr> + <tr> + <td>Vase tunicate - <em>Ciona intestinalis</em></td> + <td><a target="_blank" +href="../cgi-bin/hgTracks?db=ci3&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">ci3</a></td> + </tr> + <tr> + <td>Thale cress - <em>Arabidopsis thaliana</em></td> + <td><a target="_blank" +href="../cgi-bin/hgTracks?hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt&genome=araTha1">araTha1</a></td> + </tr> + <tr> + <td>Yeast - <em>Saccharomyces cerevisiae</em></td> + <td><a target="_blank" +href="../cgi-bin/hgTracks?db=sacCer3&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">sacCer3</a></td> + </tr> + </tbody> +</table> + +<h2>Credits</h2> +<p> +The JASPAR database is a joint effort between several labs +(please see the latest JASPAR paper, below). +Binding site predictions and UCSC tracks were computed by the Wasserman Lab. For +enquiries about the data please contact Oriol Fornes +(<A HREF="mailto:oriol@cmmt. +ubc.ca"> +oriol@cmmt. 
+ubc.ca</A> +<!-- above address is oriol at cmmt.ubc.ca -->).</p> + +<blockquote> + <p><em><a href="http://cisreg.ca/">Wasserman Lab</a></em><br/> + Centre for Molecular Medicine and Therapeutics<br/> + BC Children's Hospital Research Institute<br/> + Department of Medical Genetics<br/> + University of British Columbia<br/> + Vancouver, Canada + </p> +</blockquote> + +<h2>References</h2> +<p> +Castro-Mondragon JA, Riudavets-Puig R, Rauluseviciute I, Berhanu Lemma R, Turchi L, Blanc-Mathieu R, +Lucas J, Boddie P, Khan A, Manosalva Pérez N <em>et al</em>. +<a href="https://www.ncbi.nlm.nih.gov/pubmed/34850907" target="_blank"> +JASPAR 2022: the 9th release of the open-access database of transcription factor binding +profiles</a>. +<em>Nucleic Acids Res</em>. 2021 Nov 30;. +PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/34850907" target="_blank">34850907</a> +</p> + +<p> +Fornes O, Castro-Mondragon JA, Khan A, van der Lee R, Zhang X, Richmond PA, +Modi BP, Correard S, Gheorghe M, Baranašić D <em>et al</em>. +<a href="https://www.ncbi.nlm.nih.gov/pubmed/31701148" target="_blank"> +JASPAR 2020: update of the open-access database of transcription factor +binding profiles</a>. +<em>Nucleic Acids Res</em>. 2020 Jan 8;48(D1):D87-D92. +PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/31701148" +target="_blank">31701148</a>; PMC: <a +href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7145627/" +target="_blank">PMC7145627</a> +</p> + +<p> +Khan A, Fornes O, Stigliani A, Gheorghe M, Castro-Mondragon JA, van der Lee R, +Bessy A, Chèneby J, Kulkarni SR, Tan G <em>et al</em>. +<a href="https://www.ncbi.nlm.nih.gov/pubmed/29140473" target="_blank"> +JASPAR 2018: update of the open-access database of transcription factor +binding profiles and its web framework</a>. +<em>Nucleic Acids Res</em>. 2018 Jan 4;46(D1):D260-D266. 
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/29140473" +target="_blank">29140473</a>; PMC: <a +href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5753243/" +target="_blank">PMC5753243</a> +</p> + +<p> +Rauluseviciute I, Riudavets-Puig R, Blanc-Mathieu R, Castro-Mondragon JA, Ferenc K, Kumar V, Lemma +RB, Lucas J, Chèneby J, Baranasic D <em>et al</em>. +<a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkad1059" target="_blank"> +JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding +profiles</a>. +<em>Nucleic Acids Res</em>. 2023 Nov 14;. +PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/37962376" target="_blank">37962376</a> +</p>