c13042f97958c80572df64988506484749ff46af dschmelt Mon Aug 23 15:30:41 2021 -0700
Final Dev staging for JASPAR tracks refs #27061

diff --git src/hg/makeDb/trackDb/human/jaspar.html src/hg/makeDb/trackDb/human/jaspar.html
index 0764aff..5dd852c 100644
--- src/hg/makeDb/trackDb/human/jaspar.html
+++ src/hg/makeDb/trackDb/human/jaspar.html
@@ -1,165 +1,248 @@
 <h2>Description</h2>
-<p>This UCSC Genome Browser track represents genome-wide predicted binding sites for TF
+<p>This track represents genome-wide predicted binding sites for TF
 (transcription factor) binding profiles in the
-<a href="http://jaspar.genereg.net/genome-tracks/" target="_blank">JASPAR</a> database CORE collection.</p>
+<a href="http://jaspar.genereg.net/about/" target="_blank">JASPAR
+CORE collection</a>. This open-source database contains a curated, non-redundant
+set of binding profiles derived from published collections of experimentally
+defined transcription factor binding sites for eukaryotes.</p>
 <h2>Display Conventions and Configuration</h2>
 <p>
-Shaded boxes represent predicted binding sites for each of the TF profiles in the JASPAR CORE collection.
+Shaded boxes represent predicted binding sites for each of the TF profiles in the
+JASPAR CORE collection.
+
+The shading of the boxes indicates the p-value of the profile's match to that position
+(scaled between 0-1000 scores, where 0 corresponds to a p-value of 1 and 1000 to a
+p-value ≤ 10<sup>-10</sup>). Thus, the darker the shade, the lower (better) the p-value.</p>
+
+<p>The default view only shows predicted binding sites with scores of 400 or greater.
+This can be adjusted by adjusting the track settings.</p>
 
-The shading of the boxes indicates the p-value of the profile's match to that position (scaled between 0-1000 scores, where 0 corresponds to a p-value of 1 and 1000 to a p-value ≤ 10<sup>-10</sup>). Thus, the darker the shade, the lower (better) the p-value.</p>
-<p>The default view only shows predicted binding sites with scores of 400 or greater. This can be adjusted by adjusting the track settings.</p>
 <p><em>From <a href="https://genome.ucsc.edu/FAQ/FAQformat.html#format1">https://genome.ucsc.edu/FAQ/FAQformat.html#format1</a>: </em>
 <table style="box-sizing: border-box; border-collapse: collapse; border-spacing: 0px; border: 2px solid gray; margin-top: 10px; margin-left: 15px; font-size: 13px; color: rgb(0, 0, 0); font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: left; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration-style: initial; text-decoration-color: initial;">
 <tbody style="box-sizing: border-box;">
 <tr style="box-sizing: border-box;">
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">shade</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(226, 226, 226);"> </td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(198, 198, 198);"> </td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(170, 170, 170);"> </td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(141, 141, 141);"> </td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(113, 113, 113);"> </td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(85, 85, 85);"> </td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(56, 56, 56);"> </td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(28, 28, 28);"> </td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(0, 0, 0);"> </td>
 </tr>
 <tr style="box-sizing: border-box;">
- <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">score in range </td>
+ <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">score in range</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">≤ 166</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">167-277</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">278-388</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">389-499</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">500-611</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">612-722</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">723-833</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">834-944</td>
 <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">≥ 945</td>
 </tr>
 </tbody>
 </table>
+
 <p><em>Conversion table:</em></p>
 <table cellspacing="0" cellpadding="0">
 <col width="87" />
 <col width="87" />
 <tr>
 <td width="107"><div align="center"><strong>Track score</strong></div></td>
 <td width="107"><div align="center"><strong>p-value</strong></div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">0</div></td>
 <td><div align="center">1</div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">100</div></td>
 <td><div align="center">0.1</div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">131</div></td>
 <td><div align="center">0.049</div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">200</div></td>
 <td><div align="center">10<sup>-2</sup></div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">300</div></td>
 <td><div align="center">10<sup>-3</sup></div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">400</div></td>
 <td><div align="center">10<sup>-4</sup></div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">500</div></td>
 <td><div align="center">10<sup>-5</sup></div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">600</div></td>
 <td><div align="center">10<sup>-6</sup></div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">700</div></td>
 <td><div align="center">10<sup>-7</sup></div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">800</div></td>
 <td><div align="center">10<sup>-8</sup></div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">900</div></td>
 <td><div align="center">10<sup>-9</sup></div></td>
 </tr>
 <tr>
 <td align="right"><div align="center">1000</div></td>
 <td><div align="center">≤ 10<sup>-10</sup></div></td>
 </tr>
 </table>
 <h2>Methods</h2>
-<p>For each TF binding profile in the JASPAR CORE collection, genomes were scanned in parallel using the TFBS Perl module (Lenhard and Wasserman 2002) and FIMO (Grant, Bailey, and Noble 2011), as distributed within the MEME suite (version 4.11.2) (Bailey et al. 2009). For scanning genomes with the BioPerl TFBS module, we converted profiles to PWMs and kept matches with a relative score ≥ 0.8. For the FIMO scan, profiles were reformatted to MEME motifs and matches with a p-value < 0.05 were kept. TFBS predictions that were not consistent between the two methods (TFBS Perl module and FIMO) were removed. The remaining TFBS predictions were converted to genome tracks and colored according to their FIMO p-value to allow for comparison of prediction confidence between different profiles.</p>
-<p>Please refer to the JASPAR 2018 manuscript for more details (citation below).</p>
+<p>
+For each TF binding profile in the JASPAR CORE collection, genomes were
+scanned in parallel using the TFBS Perl module (Lenhard and Wasserman 2002)
+and FIMO (Grant, Bailey, and Noble 2011), as distributed within the MEME suite
+(version 4.11.2) (Bailey et al. 2009).</p>
+<p>
+For scanning genomes with the
+BioPerl TFBS module, we converted profiles to PWMs and kept matches with a
+relative score ≥ 0.8. For the FIMO scan, profiles were reformatted to MEME motifs
+and matches with a p-value < 0.05 were kept. TFBS predictions that were not
+consistent between the two methods (TFBS Perl module and FIMO) were removed. The
+remaining TFBS predictions were converted to genome tracks and colored according
+to their FIMO p-value to allow for comparison of prediction confidence between
+different profiles.</p>
+
+<p>
+Please refer to the JASPAR 2020 and 2018 publication for more details (citation below).</p>
 <h2>Data Access</h2>
+<p>
+JASPAR Transcription Factor Binding data can be explored interactively with the
+<a href="../cgi-bin/hgTables">Table Browser</a> and cross-referenced with
+<a href="../cgi-bin/hgIntegrator">Data Integrator</a>. For programmatic access,
+the track can be accessed using the Genome Browser's
+<a href="../../goldenPath/help/api.html">REST API</a>.
+JASPAR annotations can be downloaded from the
+<a href="http://hgdownload.soe.ucsc.edu/gbdb/$db/jaspar">Genome Browser's download server</a>
+as a bigBed file. This compressed binary format can be interacted with through
+command line utilities. Please be aware that these download files are very large,
+between 75Gb and 86Gb.</p>
+
+<p>
+All data is freely available.</p>
+
+<p>
+Additional resources are available directly from the JASPAR group:</p>
 <ul>
-<li>Binding site predictions for all and individual TF profiles are available for download at <a href="http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/2018/" target="_blank">http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/2018/</a>.</li>
-<li>Code and data used to create the UCSC tracks are available at <a href="https://github.com/wassermanlab/JASPAR-UCSC-tracks">https://github.com/wassermanlab/JASPAR-UCSC-tracks</a>.</li>
-<li>The underlying JASPAR motif data is available through the JASPAR website at <a href="http://jaspar.genereg.net" target="_blank">http://jaspar.genereg.net</a>.</li>
+<li>Binding site predictions for all and individual TF profiles are available
+for download at
+<a href="http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/"
+target="_blank">http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/</a>.</li>
+<li>Code and data used to create the UCSC tracks are available at
+<a href="https://github.com/wassermanlab/JASPAR-UCSC-tracks">
+https://github.com/wassermanlab/JASPAR-UCSC-tracks</a>.</li>
+<li>The underlying JASPAR motif data is available through the JASPAR website at
+<a href="http://jaspar.genereg.net" target="_blank">http://jaspar.genereg.net</a>.</li>
 </ul>
-<p> All data is freely available.</p>
-<h2>Genomes</h2>
-<p>The JASPAR group provides TFBS predictions for many additional species and genomes in their <a href="../cgi-bin/hgHubConnect?hubSearchTerms=jaspar&hgHub_do_search=on">Public Hub</a>:</p>
+<h2>Other Genomes</h2>
+<p>The JASPAR group provides TFBS predictions for many additional species and
+genomes, accessible by connection to their
+<a href="../cgi-bin/hgHubConnect?hubSearchTerms=jaspar&hgHub_do_search=on">
+Public Hub</a> or by clicking the assembly links below:</p>
 <table width="458" border="1">
 <tbody>
 <tr>
 <td><strong>Species</strong></td>
 <td><strong>Genome assembly versions</strong></td>
 </tr>
 <tr>
- <td width="200">Human</td>
- <td width="200">hg19, hg38</td>
+ <td width="300">Human - <em>Homo sapiens</em></td>
+ <td width="200"><a target="_blank" href="../cgi-bin/hgTrackUi?db=hg19&g=jaspar">hg19</a>, <a target="_blank" href="../cgi-bin/hgTrackUi?db=hg38&g=jaspar">hg38</a></td>
+ </tr>
+ <tr>
+ <td>Mouse - <em>Mus musculus</em></td>
+ <td><a target="_blank"
+href="../cgi-bin/hgTracks?db=mm10&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">
+mm10</a></td>
 </tr>
 <tr>
- <td><em>Danio rerio</em></td>
- <td>danRer10</td>
+ <td>Zebrafish - <em>Danio rerio</em></td>
+ <td><a target="_blank"
+href="../cgi-bin/hgTracks?db=danRer11&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">danRer11</a></td>
 </tr>
 <tr>
- <td><em>Drosophila melanogaster</em></td>
- <td>dm6</td>
+ <td>Fruitfly - <em>Drosophila melanogaster</em></td>
+ <td><a target="_blank"
+href="../cgi-bin/hgTracks?db=dm6&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">dm6</a></td>
 </tr>
 <tr>
- <td><em>Caenorhabditis elegans</em></td>
- <td>ce10</td>
+ <td>Nematode - <em>Caenorhabditis elegans</em></td>
+ <td><a target="_blank"
+href="../cgi-bin/hgTracks?db=ce10&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">ce10</a></td>
 </tr>
 <tr>
- <td><em>Arabidopsis thaliana</em></td>
- <td>araTha1</td>
+ <td>Thale cress - <em>Arabidopsis thaliana</em></td>
+ <td><a target="_blank"
+href="../cgi-bin/hgTracks?hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt&genome=araTha1">araTha1</a></td>
 </tr>
 <tr>
- <td><em>Saccharomyces cerevisiae</em></td>
- <td>sacCer3</td>
+ <td>Yeast - <em>Saccharomyces cerevisiae</em></td>
+ <td><a target="_blank"
+href="../cgi-bin/hgTracks?db=sacCer3&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">sacCer3</a></td>
 </tr>
 </tbody>
 </table>
 <h2>Credits</h2>
-<p>The JASPAR database is a joint effort between several labs (please see the latest JASPAR paper, below). Binding site predictions and UCSC tracks were computed by the Wasserman Lab and are maintained by David Arenillas (<a href="mailto:dave@cmmt.ubc.ca" target="_blank">dave@cmmt.ubc.ca</a>). For enquiries about the data please contact Oriol Fornes (<a href="mailto:oriol@cmmt.ubc.ca" target="_blank">oriol@cmmt.ubc.ca</a>) and Robin van der Lee (<a href="mailto:rvdlee@cmmt.ubc.ca" target="_blank">rvdlee@cmmt.ubc.ca</a>).
+<p>
+The JASPAR database is a joint effort between several labs
+(please see the latest JASPAR paper, below).
+Binding site predictions and UCSC tracks were computed by the Wasserman Lab and
+are maintained by David Arenillas
+(<a href="mailto:dave@cmmt.ubc.ca" target="_blank">dave@cmmt.ubc.ca</a>). For
+enquiries about the data please contact Oriol Fornes (<a href="mailto:oriol@cmmt.ubc.ca"
+target="_blank">oriol@cmmt.ubc.ca</a>) and Robin van der Lee
+(<a href="mailto:rvdlee@cmmt.ubc.ca" target="_blank">rvdlee@cmmt.ubc.ca</a>).
 </p>
+
 <blockquote>
 <p><em><a href="http://www.cmmt.ubc.ca/wasserman-lab/">Wasserman Lab</a></em><br/>
 Centre for Molecular Medicine and Therapeutics<br/>
 BC Children's Hospital Research Institute<br/>
 Department of Medical Genetics<br/>
 University of British Columbia<br/>
 Vancouver, Canada
 </p>
 </blockquote>
 <h2>References</h2>
 <p>
-A Khan, O Fornes, A Stigliani, A Bessy, M Gheorghe, R van der Lee, J A Castro-Mondragon, J Cheneby, S R Kulkarni, G Tan, D Baranasic, D J Arenillas, A Sandelin, K Vandepoele, B Lenhard, B Ballester, W W Wasserman, F Parcy, A Mathelier. <strong>JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework</strong>. <em>Nucleic Acids Research</em>. 2018. <a href="https://www.ncbi.nlm.nih.gov/pubmed/29140473">PMID 29140473</a></p>
+Fornes O, Castro-Mondragon JA, Khan A, van der Lee R, Zhang X, Richmond PA, Modi BP,
+Correard S, Gheorghe M, Baranasic D, Santana-Garcia W, Tan G, Cheneby J, Ballester B,
+ Parcy F, Sandelin A, Lenhard B, Wasserman WW, Mathelier A.
+<strong>JASPAR 2020: update of the open-access database of transcription factor binding
+profiles.</strong> <em>Nucleic Acids Res.</em> 2020 .
+<a href="https://pubmed.ncbi.nlm.nih.gov/31701148/" target="_blank">PMID 31701148</a></p>
+
+<p>
+Khan A, Fornes O, Stigliani A, Gheorghe M, Castro-Mondragon JA, van der Lee R,
+Bessy A, Cheneby J, Kulkarni SR, Tan G, Baranasic D, Arenillas DJ, Sandelin A,
+Vandepoele K, Lenhard B, Ballester B, Wasserman WW, Parcy F, Mathelier A.
+<strong>JASPAR 2018: update of the open-access database of transcription factor
+binding profiles and its web framework</strong>. <em>Nucleic Acids Research</em>.
+2018. <a href="https://www.ncbi.nlm.nih.gov/pubmed/29140473">PMID 29140473</a></p>
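
Reviewer note on the conversion table in the patched page: every row is consistent with a track score of -100 * log10(p-value), rounded to the nearest integer and capped at 1000 (for example, 0.049 gives 131). The page does not state this formula, so it is an inference from the table rows. The sketch below is a minimal illustration under that assumption. The function names `p_value_to_score` and `score_to_p_value` are hypothetical and are not taken from the JASPAR or UCSC code.

```python
import math

MAX_SCORE = 1000  # the table caps scores at 1000 (p-value <= 1e-10)


def p_value_to_score(p_value: float) -> int:
    """Map a p-value to a 0-1000 track score.

    Assumes score = -100 * log10(p), rounded and capped at 1000.
    This rule is inferred from the conversion table, not documented.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in the interval (0, 1]")
    return min(MAX_SCORE, round(-100 * math.log10(p_value)))


def score_to_p_value(score: int) -> float:
    """Invert the assumed mapping; a score of 1000 means p <= 1e-10."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError("score must be between 0 and 1000")
    return 10 ** (-score / 100)


if __name__ == "__main__":
    # Reproduce a few rows of the conversion table.
    for p in (1.0, 0.1, 0.049, 1e-4, 1e-12):
        print(p, p_value_to_score(p))
```

Under this assumed mapping, the default display threshold of 400 mentioned in the page corresponds to a p-value of 10^-4.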