2bde85a34b8accbea0c60266f90af25d911cee12 gperez2 Fri Mar 1 15:23:12 2024 -0800 Small edit to human/jaspar.html and major edits to mouse/jaspar.html diff --git src/hg/makeDb/trackDb/mouse/jaspar.html src/hg/makeDb/trackDb/mouse/jaspar.html index 5e67510..c430c75 100644 --- src/hg/makeDb/trackDb/mouse/jaspar.html +++ src/hg/makeDb/trackDb/mouse/jaspar.html @@ -5,31 +5,31 @@ <a href="https://jaspar.genereg.net/about/" target="_blank">JASPAR CORE collection</a>. This open-source database contains a curated, non-redundant set of binding profiles derived from published collections of experimentally defined transcription factor binding sites for eukaryotes.</p> <h2>Display Conventions and Configuration</h2> <p> Shaded boxes represent predicted binding sites for each of the TF profiles in the JASPAR CORE collection. The shading of the boxes indicates the p-value of the profile's match to that position (scaled between 0-1000 scores, where 0 corresponds to a p-value of 1 and 1000 to a p-value ≤ 10<sup>-10</sup>). Thus, the darker the shade, the lower (better) the p-value.</p> <p> -The default view only shows predicted binding sites with scores of 400 or greater but +The default view shows only predicted binding sites with scores of 400 or greater but can be adjusted in the track settings. Multi-select filters allow viewing of particular transcription factors. At window sizes of greater than 10,000 base pairs, this track turns to density graph mode. 
Zoom to a smaller region and click into an item to see more detail.</p> <p> <em>From <a href="../../FAQ/FAQformat.html#format1">BED format documentation</a>: </em> <table style="box-sizing: border-box; border-collapse: collapse; border-spacing: 0px; border: 2px solid gray; margin-top: 10px; margin-left: 15px; font-size: 13px; color: rgb(0, 0, 0); font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: left; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration-style: initial; text-decoration-color: initial;"> <tbody style="box-sizing: border-box;"> <tr style="box-sizing: border-box;"> <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221);">shade</td> <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(226, 226, 226);"> </td> <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(198, 198, 198);"> </td> <td style="box-sizing: border-box; padding: 1px 15px; text-align: left; border-bottom: 1px solid rgb(221, 221, 221); background-color: rgb(170, 170, 170);"> </td> @@ -79,50 +79,109 @@ <td style="padding:10px">0.049</td> <td style="padding:10px">10<sup>-2</sup></td> <td style="padding:10px">10<sup>-3</sup></td> <td style="padding:10px">10<sup>-4</sup></td> <td style="padding:10px">10<sup>-5</sup></td> <td style="padding:10px">10<sup>-6</sup></td> <td style="padding:10px">10<sup>-7</sup></td> <td style="padding:10px">10<sup>-8</sup></td> <td style="padding:10px">10<sup>-9</sup></td> <td style="padding:10px">≤ 10<sup>-10</sup></td> </tr> </table> <h2>Methods</h2> <p> 
-JASPAR contains transcription factor binding sites -with additional transcription factor profiles. TFBS predictions were selected with +The JASPAR 2024 update expanded the JASPAR CORE collection by 20% (329 added and 72 upgraded +profiles). The new profiles were introduced after manual curation, in which 26 629 TF binding +motifs were curated and obtained as PFMs or discovered from ChIP-seq/-exo or DAP-seq data. 2500 +profiles from JASPAR 2022 were revised to either promote them to the CORE collection, update the +associated metadata, or remove them because of validation inconsistencies or poor quality. The +JASPAR database stores and focuses mostly on PFMs as the model of choice for TF-DNA interactions. +More information on the methods can be found in the +<a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkad1059" target="_blank"> +JASPAR 2024 publication</a> or on the +<a href="https://jaspar.genereg.net/" target="_blank">JASPAR website</a>.</p> + +<p> +JASPAR 2022 contains updated transcription factor binding sites +with additional transcription factor profiles. More information on the methods can be found in the +<a href="https://www.ncbi.nlm.nih.gov/pubmed/34850907" target="_blank"> +JASPAR 2022 publication</a> +or on the +<a href="https://jaspar.genereg.net/" target="_blank">JASPAR website</a>.</p> + +<p> +JASPAR 2020 scanned DNA sequences with JASPAR CORE TF-binding profiles +for each taxon independently using PWMScan. TFBS predictions were selected with a PWM relative score ≥ 0.8 and a p-value < 0.05. P-values were scaled between 0 (corresponding to a p-value of 1) and 1000 (p-value ≤ 10<sup>-10</sup>) for coloring of the genome tracks and to allow for comparison of prediction -confidence between different profiles. 
More information on -the methods can be found in their publications or on the -<a href="https://jaspar.genereg.net/" target="_blank">JASPAR website</a>.</p> +confidence between different profiles.</p> + +<p> +JASPAR 2018 used the TFBS Perl module (Lenhard and Wasserman 2002) +and FIMO (Grant, Bailey, and Noble 2011), as distributed within the MEME suite +(version 4.11.2) (Bailey <em>et al.</em> 2009). For scanning genomes with the +BioPerl TFBS module, profiles were converted to PWMs and matches were kept with a +relative score ≥ 0.8. For the FIMO scan, profiles were reformatted to MEME motifs +and matches with a p-value < 0.05 were kept. TFBS predictions that were not +consistent between the two methods (TFBS Perl module and FIMO) were removed. The +remaining TFBS predictions were colored according +to their FIMO p-value to allow for comparison of prediction confidence between +different profiles.</p> + +<p> +Please refer to the JASPAR 2024, 2022, 2020, and 2018 publications for more +details (citations below).</p> <h2>Data Access</h2> <p> -JASPAR Transcription Factor Binding data can be explored interactively with the +JASPAR Transcription Factor Binding data includes billions of items. Limited regions can +be explored interactively with the <a href="../cgi-bin/hgTables">Table Browser</a> and cross-referenced with -<a href="../cgi-bin/hgIntegrator">Data Integrator</a>. For programmatic access, +<a href="../cgi-bin/hgIntegrator">Data Integrator</a>, although positional +queries that are too big can lead to timing out. This results in a blank page +or truncated output. In this case, you may try reducing the chromosomal query to +a smaller window.</p> +<p> +For programmatic access, the track can be accessed using the Genome Browser's <a href="../../goldenPath/help/api.html">REST API</a>. JASPAR annotations can be downloaded from the <a href="http://hgdownload.soe.ucsc.edu/gbdb/$db/jaspar">Genome Browser's download server</a> as a bigBed file. 
This compressed binary format can be remotely queried through command line utilities. Please note that some of the download files can be quite large.</p> +<p> +The utilities for working with bigBed-formatted binary files can be downloaded +<a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads" + target="_blank">here</a>. +Run a utility with no arguments to see a brief description of the utility and its options. +<ul> + <li><b>bigBedInfo</b> provides summary statistics about a bigBed file including the number of + items in the file. With the <b>-as</b> option, the output includes an + autoSql + definition of data columns, useful for interpreting the column values.</li> + <li><b>bigBedToBed</b> converts the binary bigBed data to tab-separated text. + Output can be restricted to a particular region by using the -chrom, -start + and -end options.</li> +</ul> +</p> + +<h4>Example: retrieve all JASPAR items in chr1:200001-200400</h4> + +<pre><tt>bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/$db/jaspar/JASPAR2024.bb -chrom=chr1 -start=200000 -end=200400 stdout</tt></pre> <p> All data are freely available. 
Additional resources are available directly from the JASPAR group:</p> <ul> <li>Binding site predictions for all and individual TF profiles are available for download at <a href="http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/" target="_blank">http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/</a>.</li> <li>Code and data used to create the UCSC tracks are available at <a href="https://github.com/wassermanlab/JASPAR-UCSC-tracks" target="_blank"> https://github.com/wassermanlab/JASPAR-UCSC-tracks</a>.</li> <li>The underlying JASPAR motif data is available through the JASPAR website at <a href="https://jaspar.genereg.net/" target="_blank">https://jaspar.genereg.net/</a>.</li> </ul> @@ -133,34 +192,32 @@ <a href="../cgi-bin/hgHubConnect?hubSearchTerms=jaspar&hgHub_do_search=on"> Public Hub</a> or by clicking the assembly links below:</p> <table width="458" border="1"> <tbody> <tr> <td><strong>Species</strong></td> <td><strong>Genome assembly versions</strong></td> </tr> <tr> <td width="300">Human - <em>Homo sapiens</em></td> <td width="200"><a target="_blank" href="../cgi-bin/hgTrackUi?db=hg19&g=jaspar">hg19</a>, <a target="_blank" href="../cgi-bin/hgTrackUi?db=hg38&g=jaspar">hg38</a></td> </tr> <tr> <td>Mouse - <em>Mus musculus</em></td> - <td> -<a target="_blank" href="../cgi-bin/hgTrackUi?db=mm10&g=jaspar"> -mm10</a>, <a target="_blank" href="../cgi-bin/hgTrackUi?db=mm39&g=jaspar"> -mm39</a></td> + <td><a target="_blank" href="../cgi-bin/hgTrackUi?db=mm10&g=jaspar">mm10</a>, +<a target="_blank" href="../cgi-bin/hgTrackUi?db=mm39&g=jaspar">mm39</a></td> </tr> <tr> <td>Zebrafish - <em>Danio rerio</em></td> <td><a target="_blank" href="../cgi-bin/hgTracks?db=danRer11&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">danRer11</a></td> </tr> <tr> <td>Fruitfly - <em>Drosophila melanogaster</em></td> <td><a target="_blank" href="../cgi-bin/hgTracks?db=dm6&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">dm6</a></td> </tr> <tr> 
<td>Nematode - <em>Caenorhabditis elegans</em></td> <td><a target="_blank" href="../cgi-bin/hgTracks?db=ce10&hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt">ce10</a>, @@ -218,15 +275,38 @@ PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/34850907" target="_blank">34850907</a> </p> <p> Fornes O, Castro-Mondragon JA, Khan A, van der Lee R, Zhang X, Richmond PA, Modi BP, Correard S, Gheorghe M, Baranašić D <em>et al</em>. <a href="https://www.ncbi.nlm.nih.gov/pubmed/31701148" target="_blank"> JASPAR 2020: update of the open-access database of transcription factor binding profiles</a>. <em>Nucleic Acids Res</em>. 2020 Jan 8;48(D1):D87-D92. PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/31701148" target="_blank">31701148</a>; PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7145627/" target="_blank">PMC7145627</a> </p> + +<p> +Khan A, Fornes O, Stigliani A, Gheorghe M, Castro-Mondragon JA, van der Lee R, +Bessy A, Chèneby J, Kulkarni SR, Tan G <em>et al</em>. +<a href="https://www.ncbi.nlm.nih.gov/pubmed/29140473" target="_blank"> +JASPAR 2018: update of the open-access database of transcription factor +binding profiles and its web framework</a>. +<em>Nucleic Acids Res</em>. 2018 Jan 4;46(D1):D260-D266. +PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/29140473" +target="_blank">29140473</a>; PMC: <a +href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5753243/" +target="_blank">PMC5753243</a> +</p> + +<p> +Rauluseviciute I, Riudavets-Puig R, Blanc-Mathieu R, Castro-Mondragon JA, Ferenc K, Kumar V, Lemma +RB, Lucas J, Chèneby J, Baranasic D <em>et al</em>. +<a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkad1059" target="_blank"> +JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding +profiles</a>. +<em>Nucleic Acids Res</em>. 2023 Nov 14;. +PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/37962376" target="_blank">37962376</a> +</p>
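Note on the score scale described in the page text: the track documentation states only the two endpoints of the scaling (a score of 0 for a p-value of 1, and 1000 for a p-value ≤ 10<sup>-10</sup>). The sketch below assumes the mapping between them is linear in -log10(p), that is, 100 score points per order of magnitude; that assumption is consistent with the shade table but is not stated in the source. The function and constant names are hypothetical and are not part of any JASPAR or UCSC tool.

```python
import math

MAX_SCORE = 1000
# Assumption: 100 score points per factor of 10 in p-value, so that
# p = 1 maps to 0 and p = 1e-10 maps to 1000 (the two documented endpoints).
POINTS_PER_DECADE = 100


def pvalue_to_score(p_value: float) -> int:
    """Map a p-value in (0, 1] to a 0-1000 track score (assumed log scale)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in the interval (0, 1]")
    raw = -POINTS_PER_DECADE * math.log10(p_value)
    # Scores are capped at 1000, matching "p-value <= 1e-10" in the page text.
    return min(MAX_SCORE, int(round(raw)))


def score_to_pvalue(score: int) -> float:
    """Inverse of pvalue_to_score under the same assumed log scale."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError("score must be between 0 and 1000")
    return 10.0 ** (-score / POINTS_PER_DECADE)


if __name__ == "__main__":
    for p in (1.0, 0.05, 1e-4, 1e-10, 1e-12):
        print(p, pvalue_to_score(p))
    print(score_to_pvalue(400))
```

Under this assumption, the default display threshold of 400 corresponds to a p-value of 10<sup>-4</sup>, and the selection cutoff of p < 0.05 corresponds to a score of about 130.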