src/hg/makeDb/trackDb/crispr.html 75367bbe6bc818331a1c5f3259135a1cdba12b6f

75367bbe6bc818331a1c5f3259135a1cdba12b6f
jnavarr5
  Wed Jun 19 10:38:16 2019 -0700
Updating http to https for ce11, uiLinks cronjob.

diff --git src/hg/makeDb/trackDb/crispr.html src/hg/makeDb/trackDb/crispr.html
index 67ce867..82384d5 100644
--- src/hg/makeDb/trackDb/crispr.html
+++ src/hg/makeDb/trackDb/crispr.html
@@ -1,214 +1,214 @@
 <h2>Description</h2>
 
 <p>
 This track shows regions of the genome within 200 bp of transcribed regions and
 DNA sequences targetable by CRISPR RNA guides using the Cas9 enzyme
 from <em>S. pyogenes</em> (PAM: NGG).
 CRISPR target sites were annotated with predicted specificity
 (off-target effects) and predicted efficiency (on-target cleavage) by various
 algorithms through the tool <a href="http://crispor.tefor.net/"
 target="_blank">CRISPOR</a>.
 </p>
 
 <h2>Display Conventions and Configuration</h2>
 
 <p>
 The track &quot;CRISPR Regions&quot; shows the regions of the genome where target
 sites were analyzed, i.e. within 200 bp of transcribed regions as annotated by
 Ensembl transcript models.</p>
 
 <p>
 The track &quot;CRISPR Targets&quot; shows the target sites in these regions.
 The target sequence of the guide is shown with a thick (exon) bar. The PAM
 motif match (NGG) is shown with a thinner bar. Guides
 are colored to reflect both predicted specificity and efficiency. Specificity
 reflects the &quot;uniqueness&quot; of a 20mer sequence in the genome; the less unique a
 sequence is, the more likely it is to cleave other locations of the genome
 (off-target effects). Efficiency is the frequency of cleavage at the target
 site (on-target efficiency).</p>
 
 <p>Shades of gray stand for sites that are hard to target specifically, as the
 20mer is not very unique in the genome:</p>
 <table class="stdTbl" style="width:100%">
 <tr><td style="width:50px; background-color:#969696"></td><td>impossible to target: target site has at least one identical copy in the genome and was not scored</td></tr>
 <tr><td style="width:50px; background-color:#787878"></td><td>hard to target: so many similar sequences in the genome that alignment stopped (possibly a repeat)</td></tr>
 <tr><td style="width:50px; background-color:#505050"></td><td>hard to target: target site was aligned but results in a low specificity score &lt;= 50 (see below)</td></tr>
 </table>
 
 <p>Colors highlight targets that are specific in the genome (MIT specificity &gt; 50) but have different predicted efficiencies:</p>
 <table class="stdTbl" style="width:100%">
 <tr><td style="width:50px; background-color:#000064"></td><td>unable to calculate Doench/Fusi 2016 efficiency score</td></tr>
 <tr><td style="width:50px; background-color:#FF7070"></td><td>low predicted cleavage: Doench/Fusi 2016 Efficiency percentile &lt;= 30</td></tr>
 <tr><td style="width:50px; background-color:#FFFF00"></td><td>medium predicted cleavage: Doench/Fusi 2016 Efficiency percentile &gt; 30 and &lt; 55</td></tr>
 <tr><td style="width:50px; background-color:#00b300"></td><td>high predicted cleavage: Doench/Fusi 2016 Efficiency percentile &gt; 55</td></tr>
 </table><BR>
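 <p>
 The color rules in the two tables above can be sketched as a small Python function. This is
 only an illustration, not the code used to build the track: the function name and its
 parameters are hypothetical, and because the tables do not say how a percentile of exactly 55
 is colored, the sketch assumes it counts as high.</p>

```python
from typing import Optional


def guide_color(mit_specificity: Optional[float],
                doench_percentile: Optional[float],
                identical_copy: bool = False,
                alignment_stopped: bool = False) -> str:
    """Return the hex color for a guide, following the tables above.

    All names here are hypothetical; only the thresholds and colors
    come from the track description.
    """
    # Gray shades: sites that are hard to target specifically.
    if identical_copy:
        return "#969696"  # identical copy in the genome, not scored
    if alignment_stopped:
        return "#787878"  # too many similar sequences, alignment stopped
    if mit_specificity is None or mit_specificity <= 50:
        return "#505050"  # aligned, but low specificity score

    # Colors: specific guides, split by Doench/Fusi 2016 percentile.
    if doench_percentile is None:
        return "#000064"  # efficiency score could not be calculated
    if doench_percentile <= 30:
        return "#FF7070"  # low predicted cleavage
    if doench_percentile < 55:
        return "#FFFF00"  # medium predicted cleavage
    return "#00b300"      # high (assumption: exactly 55 is treated as high)
```

 <p>
 For example, <tt>guide_color(80, 70)</tt> yields the green shade, while
 <tt>guide_color(40, 90)</tt> yields dark gray because specificity is checked first.</p>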
 
 <p>
 Mouse-over a target site to show predicted specificity and efficiency scores:<br>
 <ol>
 <li>The MIT Specificity score summarizes all off-targets into a single number from
 0-100. The higher the number, the fewer off-target effects are expected. We
 recommend guides with an MIT specificity &gt; 50.</li>
 <li>The efficiency score tries to predict if a guide leads to rather strong or
 weak cleavage. According to <a href="#References">(Haeussler et al. 2016)</a>, the <a
 href="https://portals.broadinstitute.org/gpp/public/analysis-tools/sgrna-design">Doench
 2016 Efficiency score</a> should be used to select the guide with the highest
 cleavage efficiency when expressing guides from RNA PolIII Promoters such as
 U6. Scores are given as percentiles, e.g. &quot;70%&quot; means that 70% of mammalian
 guides have a score equal to or lower than that of this guide. The raw score is
 also shown in parentheses after the percentile.</li>
 <li>The <a
-href="http://www.crisprscan.org/">Moreno-Mateos 2015 Efficiency
+href="https://www.crisprscan.org/">Moreno-Mateos 2015 Efficiency
 score</a> should be used instead of the Doench 2016 score when transcribing the
 guide in vitro with a T7 promoter, e.g. for injections in mouse, zebrafish or
 Xenopus embryos. The Moreno-Mateos score is given in percentiles and the raw value in parentheses, see the note above.</li> </ol>
 </p>
 
 <p>Click on a feature to show all scores and predicted off-targets with up to
 four mismatches. The Out-of-Frame score by <a href="#References">Bae et al. 2014</a>
 is correlated with
 the probability that mutations induced by the guide RNA will disrupt the open
 reading frame. The authors recommend out-of-frame scores &gt; 66 to create
 knock-outs with a single guide efficiently.</p>
 
 <p>Off-target sites are sorted by the CFD score (<a href="https://www.nature.com/articles/nbt.3437"
 target="_blank">Doench et al. 2016</a>). 
 The higher the CFD score, the more likely off-target cleavage is at that site.
 Off-targets with a CFD score &lt; 0.023 are not shown on this page, but are available when
 following the link to the external CRISPOR tool.
 When compared against experimentally validated off-targets by 
 <a href="#References">Haeussler et al. 2016</a>, the large majority of predicted
 off-targets with CFD scores &lt; 0.023 were false positives.</p>
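 <p>
 As a rough illustration of the display rule described above, the following Python sketch
 drops off-targets below the 0.023 cutoff and sorts the rest by CFD score. The function name
 and the tuple layout are hypothetical, and the descending sort order is an assumption; the
 text only states that off-targets are sorted by CFD score.</p>

```python
CFD_DISPLAY_CUTOFF = 0.023  # cutoff stated in the track description


def displayed_off_targets(off_targets):
    """Filter and sort (location, cfd_score) tuples for display.

    Hypothetical helper: keeps off-targets with CFD >= 0.023 and
    sorts them from most to least likely (assumed order).
    """
    kept = [ot for ot in off_targets if ot[1] >= CFD_DISPLAY_CUTOFF]
    return sorted(kept, key=lambda ot: ot[1], reverse=True)
```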
 
 <h2>Methods</h2>
 
 <h3>Relationship between predictions and experimental data</h3>
 
 <p>
 Like most algorithms, the MIT specificity score is not always a perfect
 predictor of off-target effects. Despite low scores, many tested guides
 caused little and/or weak off-target cleavage when tested with whole-genome assays
 (Figure 2 from <a href="#References">Haeussler
 et al. 2016</a>), as shown below, and the published data contains few data points
 with high specificity scores. Overall though, the assays showed that the higher
 the specificity score, the lower the off-target effects.</p>
 
 <img src="../images/crisprFig_mitScore.png">
 
 <p>Similarly, efficiency scoring is not very accurate: guides with low
 scores can be efficient and vice versa. As a general rule, however, the higher
 the score, the less likely that a guide is very inefficient. The
 following histograms illustrate, for each type of score, how the share of
 inefficient guides drops with increasing efficiency scores:
 </p>
 
 <img src="../images/crisprFig_effScores.png">
 
 <p>When reading this plot, keep in mind that both scores were evaluated on
 their own training data. Especially for the Moreno-Mateos score, the
 results are too optimistic, due to overfitting. When evaluated on independent
 datasets, the correlation of the prediction with other assays was around 25%
 lower, see <a href="#References">Haeussler et al. 2016</a>. At the time of
 writing, there is no independent dataset available yet to determine the
 Moreno-Mateos accuracy for each score percentile range.</p>
 
 <h3>Track methods</h3>
 <p>
 Exons as predicted by Ensembl Gene models were extended by 200 basepairs
 on each side and searched for the -NGG motif. Flanking 20mer guide sequences were
 aligned to the genome with BWA and scored with MIT Specificity scores using the
 command-line version of crispor.org. Non-unique guide sequences were skipped.
 Flanking sequences were extracted from the genome and used as input for Crispor
 efficiency scoring, available from the <a
 href="http://crispor.tefor.net/downloads/">Crispor downloads page</a>, which
 includes the Doench 2016, Moreno-Mateos 2015 and Bae
 2014 algorithms, among others.
 </p>
 
 <H2>Data Access</H2>
 <p>
 The raw data can be explored interactively with the <a href="../cgi-bin/hgTables">Table Browser</a>.
 For automated analysis, the genome annotation is stored in a bigBed file that
 can be downloaded from
 <a href="http://hgdownload.soe.ucsc.edu/gbdb/$db/${track}/" target="_blank">our download server</a>.
 The files for this track are called <tt>crispr.bb</tt> and <tt>crisprDetails.tab</tt> and are located in the <a href="http://hgdownload.soe.ucsc.edu/gbdb/${db}/crispr/">/gbdb/${db}/crispr</a> directory of our downloads server. Individual
 regions or the whole genome annotation can be obtained using our tool <tt>bigBedToBed</tt>,
 which can be compiled from the source code or downloaded as a precompiled
 binary for your system. Instructions for downloading source code and binaries can be found
 <a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads">here</a>. The tool
 can also be used to obtain only features within a given range, e.g. <tt>bigBedToBed
 http://hgdownload.soe.ucsc.edu/gbdb/hg19/${track}/crispr.bb -chrom=chr21
 -start=0 -end=10000000 stdout</tt> </p>
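 <p>
 For scripted access, the <tt>bigBedToBed</tt> call above could be wrapped in Python roughly
 as follows. This is a sketch under stated assumptions: the function names are hypothetical,
 <tt>bigBedToBed</tt> must already be installed and on the <tt>PATH</tt>, and the URL must
 be the fully resolved location of <tt>crispr.bb</tt> for your assembly.</p>

```python
import subprocess


def bigbed_region_command(url, chrom, start, end):
    """Build the bigBedToBed argument list for one genomic range.

    Uses the same options as the example above; "stdout" tells the
    tool to write BED lines to standard output.
    """
    return ["bigBedToBed", url,
            f"-chrom={chrom}", f"-start={start}", f"-end={end}", "stdout"]


def fetch_region(url, chrom, start, end):
    """Run bigBedToBed and return the features as lists of fields.

    Hypothetical helper; requires the bigBedToBed binary and network access.
    """
    result = subprocess.run(bigbed_region_command(url, chrom, start, end),
                            check=True, capture_output=True, text=True)
    return [line.split("\t") for line in result.stdout.splitlines()]
```

 <p>
 Passing the arguments as a list avoids shell quoting problems when the URL or chromosome
 name comes from user input.</p>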
 
 <p>
 The file crisprDetails.tab includes the details of the off-targets. The last
 column of the bigBed file is the offset of the respective line in
 crisprDetails.tab.  E.g. if the last column is 14227033723, then the following
 command will extract the line with the corresponding off-target details:
 <tt>curl -s -r 14227033723-14227043723 http://hgdownload.soe.ucsc.edu/gbdb/hg19/crispr/crisprDetails.tab | head -n1</tt>. The off-target details currently cannot be joined using the Table
 Browser.</p>
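 <p>
 The same byte-range lookup can be sketched in Python with the standard library. The function
 names are hypothetical, and the 10,000-byte window simply mirrors the <tt>curl</tt> example
 above; a details line longer than the window would be truncated, so the window may need to be
 enlarged.</p>

```python
import urllib.request


def range_header(offset, window=10000):
    """Return an HTTP Range header value covering offset..offset+window."""
    return f"bytes={offset}-{offset + window}"


def first_line(chunk: bytes) -> str:
    """Return the text up to the first newline, like `head -n1`."""
    return chunk.split(b"\n", 1)[0].decode("utf-8")


def fetch_details_line(url, offset, window=10000):
    """Fetch one line of crisprDetails.tab starting at a byte offset.

    Hypothetical helper; needs network access and a server that
    honors Range requests.
    """
    request = urllib.request.Request(
        url, headers={"Range": range_header(offset, window)})
    with urllib.request.urlopen(request) as response:
        return first_line(response.read())
```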
 
 <p>
 The file crisprDetails.tab is a tab-separated text file with two fields. The
 first field contains the number of off-targets for each mismatch count, e.g. "0,0,1,3,49"
 means 0 off-targets at zero mismatches, 0 at one mismatch, 1 at two mismatches, 3 at three and 49
 off-targets at four mismatches. The second field is a pipe-separated list of
 semicolon-separated tuples with the genome coordinates and the CFD score. E.g.
 "chr10;123376795+;42|chr5;148353274-;39" describes two off-targets, with the
 first at chr10:123376795 on the positive strand and a CFD score of 0.42.</p>
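 <p>
 A line in this format could be parsed along the following lines. The names
 <tt>OffTarget</tt> and <tt>parse_details_line</tt> are hypothetical, and dividing the stored
 integer by 100 to recover the CFD score is an assumption inferred from the example above,
 where "42" corresponds to 0.42.</p>

```python
from typing import NamedTuple


class OffTarget(NamedTuple):
    chrom: str
    pos: int
    strand: str
    cfd: float


def parse_details_line(line):
    """Split one crisprDetails.tab line into mismatch counts and off-targets."""
    counts_field, targets_field = line.rstrip("\n").split("\t", 1)
    # Index i holds the number of off-targets with i mismatches.
    counts = [int(n) for n in counts_field.split(",")]

    off_targets = []
    for item in targets_field.split("|"):
        if not item:
            continue  # tolerate an empty off-target list
        chrom, pos_strand, score = item.split(";")
        off_targets.append(OffTarget(
            chrom=chrom,
            pos=int(pos_strand[:-1]),  # digits before the strand sign
            strand=pos_strand[-1],     # "+" or "-"
            cfd=int(score) / 100,      # assumption: stored as CFD * 100
        ))
    return counts, off_targets
```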
 
 <h2>Credits</h2>
 
 <p>
 Track created by Maximilian Haeussler and Hiram Clawson, with helpful input from Jean-Paul Concordet (MNHN Paris) and Alberto Stolfi (NYU).
 </p>
 <a name="References"></a>
 <h2>References</h2>
 
 <p>
 Haeussler M, Sch&#246;nig K, Eckert H, Eschstruth A, Miann&#233; J, Renaud JB, Schneider-Maunoury S,
 Shkumatava A, Teboul L, Kent J <em>et al</em>.
 <a href="https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1012-2"
 target="_blank">Evaluation of off-target and on-target scoring algorithms and integration into the
 guide RNA selection tool CRISPOR</a>.
 <em>Genome Biol</em>. 2016 Jul 5;17(1):148.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/27380939" target="_blank">27380939</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4934014/" target="_blank">PMC4934014</a>
 </p>
 
 <p>
 Bae S, Kweon J, Kim HS, Kim JS.
 <a href="https://www.nature.com/nmeth/journal/v11/n7/full/nmeth.3015.html" target="_blank">
 Microhomology-based choice of Cas9 nuclease target sites</a>.
 <em>Nat Methods</em>. 2014 Jul;11(7):705-6.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/24972169" target="_blank">24972169</a>
 </p>
 
 <p>
 Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C,
 Orchard R <em>et al</em>.
 <a href="https://www.nature.com/articles/nbt.3437" target="_blank">
 Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9</a>.
 <em>Nat Biotechnol</em>. 2016 Feb;34(2):184-91.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/26780180" target="_blank">26780180</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4744125/" target="_blank">PMC4744125</a>
 </p>
 
 <p>
 Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O
 <em>et al</em>.
 <a href="https://www.nature.com/nbt/journal/v31/n9/full/nbt.2647.html" target="_blank">
 DNA targeting specificity of RNA-guided Cas9 nucleases</a>.
 <em>Nat Biotechnol</em>. 2013 Sep;31(9):827-32.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23873081" target="_blank">23873081</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3969858/" target="_blank">PMC3969858</a>
 </p>
 
 <p>
 Moreno-Mateos MA, Vejnar CE, Beaudoin JD, Fernandez JP, Mis EK, Khokha MK, Giraldez AJ.
 <a href="https://www.nature.com/nmeth/journal/v12/n10/full/nmeth.3543.html" target="_blank">
 CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo</a>.
 <em>Nat Methods</em>. 2015 Oct;12(10):982-8.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/26322839" target="_blank">26322839</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4589495/" target="_blank">PMC4589495</a>
 </p>