3a884d87e2f5198d79e8c628fad94112ed4d1a01 max Mon Feb 15 05:48:31 2021 -0800 updating CADD track docs, refs #18492 diff --git src/hg/makeDb/trackDb/human/cadd.html src/hg/makeDb/trackDb/human/cadd.html deleted file mode 100644 index 68802e1..0000000 --- src/hg/makeDb/trackDb/human/cadd.html +++ /dev/null @@ -1,110 +0,0 @@ -<h2>Description</h2> - -<p> This track collection shows <a href="https://cadd.gs.washington.edu/" -target="_blank">Combined Annotation Dependent Depletion</a> scores. -CADD is a tool for scoring the deleteriousness of single nucleotide variants as -well as insertion/deletion variants in the human genome.</p> - -<p> -Some mutation annotations -tend to exploit a single information type (e.g. phastCons or phyloP for -conservation) and/or are restricted in scope (e.g. to missense changes). Thus, -a broadly applicable metric that objectively weights and integrates diverse -information is needed. Combined Annotation Dependent Depletion (CADD) is a -framework that integrates multiple annotations into one metric by contrasting -variants that survived natural selection with simulated mutations. -</p> - -<p> -CADD scores strongly correlate with allelic diversity, pathogenicity of both -coding and non-coding variants, and experimentally measured regulatory effects, -and also rank causal variants highly within individual genome sequences. -Finally, CADD scores of complex trait-associated variants from genome-wide -association studies (GWAS) are significantly higher than matched controls and -correlate with study sample size, likely reflecting the increased accuracy of -larger GWAS. -</p> - -<h2>Display Conventions and Configuration</h2> -<p> -There are six subtracks of this track: four for single nucleotide mutations (one per mutated base: A, C, G and T), -one for insertions and one for deletions. 
All subtracks show the CADD Phred -score on mouse over.</p> - -<p> -<b>Single nucleotide variants (SNV):</b> For SNVs, there are three values at every -genome position, one for every possible -nucleotide mutation. The fourth value, "no mutation", e.g. A to A, is always -set to zero.<br> -When using this track, please zoom in until you can see every basepair at the -top of the display. Otherwise, several nucleotides fall under each pixel at your mouse -cursor, and instead of an actual score the tooltip text can only show -the average score of all nucleotides under the cursor, indicated by -the prefix "~" in the mouse over. Averages of scores are not useful for any -application of CADD. -</p> - -<p><b>Insertions and deletions:</b> Scores are also shown on mouse over for a -set of insertions and deletions. On hg38, the set has been obtained from -gnomAD3. On hg19, the set of indels has been obtained from various sources -(gnomAD2, ExAC, 1000 Genomes, ESP). If your insertion or deletion of interest -is not in the track, you will need to use CADD's -<a target=_blank href="https://cadd.gs.washington.edu/score">Online scoring tool</a> -to obtain its score.</p> - -<h2>Data access</h2> -<p> -CADD scores are freely available for all non-commercial applications from <a target=_blank href="https://cadd.gs.washington.edu/download">the CADD website</a>. For commercial applications, see <a target=_blank href="https://cadd.gs.washington.edu/contact">the license instructions</a> there. -</p> - -<p> -The CADD data on the UCSC Genome Browser can be explored interactively with the -<a href="../cgi-bin/hgTables">Table Browser</a> or the <a -href="../cgi-bin/hgIntegrator">Data Integrator</a>. -For automated download and analysis, the genome annotation is stored at UCSC in bigWig and bigBed files that -can be downloaded from -<a href="http://hgdownload.soe.ucsc.edu/gbdb/$db/cadd/" target="_blank">our download server</a>. 
-The files for this track are called <tt>a.bw, c.bw, g.bw, t.bw, ins.bb and del.bb</tt>. Individual -regions or the whole genome annotation can be obtained using our tool <tt>bigWigToWig</tt> -or <tt>bigBedToBed</tt> which can be compiled from the source code or downloaded as a precompiled -binary for your system. Instructions for downloading source code and binaries can be found -<a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads">here</a>. -The tools -can also be used to obtain only features within a given range, e.g. <br> -<tt>bigWigToBedGraph -chrom=chr1 -start=100000 -end=100500 http://hgdownload.soe.ucsc.edu/gbdb/$db/cadd/a.bw stdout</tt><br> -or<br> -<tt>bigBedToBed -chrom=chr1 -start=100000 -end=100500 http://hgdownload.soe.ucsc.edu/gbdb/$db/cadd/ins.bb stdout</tt></p> - -<h2>Methods</h2> - -<p> -Data were converted from the files provided on -<a href="https://cadd.gs.washington.edu/download" target="_blank">the CADD Downloads website</a>, provided by the Kircher lab, -using <a href="https://github.com/ucscGenomeBrowser/kent/tree/master/src/hg/makeDb/cadd" target=_BLANK>custom Python scripts</a>, -documented in our <a target=_BLANK href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/cadd.txt">makeDoc</a> files. -</p> - -<h2>Credits</h2> -<p> -Thanks to the CADD development team for providing precomputed data as simple tab-separated files. -</p> - -<h2>References</h2> -<p> -Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J. -<a href="http://dx.doi.org/10.1038/ng.2892" target="_blank"> - A general framework for estimating the relative pathogenicity of human genetic variants</a>. -<em>Nat Genet</em>. 2014 Mar;46(3):310-5. -PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/24487276" target="_blank">24487276</a>; PMC: <a - href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3992975/" target="_blank">PMC3992975</a> -</p> - -<p> -Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. 
-<a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gky1016" target="_blank"> - CADD: predicting the deleteriousness of variants throughout the human genome</a>. -<em>Nucleic Acids Res</em>. 2019 Jan 8;47(D1):D886-D894. -PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/30371827" target="_blank">30371827</a>; PMC: <a - href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6323892/" target="_blank">PMC6323892</a> -</p> -