3a884d87e2f5198d79e8c628fad94112ed4d1a01 max Mon Feb 15 05:48:31 2021 -0800 updating CADD track docs, refs #18492 diff --git src/hg/makeDb/trackDb/human/caddSuper.html src/hg/makeDb/trackDb/human/caddSuper.html new file mode 100644 index 0000000..68802e1 --- /dev/null +++ src/hg/makeDb/trackDb/human/caddSuper.html @@ -0,0 +1,110 @@ +
This track collection shows Combined Annotation Dependent Depletion scores. +CADD is a tool for scoring the deleteriousness of single nucleotide variants as +well as insertion/deletion variants in the human genome.
+ ++Some mutation annotations +tend to exploit a single information type (e.g. phastCons or phyloP for +conservation) and/or are restricted in scope (e.g. to missense changes). Thus, +a broadly applicable metric that objectively weights and integrates diverse +information is needed. Combined Annotation Dependent Depletion (CADD) is a +framework that integrates multiple annotations into one metric by contrasting +variants that survived natural selection with simulated mutations. +
+ ++CADD scores strongly correlate with allelic diversity, pathogenicity of both +coding and non-coding variants, and experimentally measured regulatory effects, +and also rank causal variants highly within individual genome sequences. +Finally, CADD scores of complex trait-associated variants from genome-wide +association studies (GWAS) are significantly higher than those of matched controls and +correlate with study sample size, likely reflecting the increased accuracy of +larger GWAS. +
+ ++There are six subtracks of this track: four for single nucleotide mutations (one per possible mutated base, A, C, G and T), +one for insertions and one for deletions. All subtracks show the CADD Phred +score on mouse over.
+ +
+Single nucleotide variants (SNV): For SNVs, every
+genome position has three values, one for each possible
+nucleotide mutation. The fourth value, "no mutation", e.g. A to A, is always
+set to zero.
+When using this track, please zoom in until you can see every basepair at the
+top of the display. Otherwise, each pixel under your mouse cursor covers several
+nucleotides, and instead of an actual score, the tooltip text can only show
+the average score of all nucleotides under the cursor. This is indicated by
+the prefix "~" in the mouse over. Averages of scores are not useful for any
+application of CADD.
+
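The SNV layout described above can be sketched in a few lines of Python. This is only an illustration, assuming the four SNV subtracks are organized one per mutated base (as the file names a.bw, c.bw, g.bw and t.bw suggest). The function names and the example values are hypothetical and are not part of any UCSC or CADD tool.

```python
BASES = ("A", "C", "G", "T")


def snv_scores(ref_base, track_values):
    """Return {alt_base: score} for the three possible SNVs at one position.

    track_values maps each base to the value read from the subtrack for
    mutations to that base (assumed layout, one subtrack per mutated base).
    The entry for the reference base is the "no mutation" value, which is
    always zero, so it is left out of the result.
    """
    ref = ref_base.upper()
    if ref not in BASES:
        raise ValueError(f"unexpected reference base: {ref_base!r}")
    return {alt: track_values[alt] for alt in BASES if alt != ref}


def is_averaged(tooltip_text):
    """True if a mouse-over value carries the "~" prefix, i.e. is an average."""
    return tooltip_text.strip().startswith("~")


# Made-up values for a position whose reference base is A.
example = {"A": 0.0, "C": 12.3, "G": 8.1, "T": 15.0}
print(snv_scores("A", example))
```

A value for which `is_averaged` returns true was read while zoomed out too far and should not be used as a CADD score.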
Insertions and deletions: Scores are also shown on mouse over for a +set of insertions and deletions. On hg38, the set has been obtained from +gnomAD3. On hg19, the set of indels has been obtained from various sources +(gnomAD2, ExAC, 1000 Genomes, ESP). If your insertion or deletion of interest +is not in the track, you will need to use CADD's +Online scoring tool +to obtain its score.
+ ++CADD scores are freely available for all non-commercial applications from the CADD website. For commercial applications, see the license instructions there. +
+ +
+The CADD data on the UCSC Genome Browser can be explored interactively with the
+Table Browser or the Data Integrator.
+For automated download and analysis, the genome annotation is stored at UCSC in bigWig and bigBed files that
+can be downloaded from
+our download server.
+The files for this track are called a.bw, c.bw, g.bw, t.bw, ins.bb and del.bb. Individual
+regions or the whole genome annotation can be obtained using our tools bigWigToWig
+or bigBedToBed, which can be compiled from the source code or downloaded as a precompiled
+binary for your system. Instructions for downloading source code and binaries can be found
+here.
+The tools
+can also be used to obtain only features within a given range, e.g.
+bigWigToBedGraph -chrom=chr1 -start=100000 -end=100500 http://hgdownload.soe.ucsc.edu/gbdb/$db/cadd/a.bw stdout
+or
+bigBedToBed -chrom=chr1 -start=100000 -end=100500 http://hgdownload.soe.ucsc.edu/gbdb/$db/cadd/ins.bb stdout
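For scripted access, the two commands above can be wrapped in Python. The following is a minimal sketch, assuming the UCSC command-line tools are installed and on the PATH, and that `$db` is replaced by an assembly name such as hg38 or hg19. The helper names (`build_command`, `parse_bedgraph`, `fetch_scores`) are hypothetical. The parser relies on the standard bedGraph layout of chrom, start, end and value, with 0-based, half-open coordinates.

```python
import os
import subprocess

# URL pattern taken from the commands above, with $db as a placeholder.
BASE_URL = "http://hgdownload.soe.ucsc.edu/gbdb/{db}/cadd/{name}"

# Which UCSC tool reads which file type.
TOOLS = {".bw": "bigWigToBedGraph", ".bb": "bigBedToBed"}


def build_command(db, name, chrom, start, end):
    """Build the argument list for one range query on a CADD file."""
    suffix = os.path.splitext(name)[1]
    if suffix not in TOOLS:
        raise ValueError(f"unsupported file type: {name!r}")
    url = BASE_URL.format(db=db, name=name)
    return [TOOLS[suffix], f"-chrom={chrom}", f"-start={start}",
            f"-end={end}", url, "stdout"]


def parse_bedgraph(text):
    """Expand bedGraph lines into {(chrom, 0-based position): score}."""
    scores = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        chrom, start, end, value = line.split()[:4]
        for pos in range(int(start), int(end)):
            scores[(chrom, pos)] = float(value)
    return scores


def fetch_scores(db, name, chrom, start, end):
    """Run bigWigToBedGraph on a .bw file and return per-base scores."""
    cmd = build_command(db, name, chrom, start, end)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_bedgraph(result.stdout)
```

For example, `fetch_scores("hg38", "a.bw", "chr1", 100000, 100500)` would run the first command shown above and return one score per base in that range. Only `build_command` applies to the ins.bb and del.bb files, since bigBedToBed output is BED rather than bedGraph.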
+Data were converted from the files on +the CADD Downloads website, provided by the Kircher lab, +using custom Python scripts +documented in our makeDoc files. +
+ ++Thanks to the CADD development team for providing precomputed data as simple tab-separated files. +
+ ++Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J. + + A general framework for estimating the relative pathogenicity of human genetic variants. +Nat Genet. 2014 Mar;46(3):310-5. +PMID: 24487276; PMC: PMC3992975 +
+ ++Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. + + CADD: predicting the deleteriousness of variants throughout the human genome. +Nucleic Acids Res. 2019 Jan 8;47(D1):D886-D894. +PMID: 30371827; PMC: PMC6323892 +
+