2065b8b9a610a9fe59fdb593aabe88c76962c168 jnavarr5 Thu Feb 25 14:30:45 2021 -0800 Replacing double quotes with the HTML entity, refs #18492 diff --git src/hg/makeDb/trackDb/human/caddSuper.html src/hg/makeDb/trackDb/human/caddSuper.html index d159e19..cbc05cc 100644 --- src/hg/makeDb/trackDb/human/caddSuper.html +++ src/hg/makeDb/trackDb/human/caddSuper.html @@ -1,110 +1,110 @@

Description

This track collection shows Combined Annotation Dependent Depletion (CADD) scores. CADD is a tool for scoring the deleteriousness of single nucleotide variants as well as insertion/deletion variants in the human genome.

Some mutation annotations tend to exploit a single information type (e.g. phastCons or phyloP for conservation) and/or are restricted in scope (e.g. to missense changes). Thus, a broadly applicable metric that objectively weights and integrates diverse information is needed. Combined Annotation Dependent Depletion (CADD) is a framework that integrates multiple annotations into one metric by contrasting variants that survived natural selection with simulated mutations.

CADD scores strongly correlate with allelic diversity, pathogenicity of both coding and non-coding variants, and experimentally measured regulatory effects, and also highly rank causal variants within individual genome sequences. Finally, CADD scores of complex trait-associated variants from genome-wide association studies (GWAS) are significantly higher than matched controls and correlate with study sample size, likely reflecting the increased accuracy of larger GWAS.

Display Conventions and Configuration

There are six subtracks of this track: four for single nucleotide mutations (one per possible mutated nucleotide), one for insertions and one for deletions. All subtracks show the CADD Phred score on mouseover.

Single nucleotide variants (SNV): For SNVs, there are three values at every genome position, one for every possible nucleotide mutation. The fourth value, "no mutation", e.g. A to A, is always set to zero.
When using this track, please zoom in until you can see every basepair at the top of the display. Otherwise, there are several nucleotides under your mouse cursor per pixel, and instead of an actual score the tooltip text can only show the average score of all nucleotides under the cursor, which is indicated by the prefix "~" in the mouseover. Averages of scores are not useful for any application of CADD.
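To illustrate why an average over several nucleotides under the cursor hides the per-base signal, here is a minimal sketch that computes a length-weighted mean over a few bedGraph-style rows. The rows are invented illustration values, not real CADD scores, and the helper name window_mean is hypothetical; the browser's own averaging may differ in detail.

```shell
# Hypothetical bedGraph-style rows (chrom start end score).
# The scores are invented for illustration only.
rows='chr1 100 101 0
chr1 101 102 25.3
chr1 102 103 0.2
chr1 103 104 0.5'

# Length-weighted mean of the score column across the window.
window_mean() {
  awk '{ len = $3 - $2; sum += $4 * len; n += len }
       END { if (n > 0) printf "%.2f\n", sum / n }'
}

printf '%s\n' "$rows" | window_mean   # prints 6.50
```

In this made-up window a single base scores 25.3, yet the average is 6.50, so the one high-scoring position is no longer visible in the summary value.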

Insertions and deletions: Scores are also shown on mouseover for a set of insertions and deletions. On hg38, the set has been obtained from gnomAD3. On hg19, the set of indels has been obtained from various sources (gnomAD2, ExAC, 1000 Genomes, ESP). If your insertion or deletion of interest is not in the track, you will need to use CADD's Online scoring tool to obtain its score.

Data access

CADD scores are freely available for all non-commercial applications from the CADD website. For commercial applications, see the license instructions there.

The CADD data on the UCSC Genome Browser can be explored interactively with the Table Browser or the Data Integrator. For automated download and analysis, the genome annotation is stored at UCSC in bigWig and bigBed files that can be downloaded from our download server. The files for this track are called a.bw, c.bw, g.bw, t.bw, ins.bb and del.bb. Individual regions or the whole genome annotation can be obtained using our tools bigWigToWig or bigBedToBed, which can be compiled from the source code or downloaded as precompiled binaries for your system. Instructions for downloading source code and binaries can be found here. The tools can also be used to obtain only features within a given range, e.g.
bigWigToBedGraph -chrom=chr1 -start=100000 -end=100500 http://hgdownload.soe.ucsc.edu/gbdb/$db/cadd/a.bw stdout
or
bigBedToBed -chrom=chr1 -start=100000 -end=100500 http://hgdownload.soe.ucsc.edu/gbdb/$db/cadd/ins.bb stdout
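The two commands above can be generalized to all six files of this track. The following is a sketch only: the function name cadd_commands is hypothetical, hg38 is used as an example value for $db, and the function prints the commands rather than running them, so it works without the UCSC tools installed.

```shell
# Print (not run) one range query per CADD file for a given assembly and region.
# Arguments: db (e.g. hg38 or hg19), chrom, start, end.
cadd_commands() {
  db="$1"; chrom="$2"; start="$3"; end="$4"
  base="http://hgdownload.soe.ucsc.edu/gbdb/$db/cadd"
  # bigWig files holding the SNV scores
  for f in a.bw c.bw g.bw t.bw; do
    echo "bigWigToBedGraph -chrom=$chrom -start=$start -end=$end $base/$f stdout"
  done
  # bigBed files holding the insertion and deletion scores
  for f in ins.bb del.bb; do
    echo "bigBedToBed -chrom=$chrom -start=$start -end=$end $base/$f stdout"
  done
}

cadd_commands hg38 chr1 100000 100500
```

Once bigWigToBedGraph and bigBedToBed are on your PATH, the printed lines can be piped to sh to run the queries.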

Methods

Data were converted from the files on the CADD Downloads website, provided by the Kircher lab, using custom Python scripts documented in our makeDoc files.

Credits

Thanks to the CADD development team for providing precomputed data as simple tab-separated files.

References

Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014 Mar;46(3):310-5. PMID: 24487276; PMC: PMC3992975

Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019 Jan 8;47(D1):D886-D894. PMID: 30371827; PMC: PMC6323892