e3d41cdb5005a8688ce803eca9b0b4fd9574a43f
max
Tue May 10 00:58:13 2022 -0700
fixing typo in constraint docs page, refs #29152

diff --git src/hg/makeDb/trackDb/human/constraintSuper.html src/hg/makeDb/trackDb/human/constraintSuper.html
index 3c2287d..8104597 100644
--- src/hg/makeDb/trackDb/human/constraintSuper.html
+++ src/hg/makeDb/trackDb/human/constraintSuper.html
@@ -1,75 +1,94 @@
 <h2>Description</h2>
 <p>
 The "Constraint scores" container track includes several subtracks showing the results of constraint prediction algorithms. These try to find regions of negative selection, where variations likely have functional impact. The algorithms do not use multi-species alignments to derive evolutionary constraint, but use primarily human variation, usually from variants collected by gnomAD (see the gnomAD V2 or V3 tracks on hg19 and hg38) or TOPMED (contained in our dbSNP tracks and available as a filter). Another constraint score, gnomAD constraint, is not part of this container but can be found in the hg38 gnomAD track. The algorithms covered here are:
 <ol>
 <li><b><a href="https://github.com/astrazeneca-cgr-publications/jarvis" target="_blank">
 JARVIS - "Junk" Annotation genome-wide Residual Variation Intolerance Score</a></b>: First scan the entire genome with a sliding-window approach (using a 1-nucleotide step), recording the number of all TOPMED variants and common variants, irrespective of their predicted effect, within each window, to eventually calculate a single-nucleotide resolution genome-wide residual variation intolerance score (gwRVIS). Then combine gwRVIS, primary genomic sequence context, and additional genomic annotations with a multi-module deep learning framework to infer
- pathogenicity of noncoding regions that still remains naïve to existing
- phylogenetic conservation metrics
+ pathogenicity of noncoding regions that still remains naive to existing
+ phylogenetic conservation metrics.
The higher the score, the more deleterious
+ is the prediction.
 <li><b><a href="https://www.cardiodb.org/hmc/" target="_blank">
 HMC - Homologous Missense Constraint</a></b>: Homologous Missense Constraint (HMC) is a amino acid level measure of genetic intolerance of missense variants within human populations. For all assessable amino-acid positions in Pfam domains, the number of missense substitutions directly observed in gnomAD (Observed) was counted and compared to the expected value under a neutral evolution model (Expected). The upper limit of a 95% confidence interval for the Observed/Expected ratio is defined as the HMC score. Missense variants disrupting the amino-acid positions with HMC<0.8 are predicted to be likely deleterious
 </ol>
 <h2>Display Conventions and Configuration</h2>
 <p>
 Shown are the scores as a signal ("wiggle") track, with one score per genome position. Mouse over the bars to see the exact values.
 </p>
 <p>
 For HMC, the highly-constrained cutoff 0.8 is indicated with a line.
 </p>
 <h2>Methods</h2>
+
 <p>
-<b>HMC:</b> Scores were downloaded and converted to .bedGraph files with a custom Python script. The bedGraph files were then converted to bigWig files, as documented in our <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg19.txt" target=_blank>makeDoc</a> hg19 build log.<br>
-<b>Jarvis:</b> Scores were downloaded and converted to a single bigWig file. See <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg19.txt" target=_blank>hg19 makeDoc</a> and <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/jarvis.txt" target=_blank>hg38 makeDoc</a>
+<b>Jarvis:</b> Scores were downloaded and converted to a single bigWig file.
+See <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg19.txt" target=_blank>hg19 makeDoc</a> and
+<a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/jarvis.txt" target=_blank>hg38 makeDoc</a>.
-</p>
+<br>
+
+<b>HMC:</b> Scores were downloaded and converted to .bedGraph files with a
+custom Python script. The bedGraph files were then converted to bigWig files,
+as documented in our <a
+href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg19.txt"
+target=_blank>makeDoc</a> hg19 build log.
 </p>
 <h2>Credits</h2>
 <p>
 Thanks to Jean-Madeleine Desainteagathe (APHP Paris, France) for suggesting the Jarvis, MTR, HMC tracks. Thanks to Xialei Zhang for providing the HMC data file and to Dimitrios Vitsios for helping clean up the hg38 Jarvis files.
 </p>
 <h2>References</h2>
 <p>
 Xiaolei Zhang, Pantazis I. Theotokis, Nicholas Li, the SHaRe Investigators, Caroline F. Wright, Kaitlin E. Samocha, Nicola Whiffin, James S. Ware
 <a href="https://doi.org/10.1101/2022.02.16.22271023" target="_blank">
 Genetic constraint at single amino acid resolution improves missense variant prioritisation and gene discovery</a>.
 <em>Medrxiv</em> 2022.02.16.22271023
 </p>
+<p>
+Vitsios D, Dhindsa RS, Middleton L, Gussow AB, Petrovski S.
+<a href="https://www.ncbi.nlm.nih.gov/pubmed/33686085" target="_blank">
+ Prioritizing non-coding regions based on human genomic constraint and sequence context with deep
+ learning</a>.
+<em>Nat Commun</em>. 2021 Mar 8;12(1):1504.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/33686085" target="_blank">33686085</a>; PMC: <a
+ href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7940646/" target="_blank">PMC7940646</a>
+</p>
+
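
The Methods text in the diff above says the HMC scores were converted to .bedGraph with a custom Python script; that script is not shown here. The following is only a minimal sketch of what such a conversion could look like, assuming a hypothetical input of `(chrom, 1-based position, score)` tuples already sorted by chromosome and position. The function name `scores_to_bedgraph` and the input layout are illustrative assumptions, not the actual UCSC code. It relies only on the bedGraph convention of 0-based, half-open intervals with four tab-separated columns.

```python
def scores_to_bedgraph(records):
    """Yield bedGraph lines from (chrom, pos_1based, score) tuples.

    Assumes (hypothetically) that records are sorted by chromosome and
    position. Adjacent positions with the same score are merged into
    one interval to keep the file small.
    """
    run = None  # current interval: (chrom, start0, end0, score)
    for chrom, pos, score in records:
        start = pos - 1  # bedGraph starts are 0-based
        if run and run[0] == chrom and run[2] == start and run[3] == score:
            # Extend the current interval by one base.
            run = (chrom, run[1], start + 1, score)
        else:
            if run:
                yield "%s\t%d\t%d\t%g" % run
            run = (chrom, start, start + 1, score)
    if run:
        yield "%s\t%d\t%d\t%g" % run


if __name__ == "__main__":
    # Made-up example values, not real HMC scores.
    example = [("chr1", 101, 0.5), ("chr1", 102, 0.5), ("chr1", 103, 0.9)]
    for line in scores_to_bedgraph(example):
        print(line)
```

A bedGraph file written this way would typically be turned into a bigWig with the `bedGraphToBigWig` tool from the UCSC utilities, which takes the bedGraph file, a chromosome sizes file, and the output file name. The exact commands used for these tracks are the ones recorded in the linked makeDoc files.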