d498335adfefc70cbf9673f489688bd34503d521 max Thu May 5 09:50:40 2022 -0700
more polishing up of the constraint scores super track, refs #29153 and refs #29154

diff --git src/hg/makeDb/trackDb/human/constraintSuper.html src/hg/makeDb/trackDb/human/constraintSuper.html
index 7f62572..eb1af9f 100644
--- src/hg/makeDb/trackDb/human/constraintSuper.html
+++ src/hg/makeDb/trackDb/human/constraintSuper.html
@@ -1,57 +1,61 @@
 <h2>Description</h2>
 <p>
-This container track includes various subtracks showing the results of
-constraint prediction algorithms that try to find regions of negative
-selection, where variations likely have functional impact. These algorithms do
-not use multi-species alignments to derive constraint, but use primarily human
-variation, usually from variants collected by gnomAD (see the gnomAD V2 or V3
-tracks on hg19 and hg38)
-or TOPMED (contained in our dbSNP tracks and available as a filter).
+The "Constraint scores" container track includes several subtracks showing the results of
+constraint prediction algorithms. These try to find regions of negative
+selection, where variations likely have functional impact. The algorithms do
+not use multi-species alignments to derive evolutionary constraint, but use
+primarily human variation, usually from variants collected by gnomAD (see the
+gnomAD V2 or V3 tracks on hg19 and hg38) or TOPMED (contained in our dbSNP
+tracks and available as a filter). Another constraint score, gnomAD
+constraint, is not part of this container but can be found in the hg38 gnomAD
+track. The algorithms covered here are:
 <ol>
 <li><b><a href="https://www.cardiodb.org/hmc/" target="_blank">
 HMC - Homologous Missense Constraint</a></b>: For all possible missense substitutions in PFAM domains, the number of substitutions directly observed in gnomAD was counted and compared to the expected value under a neutral evolution model.
 Homologous Residue Constraint is defined as the upper limit of a 95% confidence interval for the Observed/Expected ratio.
 <li><b><a href="https://github.com/astrazeneca-cgr-publications/jarvis" target="_blank">
 JARVIS - "Junk" Annotation genome-wide Residual Variation Intolerance Score</a></b>: First scan the entire genome with a sliding-window approach (using a 1-nucleotide step), recording the number of all TOPMED variants and common variants, irrespective of their predicted effect, within each window, to eventually calculate a single-nucleotide resolution genome-wide residual variation intolerance score (gwRVIS). Then combine gwRVIS, primary genomic sequence context, and additional genomic annotations with a multi-module deep learning framework to infer pathogenicity of noncoding regions that still remains naïve to existing phylogenetic conservation metrics
 </ol>
 <h2>Display Conventions and Configuration</h2>
 <h2>Methods</h2>
 <p>
-<b>HMC:</b>
+<b>HMC:</b> Scores were downloaded and converted to .bedGraph files with a custom Python script. The bedGraph files were then converted to bigWig files, as documented in our <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg19.txt" target=_blank>makeDoc</a> hg19 build log.<br>
+<b>Jarvis:</b> Scores were downloaded and converted to a single bigWig file. See <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg19.txt" target=_blank>hg19 makeDoc</a> and <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/jarvis.txt" target=_blank>hg38 makeDoc</a>
+
 </p>
 <h2>Credits</h2>
 <p>
-Thanks to Jean-Madeleine Desainteagathe (APHP Paris, France) for suggesting the HMC track and to Xialei Zhang for providing the HMC data file.
+Thanks to Jean-Madeleine Desainteagathe (APHP Paris, France) for suggesting the Jarvis, MTR, HMC tracks. Thanks to Xialei Zhang for providing the HMC data file and to Dimitrios Vitsios for helping clean up the hg38 Jarvis files.
 </p>
 <h2>References</h2>
 <p>
 Xiaolei Zhang, Pantazis I. Theotokis, Nicholas Li, the SHaRe Investigators, Caroline F. Wright, Kaitlin E. Samocha, Nicola Whiffin, James S. Ware
 <a href="https://doi.org/10.1101/2022.02.16.22271023" target="_blank">
 Genetic constraint at single amino acid resolution improves missense variant prioritisation and gene discovery</a>.
 <em>Medrxiv</em> 2022.02.16.22271023
 </p>