f350ebff8f2cc1e0772032e59e926b5e45b374cd lrnassar Tue May 5 16:18:28 2026 -0700
Adding ClinPred missense pathogenicity score track on hg19 and hg38. refs #37510

ClinPred (Alirezaie et al, AJHG 2018) joins the predictionScoresSuper supertrack
as a composite of four bigWigs, one per alternate base, with a per-position color
overlay (red for score >= 0.5 likely pathogenic, blue for < 0.5 likely benign).
Adds clinPredToWig.py to convert the upstream score table to wig, a clinPred
branch in makeWigColorByRevelCadd.py for the color overlay step, and reciprocal
relatedTracks entries to REVEL, CADD, PrimateAI-3D, and AlphaMissense. Also adds
Display Conventions and Credits entries in predictionScoresSuper.html for
ClinPred, PrimateAI-3D, and PromoterAI.

diff --git src/hg/makeDb/trackDb/human/clinPred.html src/hg/makeDb/trackDb/human/clinPred.html
new file mode 100644
index 00000000000..1748d7654b5
--- /dev/null
+++ src/hg/makeDb/trackDb/human/clinPred.html
@@ -0,0 +1,126 @@
+<h2>Description</h2>
+
+<p>
+This track collection shows
+<a href="https://sites.google.com/site/clinpred/" target="_blank">ClinPred</a>
+scores, a machine-learning predictor of pathogenicity for nonsynonymous (missense)
+single-nucleotide variants. ClinPred combines existing pathogenicity scores with
+population allele frequency from gnomAD, and was trained on confidently annotated
+disease-causing and benign variants from ClinVar. Pre-computed scores are
+provided for all possible human missense variants in the exome.
+</p>
+
+<p>
+Scores range from 0 to 1, with higher values indicating greater predicted
+likelihood that a variant is disease-relevant. The authors recommend a score of
+≥ 0.5 as evidence of pathogenicity. As with any pathogenicity prediction
+score, ClinPred is intended as supporting evidence rather than a stand-alone
+classifier.
+</p>
+
+<h2>Display Conventions and Configuration</h2>
+
+<p>
+There are four subtracks in this collection, one for each possible alternate
+nucleotide. At every exome position covered by ClinPred, three of the four
+subtracks show a score (one per non-reference base) and the fourth, corresponding
+to the reference base, is set to 0. Synonymous alternates, those that do
+not change the encoded amino acid, are also set to 0, since ClinPred only
+scores missense variants. Positions with no exome coverage are shown as gaps.
+</p>
+
+<p>
+When using this track, zoom in until you can see every basepair at the top of
+the display. Otherwise, several nucleotides fall under each pixel and no score
+will be shown on the mouseover tooltip.
+</p>
+
+<p><b>Track colors</b></p>
+
+<p>
+Each subtrack is colored by score using the threshold recommended by the
+ClinPred authors:
+</p>
+
+<table style="text-align: left;">
+  <thead>
+    <tr>
+      <th>Range</th>
+      <th>Classification</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>≥ 0.5</td>
+      <td style="color: rgb(255,0,0);">Likely pathogenic</td>
+    </tr>
+    <tr>
+      <td>&lt; 0.5</td>
+      <td style="color: rgb(80,166,230);">Likely benign</td>
+    </tr>
+  </tbody>
+</table>
+
+<h2>Data Access</h2>
+
+<p>
+ClinPred scores are available at the
+<a href="https://sites.google.com/site/clinpred/" target="_blank">ClinPred
+website</a>, which provides pre-computed scores for all possible human missense
+variants.
+</p>
+
+<p>
+The ClinPred data on the UCSC Genome Browser can be explored interactively with
+the <a href="../cgi-bin/hgTables">Table Browser</a> or the
+<a href="../cgi-bin/hgIntegrator">Data Integrator</a>. For automated download
+and analysis, the data are stored in bigWig files and can be downloaded from
+<a href="http://hgdownload.soe.ucsc.edu/gbdb/$db/clinPred/" target="_blank">our
+download server</a>. The files are named <tt>a.bw, c.bw, g.bw, t.bw</tt>.
+Individual regions can be obtained using <tt>bigWigToBedGraph</tt>, which can
+be compiled from source or downloaded as a precompiled binary; instructions
+are <a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads" target="_blank">here</a>.
+For example:
+<br> <br>
+<tt>bigWigToBedGraph -chrom=chr1 -start=100000 -end=100500 http://hgdownload.soe.ucsc.edu/gbdb/$db/clinPred/a.bw stdout</tt>
+</p>
+
+<h2>Methods</h2>
+
+<p>
+Data were downloaded from the
+<a href="https://sites.google.com/site/clinpred/" target="_blank">ClinPred
+website</a> on May 5, 2026, and converted into four per-alternate-base bigWig
+files using a custom script. As with all other tracks, a full log of the
+commands used for the conversion is available in our
+<a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/" target="_blank">source repository</a>, for
+<a href="https://raw.githubusercontent.com/ucscGenomeBrowser/kent/master/src/hg/makeDb/doc/hg19.txt" target="_blank">hg19</a> and
+<a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/clinPred.txt" target="_blank">hg38</a>.
+</p>
+
+<h2>Data Use</h2>
+
+<p>
+ClinPred scores are freely available for non-commercial applications. For
+commercial use, please see the licensing information on the
+<a href="https://sites.google.com/site/clinpred/" target="_blank">ClinPred
+website</a>.
+</p>
+
+<h2>Credits</h2>
+
+<p>
+Thanks to the ClinPred authors for making the pre-computed scores available
+through their website.
+</p>
+
+<h2>References</h2>
+
+<p>
+Alirezaie N, Kernohan KD, Hartley T, Majewski J, Hocking TD.
+<a href="https://linkinghub.elsevier.com/retrieve/pii/S0002-9297(18)30271-4" target="_blank">
+ClinPred: Prediction Tool to Identify Disease-Relevant Nonsynonymous Single-Nucleotide Variants</a>.
+<em>Am J Hum Genet</em>. 2018 Oct 4;103(4):474-483.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/30220433" target="_blank">30220433</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6174354/" target="_blank">PMC6174354</a>
+</p>