6e61d3349b36cbcc01500c1483cc7bfbc141d9ea lrnassar Wed Apr 22 13:47:33 2026 -0700
PrimateAI-3D: tighten 0.821 threshold wording per the paper. refs #37274

Confirmed against Gao 2023 (PMC10713091): the calibration cohort is the
Deciphering Developmental Disorders (DDD) neurodevelopmental cohort, not
ClinVar. The cutoff was chosen so that the count of pathogenic calls
(n=7,238) matched the excess of de novo missense mutations above the
trinucleotide background expectation in that cohort.

diff --git src/hg/makeDb/trackDb/human/primateAi.html src/hg/makeDb/trackDb/human/primateAi.html
index 7f4ea570c65..67f715c352b 100644
--- src/hg/makeDb/trackDb/human/primateAi.html
+++ src/hg/makeDb/trackDb/human/primateAi.html
@@ -25,35 +25,37 @@ each item is labeled by default with its <b>nucleotide change</b> (e.g.
 <code>C>T</code>) rather than its amino acid change. The label can be
 switched to the amino acid change via the "Label fields" control in the
 Track Settings.
 </p>
 <p>
 Hovering over a variant shows:
 </p>
 <ul>
  <li><b>Var</b> — the nucleotide substitution on the + strand
  (reference > alternate)</li>
  <li><b>AA</b> — the resulting amino acid change (single-letter
  reference > alternate)</li>
  <li><b>Score</b> — the raw PrimateAI-3D pathogenicity score (0–1).
  The authors suggest a clinical threshold of <b>0.821</b> for
- distinguishing pathogenic from benign missense variants. This
- threshold was calibrated against a subset of annotated mutations
- in Gao et al. 2023 (Fig. 5A), chosen so that the number of
- PrimateAI-3D pathogenic calls matched the observed excess of de
- novo missense mutations in a clinical cohort (n = 7,238).</li>
+ distinguishing pathogenic from benign missense variants. In Gao
+ et al. 2023 (Fig. 5A) this threshold was derived from the
+ Deciphering Developmental Disorders (DDD) neurodevelopmental
+ cohort: the cutoff was chosen so that the number of variants
+ scored as pathogenic (n = 7,238) matched the observed
+ excess of de novo missense mutations above the trinucleotide
+ background expectation in that cohort.</li>
  <li><b>Perc</b> — the percentile rank of the raw score across all
  scored variants (0–1). The track score field (0–1000) is this
  value scaled by 1000.</li>
  <li><b>Pred</b> — Illumina's binary call:
  <span style="color:#0000c8">benign</span> or
  <span style="color:#c80000">pathogenic</span>, as provided in the
  source file. About 75% of variants in the track are benign and 25%
  pathogenic. Note that this call is <em>not</em> a simple
  application of the 0.821 raw-score threshold — some variants with
  raw scores below 0.821 are labeled pathogenic and vice versa.</li>
 </ul>
 <p>
 Items can be filtered by prediction (benign/pathogenic), by raw
 PrimateAI-3D score, or by percentile.