6c567fd9a03e87610681a43d2183ebb43547d1ad lrnassar Fri Apr 24 17:58:57 2026 -0700

PromoterAI: review followups. refs #37278

Move /gbdb/hg38/promoterAi/ to /gbdb/hg38/_promoterAi/ to match the underscore-prefix exclusion rule for hgdownload sync (same pattern as PrimateAI-3D under refs #37274). bigDataUrls and the makedoc updated.

Bump bigWig maxHeightPixels from 128:20:8 to 128:40:8 -- the peer-track default of 20 is too cramped for a signed -1..+1 score.

Description page: drop the wrong primateai3d.basespace.illumina.com link in Data Access; PromoterAI is not on BaseSpace, it's distributed via the license agreement on the GitHub page (a download link is emailed after submission). Reword Data Access and Methods accordingly.

Description page: add Illumina's recommended interpretation thresholds (|score| >= 0.1, >= 0.2, >= 0.5) from the PromoterAI GitHub README, with a note that higher cutoffs select smaller, higher-confidence sets.

diff --git src/hg/makeDb/trackDb/human/promoterAi.html src/hg/makeDb/trackDb/human/promoterAi.html
index 7403dadf942..fbf16e59af1 100644
--- src/hg/makeDb/trackDb/human/promoterAi.html
+++ src/hg/makeDb/trackDb/human/promoterAi.html
@@ -4,30 +4,38 @@ target="_blank">PromoterAI is a deep neural network from Illumina that predicts the expression-altering impact of single nucleotide variants in gene promoter regions. It scores all possible substitutions within 500 bp of annotated transcription start sites (TSS), covering approximately 39.5 million genomic positions across all protein-coding genes.
Scores range from -1 to 1. A negative score is a predicted decrease in expression of the target gene; a positive score is a predicted increase in expression. Scores near zero indicate the variant is predicted to leave expression unchanged. Variants at either end of the range (large |score|) are dysregulating and are the ones enriched among patients with rare disease in the PromoterAI paper.
++Illumina's PromoterAI +GitHub page recommends three tiered thresholds for interpretation: +|score| ≥ 0.1, |score| ≥ 0.2, and |score| ≥ 0.5. +Higher absolute thresholds select progressively smaller, higher-confidence sets of +predicted expression-altering variants. +
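As an illustration of how the tiered thresholds might be applied to a single score, here is a minimal Python sketch. The cutoffs (0.1, 0.2, 0.5) and the sign convention come from the text above; the function names `promoterai_tier` and `direction` are hypothetical and are not part of any Illumina or UCSC tool.

```python
# Cutoffs quoted from the PromoterAI GitHub README, strictest first.
THRESHOLDS = (0.5, 0.2, 0.1)


def promoterai_tier(score):
    """Return the strictest cutoff that |score| meets, or None if below 0.1.

    Hypothetical helper for illustration only.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("PromoterAI scores range from -1 to 1")
    magnitude = abs(score)
    for cutoff in THRESHOLDS:
        if magnitude >= cutoff:
            return cutoff
    return None


def direction(score):
    """Map the sign of a score to the predicted expression change."""
    if score > 0:
        return "increase"
    if score < 0:
        return "decrease"
    return "none"
```

For example, a score of -0.35 meets the 0.2 cutoff but not the 0.5 cutoff, and its sign indicates a predicted decrease in expression.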
+This track is a composite with four bigWig subtracks, one for each possible alternate allele (A, C, G, T). When zoomed in, the exact PromoterAI score for each possible mutation is shown on mouseover. At wider zooms multiple data points fall into a single pixel and averaging scores is not biologically meaningful, so the mouseover displays "zoom in to see values" until you zoom in far enough that individual values can be shown.
A fifth subtrack ("PromoterAI overlaps") shows positions where overlapping transcripts produce different scores for the same variant. At these positions, the bigWig subtracks show the score with the largest absolute value, while the overlap track lists every per-transcript score. About 3.8% of variant positions have @@ -37,41 +45,41 @@ off on the track configuration page.
Across all subtracks, coloring follows the direction of the predicted effect: red (bars above the zero line in the bigWigs, or filled boxes in the overlap subtrack) indicates predicted over-expression (positive score), and blue (bars below zero or filled boxes) indicates predicted under-expression (negative score).
The PromoterAI predictions are distributed by Illumina under a license that does not permit redistribution, so this track is not available for bulk download from UCSC and -is excluded from the Table Browser and public API. The original prediction files can -be obtained directly from Illumina via the license request on the +is excluded from the Table Browser and public API. The original prediction files are +available for academic and non-commercial research use directly from Illumina: +complete the license agreement linked from the PromoterAI GitHub -page, which links to the -Illumina -BaseSpace download. +page, and a download link is emailed after submission.
-The PromoterAI hg38 TSS-500 file was downloaded from Illumina BaseSpace. The file +The PromoterAI hg38 TSS-500 file was downloaded from Illumina via the PromoterAI +license agreement. The file contains pre-computed scores for all possible single nucleotide substitutions within 500 bp of annotated TSS positions. For positions covered by multiple transcripts, the score with the largest absolute value was used for the bigWig tracks. Positions where transcripts produced different scores (4.45M of 118.6M unique variants, 3.8%) were additionally written to a bigBed overlap track with per-transcript detail (transcript IDs, per-transcript scores, strand, and the maximum pairwise score difference). The conversion script is available from our GitHub.
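The collapsing step described in Methods can be sketched as follows. This is not the actual conversion script: the record layout `(chrom, pos, alt, transcript_id, score)` and the name `collapse_scores` are assumptions made for illustration, and ties in absolute value are resolved here by keeping the first record seen, which may differ from the real pipeline.

```python
from collections import defaultdict


def collapse_scores(records):
    """Collapse per-transcript scores to one value per variant.

    records: iterable of (chrom, pos, alt, transcript_id, score) tuples
    (an assumed layout, not the real input file format).

    Returns (bigwig, overlaps):
      bigwig   maps (chrom, pos, alt) to the score with the largest |value|
      overlaps maps (chrom, pos, alt) to per-transcript detail, only for
               variants whose transcripts disagree on the score
    """
    by_variant = defaultdict(list)
    for chrom, pos, alt, transcript_id, score in records:
        by_variant[(chrom, pos, alt)].append((transcript_id, score))

    bigwig = {}
    overlaps = {}
    for key, tx_scores in by_variant.items():
        scores = [score for _, score in tx_scores]
        # Keep the signed score whose absolute value is largest.
        bigwig[key] = max(scores, key=abs)
        if len(set(scores)) > 1:
            overlaps[key] = {
                "per_transcript": tx_scores,
                # The largest pairwise difference equals max minus min.
                "max_diff": max(scores) - min(scores),
            }
    return bigwig, overlaps
```

With two transcripts scoring the same variant at 0.5 and -0.25, the bigWig value would be 0.5 and the overlap entry would record a maximum pairwise difference of 0.75.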
Thanks to Kishore Jaganathan and colleagues at Illumina for making the PromoterAI predictions publicly available for academic and non-commercial research.