50466766840ded6cb8bd5cb868bdf2ff3f613bc0
lrnassar
  Tue Apr 21 11:17:15 2026 -0700
QA fixes for PrimateAI-3D track.

Config (primateAi.ra):
- Fix broken Ensembl transcript linkout: $S in the urls setting
expanded to the chromosome name; switch to the Ensembl transcript page
using $$
- Add numeric filters on percentile and raw score (label notes the
paper's 0.821 clinical threshold)
- Add maxWindowToDraw 2000000

Data (primateAiToBigBed.py):
- Change hardcoded strand '+' to '.': the source file has no strand
column
- Accept input/output paths as CLI args (previously hardcoded the hg38
input path)
- Handle variable field count: ~2.4M rows in the hg19 source are
missing the refseq column
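
A minimal sketch of the CLI-argument and variable-field-count handling
described above. It assumes a tab-separated input where the refseq column
is the only one that may be absent; the column count, the refseq position,
and all function names are hypothetical and are not taken from
primateAiToBigBed.py:

```python
import argparse

# Hypothetical layout: the real column order in the source file may differ.
EXPECTED_FIELDS = 5   # assumed full column count
REFSEQ_INDEX = 3      # assumed position of the refseq column


def normalize_row(fields, expected=EXPECTED_FIELDS, refseq_index=REFSEQ_INDEX):
    """Return a row with exactly `expected` fields.

    A row that is one field short is assumed to be missing the refseq
    column, so an empty placeholder is inserted at that position.
    Any other field count is treated as an error.
    """
    if len(fields) == expected:
        return list(fields)
    if len(fields) == expected - 1:
        return list(fields[:refseq_index]) + [""] + list(fields[refseq_index:])
    raise ValueError(f"unexpected field count: {len(fields)}")


def parse_args(argv=None):
    """Take input and output paths from the command line instead of hardcoding."""
    parser = argparse.ArgumentParser(description="Sketch: paths as CLI args")
    parser.add_argument("input_path")
    parser.add_argument("output_path")
    return parser.parse_args(argv)
```

Padding the short rows keeps every downstream index stable, at the cost of
assuming that a short row is always missing that one column.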

Description (primateAi.html):
- Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way
track
- Regenerate the first reference via getTrackReferences (wrong article
number and wrong PMC ID in the previous text)
- Fix the GitHub URL for the conversion script in Methods
- Move the Zoonomia 447-way mention out of Description; rephrase the
license note to describe precisely what is disabled

relatedTracks.ra:
- Add reciprocal cross-links for primateAi <-> alphaMissense (hg38),
primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi
(hg38). Also includes promoterAi <-> alphaMissense cross-links.
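
Because relatedTracks.ra requires every entry to be reciprocal, a check
along the following lines could be run before committing. This is a
sketch only: it assumes the "ucscDb track trackLinkingTo reason" layout
stated in the file header, and the function name is hypothetical:

```python
def find_nonreciprocal(lines):
    """Return sorted (db, track, linked) triples that lack the reverse entry."""
    pairs = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # The first three whitespace-separated fields identify the link;
        # the remainder of the line is the free-text reason.
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        db, track, linked = parts[:3]
        pairs.add((db, track, linked))
    return sorted(p for p in pairs if (p[0], p[2], p[1]) not in pairs)
```

An empty result means every link in the input has a matching entry in the
opposite direction for the same database.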

refs #37274 #37279

diff --git src/hg/makeDb/trackDb/relatedTracks.ra src/hg/makeDb/trackDb/relatedTracks.ra
index 4930e91da8d..f0b122a368e 100644
--- src/hg/makeDb/trackDb/relatedTracks.ra
+++ src/hg/makeDb/trackDb/relatedTracks.ra
@@ -1,105 +1,120 @@
 # A space delimited file of track relatedness. All entries must be reciprocal. Format:
 # ucscDb track trackLinkingTo reason
 
 # hg38:
 hg38 knownGene knownGeneArchive View previous versions of GENCODE Genes
 hg38 knownGeneArchive knownGene View the latest GENCODE Genes version
 
 hg38 miRnaAtlas nonCodingRNAs View associated precursor miRnas
 hg38 nonCodingRNAs miRnaAtlas View expression of cleaved miRnas
 
 hg38 caddSuper gnomad View associated variants
 hg38 gnomad caddSuper View CADD scores for this variant and region
 
 hg38 constraintSuper gnomadPLI Predicted constraint metrics from gnomAD
 hg38 gnomadPLI constraintSuper Container track of various constraint scores
 
 hg38 gnomadStr strVar A collection of population-level STR variation tracks across the genome
 hg38 strVar gnomadStr Population-level STR variation across disease-associated loci from gnomAD v3.1.3
 
 hg38 revel liftHg38 Revel is based on hg19 and lifted to hg38. liftOver "chain" alignment from hg19 to hg38
 hg38 liftHg38 revel Revel scores were lifted using UCSC liftOver chains from hg38
 
 hg38 revel caddSuper CADD, a similar deleteriousness score, and not used as an input by REVEL
 hg38 caddSuper revel REVEL, a similar deleteriousness score
 
 hg38 liftHg19 grcIncidentDb GRC Incident database, to explore reasons why the assembly was changed
 hg38 grcIncidentDb liftHg19 LiftOver for hg38, explores how incident regions aligned between human assemblies
 
 hg38 ReMap liftHg19 NCBI ReMap, even though it has the same name, is a liftOver-like hg19/hg38 alignment, and unrelated to the ReMap database
 hg38 liftHg19 ReMap ReMap, even though it has the same name, is a database of transcription factor binding sites, unrelated to NCBI ReMap
 
 hg38 ReMap jaspar JASPAR is a database of predicted TF binding sites, based on short DNA matches. Unlike ReMap, the data is purely computational.
 
 hg38 jaspar ReMap ReMap is a database of TF binding sites inferred from ChIP-Seq Data. Unlike JASPAR predictions, these sites are supported by functional assay
 
 hg38 problematic mappability The mappability track contains regions where short sequencing reads are hard to align
 hg38 mappability problematic The problematic regions track contains various gene clusters and the ENCODE blacklist
 hg38 problematic grcIncidentDb The GRC (Genome Reference Consortium) incidents track contains regions that were flagged by the group that puts together the genome 
 hg38 grcIncidentDb problematic The problematic regions track lists unusual regions and the ones that often lead to artefacts when aligning reads to the reference genome
 
 hg38 phasedVars varFreqs The variant frequencies track contains projects where variant frequencies, aka allele frequencies, are publicly available.
 hg38 varFreqs phasedVars The phased variants track contains projects that provide haplotype-phased genotypes/variants.
 
 hg38 wgEncodeReg4 wgEncodeReg Previous ENCODE3 Regulation track
 hg38 wgEncodeReg wgEncodeReg4 New ENCODE4 Regulation track
 hg38 wgEncodeReg4 cCREs Related ENCODE4 cCRE annotations
 hg38 cCREs wgEncodeReg4 Related ENCODE4 regulation data
 
 hg38 avada varaico The AVADA track is no longer updated. See VARAICO for the latest variants mined from papers.
 hg38 varaico avada Previous literature mining track for variants extracted from publications. No longer updated.
 
 # hg19:
 hg19 caddSuper gnomad View associated variants
 hg19 gnomad caddSuper View CADD scores for this variant and region
 
 hg19 decipherHaploIns gnomadPLI Compare haploinsufficiency metrics as defined by gnomAD
 hg19 gnomadPLI decipherHaploIns Compare constraint metrics as defined by DECIPHER
 
 hg19 revel caddSuper CADD, a similar deleteriousness score, and not used as an input by REVEL
 hg19 caddSuper revel REVEL, a similar deleteriousness score
 
 hg19 liftHg38 grcIncidentDb GRC Incident database, to explore reasons why the assembly was changed
 hg19 grcIncidentDb liftHg38 LiftOver alignments between hg38 and hg38 to explore how the GRC incident assembly changes affect whole-genome alignments between hg19 and hg38 used for lifting data from hg19
 
 hg19 fixSeqLiftOverPsl liftHg38 Investigate how patches affect the whole-genome alignment used for liftOver
 hg19 liftHg38 fixSeqLiftOverPsl Investigate how assembly patches affect the liftOver alignment
 
 hg19 liftHg38 hg38ContigDiff Hg38 Diff shows contigs that were changed from hg19 to hg38
 hg19 hg38ContigDiff liftHg38 Investigate how contig changes affect the liftOver alignments
 
 hg19 jaspar ReMap ReMap is a database of TF binding sites inferred from ChIP-Seq Data. Unlike JASPAR predictions, these sites are supported by functional assay
 hg19 ReMap jaspar JASPAR is a database of predicted TF binding sites, based on short DNA matches. Unlike ReMap, the data is purely computational.
 
 hg19 ReMap liftHg38 NCBI ReMap, even though it has the same name, is a liftOver-like hg19/hg38 alignment, and unrelated to the ReMap database
 hg19 liftHg38 ReMap ReMap, even though it has the same name, is a database of transcription factor binding sites, unrelated to NCBI ReMap
 
 hg19 refSeqComposite pseudoYale60 NCBI RefSeq Curated and RefSeq Other contains pseudogenes, but the Yale annotation should be more comprehensive for this transcript type
 hg19 pseudoYale60 refSeqComposite NCBI RefSeq Curated and RefSeq Other also contain some transcribed and untranscribed pseudogenes, respectively.
 
 hg19 constraintSuper gnomadPLI Predicted constraint metrics from gnomAD
 hg19 gnomadPLI constraintSuper Container track of various constraint scores
 
 hg19 avada varaico The AVADA track is no longer updated. See VARAICO for the latest variants mined from papers.
 hg19 varaico avada Previous literature mining track for variants extracted from publications. No longer updated.
 
 # mm39:
 
 mm39 knownGene knownGeneArchive View previous versions of GENCODE Genes
 mm39 knownGeneArchive knownGene View the latest GENCODE Genes version
 
 # mm10 ENCODE4 Regulation:
 mm10 encode4Reg encode3Reg Previous ENCODE3 Regulation track
 mm10 encode3Reg encode4Reg New ENCODE4 Regulation track
 mm10 encode4Reg cCREs Related ENCODE4 cCRE annotations
 mm10 cCREs encode4Reg Related ENCODE4 regulation data
 
 # hg38 long-read SV supertrack cross-links to other SV resources:
 hg38 lrSv gnomadStructuralVariants Short-read structural variants from gnomAD v4.1
 hg38 gnomadStructuralVariants lrSv Long-read structural variants across multiple cohorts
 hg38 lrSv dbVarSv NCBI dbVar structural variants (short-read and long-read, germline and clinical)
 hg38 dbVarSv lrSv Long-read structural variants across multiple cohorts
 hg38 lrSv dgvPlus Database of Genomic Variants (DGV) structural variation catalog
 hg38 dgvPlus lrSv Long-read structural variants across multiple cohorts
 hg38 lrSv giabSv Genome in a Bottle high-confidence SV benchmark callsets
 hg38 giabSv lrSv Long-read structural variants across multiple cohorts
+
+# PrimateAI-3D cross-links:
+hg38 primateAi alphaMissense AlphaMissense, a similar deep-learning missense pathogenicity predictor
+hg38 alphaMissense primateAi PrimateAI-3D, a similar deep-learning missense pathogenicity predictor using primate variation
+hg38 primateAi revel REVEL, an ensemble missense pathogenicity score built from multiple predictors
+hg38 revel primateAi PrimateAI-3D, a missense pathogenicity predictor using primate variation and 3D protein structure
+
+hg19 primateAi revel REVEL, an ensemble missense pathogenicity score built from multiple predictors
+hg19 revel primateAi PrimateAI-3D, a missense pathogenicity predictor using primate variation and 3D protein structure
+
+# PromoterAI cross-links:
+hg38 promoterAi primateAi PrimateAI-3D, a companion deep-learning model from Illumina for coding (missense) variants
+hg38 primateAi promoterAi PromoterAI, a companion deep-learning model from Illumina for non-coding promoter variants
+hg38 promoterAi alphaMissense AlphaMissense, a deep-learning predictor of missense (coding) variant pathogenicity
+hg38 alphaMissense promoterAi PromoterAI, a deep-learning predictor of expression-altering variants in promoter regions