50466766840ded6cb8bd5cb868bdf2ff3f613bc0 lrnassar Tue Apr 21 11:17:15 2026 -0700

QA fixes for PrimateAI-3D track.

Config (primateAi.ra):
- Fix broken Ensembl transcript linkout: urls $S expanded to chromosome name; switch to the Ensembl transcript page with $$
- Add numeric filters on percentile and raw score (label notes the paper's 0.821 clinical threshold)
- Add maxWindowToDraw 2000000

Data (primateAiToBigBed.py):
- Change hardcoded strand '+' to '.': the source file has no strand column
- Accept input/output paths as CLI args (previously hardcoded the hg38 input path)
- Handle variable field count: ~2.4M rows in the hg19 source are missing the refseq column

Description (primateAi.html):
- Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way track
- Regenerate the first reference via getTrackReferences (wrong article number and wrong PMC ID in the previous text)
- Fix the GitHub URL for the conversion script in Methods
- Move the Zoonomia 447-way mention out of Description; rephrase the license note to describe precisely what is disabled

relatedTracks.ra:
- Add reciprocal cross-links for primateAi <-> alphaMissense (hg38), primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi (hg38). Also includes promoterAi <-> alphaMissense cross-links.

refs #37274 #37279

diff --git src/hg/makeDb/trackDb/relatedTracks.ra src/hg/makeDb/trackDb/relatedTracks.ra
index 4930e91da8d..f0b122a368e 100644
--- src/hg/makeDb/trackDb/relatedTracks.ra
+++ src/hg/makeDb/trackDb/relatedTracks.ra
@@ -1,105 +1,120 @@
 # A space delimited file of track relatedness. All entries must be reciprocal. Format:
 # ucscDb track trackLinkingTo reason

 # hg38:
 hg38 knownGene knownGeneArchive View previous versions of GENCODE Genes
 hg38 knownGeneArchive knownGene View the latest GENCODE Genes version
 hg38 miRnaAtlas nonCodingRNAs View associated precursor miRnas
 hg38 nonCodingRNAs miRnaAtlas View expression of cleaved miRnas
 hg38 caddSuper gnomad View associated variants
 hg38 gnomad caddSuper View CADD scores for this variant and region
 hg38 constraintSuper gnomadPLI Predicted constraint metrics from gnomAD
 hg38 gnomadPLI constraintSuper Container track of various constraint scores
 hg38 gnomadStr strVar A collection of population-level STR variation tracks across the genome
 hg38 strVar gnomadStr Population-level STR variation across disease-associated loci from gnomAD v3.1.3
 hg38 revel liftHg38 Revel is based on hg19 and lifted to hg38. liftOver "chain" alignment from hg19 to hg38
 hg38 liftHg38 revel Revel scores were lifted using UCSC liftOver chains from hg38
 hg38 revel caddSuper CADD, a similar deleteriousness score, and not used as an input by REVEL
 hg38 caddSuper revel REVEL, a similar deleteriousness score
 hg38 liftHg19 grcIncidentDb GRC Incident database, to explore reasons why the assembly was changed
 hg38 grcIncidentDb liftHg19 LiftOver for hg38, explores how incident regions aligned between human assemblies
 hg38 ReMap liftHg19 NCBI ReMap, even though it has the same name, is a liftOver-like hg19/hg38 alignment, and unrelated to the ReMap database
 hg38 liftHg19 ReMap ReMap, even though it has the same name, is a database of transcription factor binding sites, unrelated to NCBI ReMap
 hg38 ReMap jaspar JASPAR is a database of predicted TF binding sites, based on short DNA matches. Unlike ReMap, the data is purely computational.
 hg38 jaspar ReMap ReMap is a database of TF binding sites inferred from ChIP-Seq Data. Unlike JASPAR predictions, these sites are supported by functional assay
 hg38 problematic mappability The mappability track contains regions where short sequencing reads are hard to align
 hg38 mappability problematic The problematic regions track contains various gene clusters and the ENCODE blacklist
 hg38 problematic grcIncidentDb The GRC (Genome Reference Consortium) incidents track contains regions that were flagged by the group that puts together the genome
 hg38 grcIncidentDb problematic The problematic regions track lists unusual regions and the ones that often lead to artefacts when aligning reads to the reference genome
 hg38 phasedVars varFreqs The variant frequencies track contains projects where variant frequencies, aka allele frequencies, are publicly available.
 hg38 varFreqs phasedVars The phased variants track contains projects that provide haplotype-phased genotypes/variants.
 hg38 wgEncodeReg4 wgEncodeReg Previous ENCODE3 Regulation track
 hg38 wgEncodeReg wgEncodeReg4 New ENCODE4 Regulation track
 hg38 wgEncodeReg4 cCREs Related ENCODE4 cCRE annotations
 hg38 cCREs wgEncodeReg4 Related ENCODE4 regulation data
 hg38 avada varaico The AVADA track is no longer updated. See VARAICO for the latest variants mined from papers.
 hg38 varaico avada Previous literature mining track for variants extracted from publications. No longer updated.

 # hg19:
 hg19 caddSuper gnomad View associated variants
 hg19 gnomad caddSuper View CADD scores for this variant and region
 hg19 decipherHaploIns gnomadPLI Compare haploinsufficiency metrics as defined by gnomAD
 hg19 gnomadPLI decipherHaploIns Compare constraint metrics as defined by DECIPHER
 hg19 revel caddSuper CADD, a similar deleteriousness score, and not used as an input by REVEL
 hg19 caddSuper revel REVEL, a similar deleteriousness score
 hg19 liftHg38 grcIncidentDb GRC Incident database, to explore reasons why the assembly was changed
 hg19 grcIncidentDb liftHg38 LiftOver alignments between hg38 and hg38 to explore how the GRC incident assembly changes affect whole-genome alignments between hg19 and hg38 used for lifting data from hg19
 hg19 fixSeqLiftOverPsl liftHg38 Investigate how patches affect the whole-genome alignment used for liftOver
 hg19 liftHg38 fixSeqLiftOverPsl Investigate how assembly patches affect the liftOver alignment
 hg19 liftHg38 hg38ContigDiff Hg38 Diff shows contigs that were changed from hg19 to hg38
 hg19 hg38ContigDiff liftHg38 Investigate how contig changes affect the liftOver alignments
 hg19 jaspar ReMap ReMap is a database of TF binding sites inferred from ChIP-Seq Data. Unlike JASPAR predictions, these sites are supported by functional assay
 hg19 ReMap jaspar JASPAR is a database of predicted TF binding sites, based on short DNA matches. Unlike ReMap, the data is purely computational.
 hg19 ReMap liftHg38 NCBI ReMap, even though it has the same name, is a liftOver-like hg19/hg38 alignment, and unrelated to the ReMap database
 hg19 liftHg38 ReMap ReMap, even though it has the same name, is a database of transcription factor binding sites, unrelated to NCBI ReMap
 hg19 refSeqComposite pseudoYale60 NCBI RefSeq Curated and RefSeq Other contains pseudogenes, but the Yale annotation should be more comprehensive for this transcript type
 hg19 pseudoYale60 refSeqComposite NCBI RefSeq Curated and RefSeq Other also contain some transcribed and untranscribed pseudogenes, respectively.
 hg19 constraintSuper gnomadPLI Predicted constraint metrics from gnomAD
 hg19 gnomadPLI constraintSuper Container track of various constraint scores
 hg19 avada varaico The AVADA track is no longer updated. See VARAICO for the latest variants mined from papers.
 hg19 varaico avada Previous literature mining track for variants extracted from publications. No longer updated.

 # mm39:
 mm39 knownGene knownGeneArchive View previous versions of GENCODE Genes
 mm39 knownGeneArchive knownGene View the latest GENCODE Genes version

 # mm10 ENCODE4 Regulation:
 mm10 encode4Reg encode3Reg Previous ENCODE3 Regulation track
 mm10 encode3Reg encode4Reg New ENCODE4 Regulation track
 mm10 encode4Reg cCREs Related ENCODE4 cCRE annotations
 mm10 cCREs encode4Reg Related ENCODE4 regulation data

 # hg38 long-read SV supertrack cross-links to other SV resources:
 hg38 lrSv gnomadStructuralVariants Short-read structural variants from gnomAD v4.1
 hg38 gnomadStructuralVariants lrSv Long-read structural variants across multiple cohorts
 hg38 lrSv dbVarSv NCBI dbVar structural variants (short-read and long-read, germline and clinical)
 hg38 dbVarSv lrSv Long-read structural variants across multiple cohorts
 hg38 lrSv dgvPlus Database of Genomic Variants (DGV) structural variation catalog
 hg38 dgvPlus lrSv Long-read structural variants across multiple cohorts
 hg38 lrSv giabSv Genome in a Bottle high-confidence SV benchmark callsets
 hg38 giabSv lrSv Long-read structural variants across multiple cohorts
+
+# PrimateAI-3D cross-links:
+hg38 primateAi alphaMissense AlphaMissense, a similar deep-learning missense pathogenicity predictor
+hg38 alphaMissense primateAi PrimateAI-3D, a similar deep-learning missense pathogenicity predictor using primate variation
+hg38 primateAi revel REVEL, an ensemble missense pathogenicity score built from multiple predictors
+hg38 revel primateAi PrimateAI-3D, a missense pathogenicity predictor using primate variation and 3D protein structure
+
+hg19 primateAi revel REVEL, an ensemble missense pathogenicity score built from multiple predictors
+hg19 revel primateAi PrimateAI-3D, a missense pathogenicity predictor using primate variation and 3D protein structure
+
+# PromoterAI cross-links:
+hg38 promoterAi primateAi PrimateAI-3D, a companion deep-learning model from Illumina for coding (missense) variants
+hg38 primateAi promoterAi PromoterAI, a companion deep-learning model from Illumina for non-coding promoter variants
+hg38 promoterAi alphaMissense AlphaMissense, a deep-learning predictor of missense (coding) variant pathogenicity
+hg38 alphaMissense promoterAi PromoterAI, a deep-learning predictor of expression-altering variants in promoter regions
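
Reviewer note: the header comment of relatedTracks.ra says that all entries must be reciprocal. The sketch below shows one way that rule could be checked for the entries added here. It is not part of this commit or of the kent tree. The function names `parse_related_tracks` and `find_missing_reciprocals` are hypothetical, and the parsing assumes each entry is whitespace-delimited with the database, track, and linked track as its first three fields, as the header's `ucscDb track trackLinkingTo reason` format suggests.

```python
from typing import Iterable, List, Set, Tuple

# (ucscDb, track, trackLinkingTo); the free-text reason is ignored.
Link = Tuple[str, str, str]


def parse_related_tracks(lines: Iterable[str]) -> Set[Link]:
    """Collect (db, track, linked track) triples, skipping comments and blanks."""
    links: Set[Link] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split off at most three fields; the remainder is the reason text.
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # Assumption: malformed short lines are silently skipped.
        links.add((fields[0], fields[1], fields[2]))
    return links


def find_missing_reciprocals(links: Set[Link]) -> List[Link]:
    """Return the reverse links that would be needed but are absent."""
    return sorted(
        (db, linked, track)
        for (db, track, linked) in links
        if (db, linked, track) not in links
    )


# Entries copied from the PrimateAI-3D block added in this diff.
sample = [
    "# PrimateAI-3D cross-links:",
    "hg38 primateAi alphaMissense AlphaMissense, a similar deep-learning missense pathogenicity predictor",
    "hg38 alphaMissense primateAi PrimateAI-3D, a similar deep-learning missense pathogenicity predictor using primate variation",
    "hg38 primateAi revel REVEL, an ensemble missense pathogenicity score built from multiple predictors",
    "hg38 revel primateAi PrimateAI-3D, a missense pathogenicity predictor using primate variation and 3D protein structure",
    "",
    "hg19 primateAi revel REVEL, an ensemble missense pathogenicity score built from multiple predictors",
    "hg19 revel primateAi PrimateAI-3D, a missense pathogenicity predictor using primate variation and 3D protein structure",
]

missing = find_missing_reciprocals(parse_related_tracks(sample))

# Dropping the last entry leaves the hg19 primateAi -> revel link one-sided.
missing_when_broken = find_missing_reciprocals(parse_related_tracks(sample[:-1]))
```

With the entries as committed, `missing` is empty. Removing one side of a pair makes the absent reverse link show up in the result, which is the condition the header comment forbids.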