ec5c73f4dc3ef4beae16fa1c12b7e5bf872bb73d lrnassar Tue May 5 15:04:39 2026 -0700

varFreqs: fix gaspIndel bigDataUrl after Max's GenomeAsia hg38 lift; add Tishkoff180 to combined-track filter UI; sync databases.tsv with deployed bigBed; minor description-page corrections. refs #36642

GenomeAsia hg38 lift (May 5 2026, by Max):
- gaspIndel.bigDataUrl was pointing at the old GRCh37 filename "All.indels.annot.cont_withmaf.vcf.gz" which was renamed to "ga100k.indels.vcf.gz" during the lift; this left the gaspIndel track broken on the sandbox until the trackdb stanza was updated to match.
- gasp/gaspIndel dataVersion strings updated from "Pilot 2019 (GRCh37 - to be lifted)" to "Pilot 2019 (lifted to hg38, May 2026)".
- databases.tsv: also updated GenomeAsiaIndel path to ga100k.indels.vcf.gz so the next varFreqsAll rebuild reads from the lifted file.

Tishkoff180 in varFreqsAll.bb but unfilterable (fresh-eyes audit finding):
- Added Tishkoff180 to filterValues.sources and added filterByRange.Tishkoff180AF / Tishkoff180AC entries.
- Added Tishkoff180 (and SVatalog) rows to databases.tsv to match the deployed bigBed (which already has those columns).

Description-page corrections:
- varFreqsAll.html: "20 population databases" -> "25 source databases" (matches actual count); HGDP+1kG bullet "European" -> "Non-Finnish European" to disambiguate from Finnish (gnomAD's nfe).
- varFreqs.html: GenomeAsia row in the Available Datasets table updated from 3 to 7 sub-populations (NEA/SEA/SAS plus the previously hidden OCE/AMR/AFR/WER) so the table matches what the data exposes once Max's rebuild populates the new filter columns.
- KOVA longLabel: "1.9k WGS+3.5k WES" -> "1.9k WGS+3.4k WES" (3.4k is correct per Lee 2017 and kova.html).

diff --git src/hg/makeDb/trackDb/human/varFreqs.ra src/hg/makeDb/trackDb/human/varFreqs.ra
index 339c35c94f8..571fb09828b 100644
--- src/hg/makeDb/trackDb/human/varFreqs.ra
+++ src/hg/makeDb/trackDb/human/varFreqs.ra
@@ -11,31 +11,31 @@
 longLabel Variant Frequencies: All Databases Combined with Consequence Annotations
 type bigBed 9 +
 parent varFreqs on
 bigDataUrl /gbdb/$D/varFreqs/varFreqsAll.bb
 visibility pack
 itemRgb on
 maxWindowToDraw 5000000
 priority 0.1
 #mouseOver $aaChange $dnaChange
 # Variant type and consequence filters
 filterValues.varType SNV|SNV,INS|Insertion,DEL|Deletion,MNV|MNV
 filterLabel.varType Variant Type
 filterValues.consequence missense|Missense,synonymous|Synonymous,stop_gained|Stop Gained,frameshift|Frameshift,splice_donor|Splice Donor,splice_acceptor|Splice Acceptor,intron|Intron,.|Intergenic
 filterLabel.consequence Consequence
 # Source database filter
-filterValues.sources AllOfUs|AllOfUs,SPARK|SPARK WES,SFARI_WGS|SFARI WGS,GenomeAsia|GenomeAsia SNVs,GenomeAsiaIndel|GenomeAsia Indels,KOVA|KOVA Korea,ToMMo|ToMMo Japan,IndiGen|IndiGenomes India,FinnGen|FinnGen Finland,Saudi|Saudi,SweGen|SweGen Sweden,TOPMed|TOPMed,ABraOM|ABraOM Brazil,ALFA|ALFA,MGRB|MGRB Australia,HRC|HRC,MexBB|Mexico Biobank,SGDP|SGDP,HGDP1kG|gnomAD HGDP+1kG,GREGoR|GREGoR,SCHEMA|SCHEMA,GA4K|GA4K PacBio LR,CoLoRSdb|CoLoRSdb PacBio LR,SVatalog|SVatalog 101 10XG SR
+filterValues.sources AllOfUs|AllOfUs,SPARK|SPARK WES,SFARI_WGS|SFARI WGS,GenomeAsia|GenomeAsia SNVs,GenomeAsiaIndel|GenomeAsia Indels,KOVA|KOVA Korea,ToMMo|ToMMo Japan,IndiGen|IndiGenomes India,FinnGen|FinnGen Finland,Saudi|Saudi,SweGen|SweGen Sweden,TOPMed|TOPMed,ABraOM|ABraOM Brazil,ALFA|ALFA,MGRB|MGRB Australia,HRC|HRC,MexBB|Mexico Biobank,SGDP|SGDP,HGDP1kG|gnomAD HGDP+1kG,GREGoR|GREGoR,SCHEMA|SCHEMA,GA4K|GA4K PacBio LR,CoLoRSdb|CoLoRSdb PacBio LR,SVatalog|SVatalog 101 10XG SR,Tishkoff180|Tishkoff 180 African WGS
 filterType.sources multipleListOr
 filterLabel.sources Source Database
 # Length filters
 filterByRange.refLen on
 filterLabel.refLen Reference Length
 filterByRange.altLen on
 filterLabel.altLen Alternate Length
 filterByRange.varLen on
 filterLabel.varLen Length Change
 # Max AF filter
 filterByRange.maxAF on
 filterLabel.maxAF Max Allele Frequency
 filterLimits.maxAF 0:1
 # Total AC filter
 filterByRange.totalAC on
@@ -77,30 +77,32 @@
 filterLabel.MexBBAF Mexico Biobank AF
 filterByRange.SGDPAF on
 filterLabel.SGDPAF SGDP AF
 filterByRange.HGDP1kGAF on
 filterLabel.HGDP1kGAF gnomAD HGDP+1kG AF (4k cohort)
 filterByRange.GREGoRAF on
 filterLabel.GREGoRAF GREGoR AF
 filterByRange.SCHEMAAF on
 filterLabel.SCHEMAAF SCHEMA AF
 filterByRange.GA4KAF on
 filterLabel.GA4KAF GA4K PacBio LR AF
 filterByRange.CoLoRSdbAF on
 filterLabel.CoLoRSdbAF CoLoRSdb PacBio LR AF
 filterByRange.SVatalogAF on
 filterLabel.SVatalogAF SVatalog 101 10XG SR AF
+filterByRange.Tishkoff180AF on
+filterLabel.Tishkoff180AF Tishkoff 180 African WGS AF
 # Per-database AC filters
 filterByRange.AllOfUsAC on
 filterLabel.AllOfUsAC AllOfUs AC
 filterByRange.SPARKAC on
 filterLabel.SPARKAC SPARK WES AC
 filterByRange.SFARI_WGSAC on
 filterLabel.SFARI_WGSAC SFARI WGS AC
 filterByRange.GenomeAsiaAC on
 filterLabel.GenomeAsiaAC GenomeAsia SNVs AC
 filterByRange.GenomeAsiaIndelAC on
 filterLabel.GenomeAsiaIndelAC GenomeAsia Indels AC
 filterByRange.KOVAAC on
 filterLabel.KOVAAC KOVA Korea AC
 filterByRange.ToMMoAC on
 filterLabel.ToMMoAC ToMMo Japan AC
@@ -126,30 +128,32 @@
 filterLabel.MexBBAC Mexico Biobank AC
 filterByRange.SGDPAC on
 filterLabel.SGDPAC SGDP AC
 filterByRange.HGDP1kGAC on
 filterLabel.HGDP1kGAC gnomAD HGDP+1kG AC (4k cohort)
 filterByRange.GREGoRAC on
 filterLabel.GREGoRAC GREGoR AC
 filterByRange.SCHEMAAC on
 filterLabel.SCHEMAAC SCHEMA AC
 filterByRange.GA4KAC on
 filterLabel.GA4KAC GA4K PacBio LR AC
 filterByRange.CoLoRSdbAC on
 filterLabel.CoLoRSdbAC CoLoRSdb PacBio LR AC
 filterByRange.SVatalogAC on
 filterLabel.SVatalogAC SVatalog 101 10XG SR AC
+filterByRange.Tishkoff180AC on
+filterLabel.Tishkoff180AC Tishkoff 180 African WGS AC
 # Population-specific AF filters
 # AllOfUs local-ancestry populations
 # NB: these are local-ancestry-stratified frequencies (per-position, per-haplotype-class),
 # NOT the AllOfUs paper's global Rye ancestry categories. See varFreqs.html for details.
 filterByRange.AllOfUsAF_AFR on
 filterLabel.AllOfUsAF_AFR AllOfUs African AF (local ancestry)
 filterByRange.AllOfUsAF_AMR on
 filterLabel.AllOfUsAF_AMR AllOfUs Indigenous American AF (local ancestry)
 filterByRange.AllOfUsAF_EAS on
 filterLabel.AllOfUsAF_EAS AllOfUs East Asian AF (local ancestry)
 filterByRange.AllOfUsAF_EUR on
 filterLabel.AllOfUsAF_EUR AllOfUs European AF (local ancestry)
 filterByRange.AllOfUsAF_OCE on
 filterLabel.AllOfUsAF_OCE AllOfUs Oceanian AF (local ancestry)
 filterByRange.AllOfUsAF_SAS on
@@ -367,63 +371,63 @@
 type vcfTabix
 parent varFreqs on
 bigDataUrl /gbdb/$D/varFreqs/mgrb/MGRB.phase3.GRCh38.norm.vcf.gz
 dataVersion Phase 3
 visibility dense
 # no downloads as per Matt Hobbs email Jan 28 2026
 tableBrowser off
 
 track gasp
 shortLabel GenomeAsia 1.7k SNVs
 longLabel Variant Frequencies: GenomeAsia Pilot - Substitutions
 type vcfTabix
 parent varFreqs on
 bigDataUrl /gbdb/$D/varFreqs/ga100k/ga100k.subst.vcf.gz
 visibility dense
-dataVersion Pilot 2019 (GRCh37 - to be lifted)
+dataVersion Pilot 2019 (lifted to hg38, May 2026)
 
 track gaspIndel
 shortLabel GenomeAsia 1.7k Indels
 longLabel Variant Frequencies: GenomeAsia Pilot - Indels
 type vcfTabix
 parent varFreqs on
-bigDataUrl /gbdb/$D/varFreqs/ga100k/All.indels.annot.cont_withmaf.vcf.gz
+bigDataUrl /gbdb/$D/varFreqs/ga100k/ga100k.indels.vcf.gz
 visibility dense
-dataVersion Pilot 2019 (GRCh37 - to be lifted)
+dataVersion Pilot 2019 (lifted to hg38, May 2026)
 html gasp
 
 track abraom
 shortLabel Brazil ABraOM 1k WGS
 longLabel Variant Frequencies: ABraOM Brazil - 1,171 unrelated individuals
 type vcfTabix
 parent varFreqs on
 bigDataUrl /gbdb/$D/varFreqs/abraom/abraom.vcf.gz
 visibility dense
 dataVersion SABE-WGS-1171 Sep 2020
 
 track indigenomes
 shortLabel India IndiGenomes 1k WGS
 longLabel Variant Frequencies: IndiGenomes India - 1,029 samples
 type vcfTabix
 parent varFreqs on
 bigDataUrl /gbdb/$D/varFreqs/indigenomes/IndiGenomes_Variants.vcf.gz
 visibility dense
 dataVersion IndiGen pilot (Jain 2021)
 
 track kova
 shortLabel Korea KOVA 5.3k mixed
-longLabel Variant Frequencies: KOVA Korea - 5305 samples, 1.9k WGS+3.5k WES
+longLabel Variant Frequencies: KOVA Korea - 5305 samples, 1.9k WGS+3.4k WES
 type vcfTabix
 parent varFreqs on
 bigDataUrl /gbdb/$D/varFreqs/kova/kova.v7.vcf.gz
 visibility dense
 tableBrowser off
 dataVersion V7
 
 track npm
 shortLabel Singapore NPM 9.7k WGS
 longLabel Variant Frequencies: NPM Singapore - 9,770 WGS samples
 type vcfTabix
 parent varFreqs on
 bigDataUrl /gbdb/$D/varFreqs/npm/SG10K_Health_r5.3.2.sites.vcf.bgz
 visibility dense
 tableBrowser off
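
Reviewer note: the Tishkoff180 audit finding (a source present in the data but with no filter entries) is the kind of drift a small consistency check might catch. The sketch below is a hypothetical helper, not part of the kent tree. It assumes the convention visible in this diff, namely that every key in filterValues.sources has matching filterByRange.<key>AF and filterByRange.<key>AC settings in the same stanza. The function names and the sample stanza text are illustrative only.

```python
def source_keys(stanza: str) -> list[str]:
    """Return the keys listed in the stanza's filterValues.sources setting."""
    for line in stanza.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0] == "filterValues.sources":
            # Entries are comma-separated; each entry is key|Label.
            return [entry.split("|", 1)[0] for entry in parts[1].split(",")]
    return []


def missing_range_filters(stanza: str) -> list[str]:
    """List the filterByRange.<key>AF/AC settings that the stanza lacks."""
    # First whitespace-delimited token of every non-empty line is the setting name.
    settings = {line.split()[0] for line in stanza.splitlines() if line.strip()}
    missing = []
    for key in source_keys(stanza):
        for suffix in ("AF", "AC"):
            name = f"filterByRange.{key}{suffix}"
            if name not in settings:
                missing.append(name)
    return missing


# Illustrative stanza fragment (assumption): Tishkoff180 lacks its AC filter.
EXAMPLE = """\
filterValues.sources SVatalog|SVatalog 101 10XG SR,Tishkoff180|Tishkoff 180 African WGS
filterByRange.SVatalogAF on
filterByRange.SVatalogAC on
filterByRange.Tishkoff180AF on
"""

print(missing_range_filters(EXAMPLE))
```

Run against the real varFreqsAll stanza text, an empty result would suggest the source list and the per-database range filters are in sync, provided the naming convention assumed above actually holds for every source.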