695f40f9d6139a4df393522c067f1702aff8d3bd
max
  Wed Apr 22 03:13:39 2026 -0700
varFreqs: add SVatalog 101 short-read SNV frequencies subtrack

SNV/indel allele frequencies from the 101-sample GWAS SVatalog cohort
(Chirmade et al. 2026, Heredity, PMID 41203876), called from 10X
Genomics linked short-read WGS with GATK HaplotypeCaller v4.0.0.0 and
phased with SHAPEIT v4.2.0. Sibling of the lrSv chirmade101Sv
structural-variant track, which is built from the same 101 samples.

8,814,835 autosomal + chrX sites. Source release ships only AF; AC and
AN are synthesized in the emitted VCF as AC=round(AF*202) and AN=202
(2*101 diploid), with the gnomAD v3.1 non-Finnish European AF and dbSNP
rsID passed through as GNOMAD_NFE_AF and RSID info fields. VCF is
bgzipped + tabix-indexed (172 MB + 1.6 MB .tbi).
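
The AC/AN synthesis described above can be sketched as follows. This is a minimal illustration, not the code in svatalogFreqToVcf.py: the function names and the INFO formatting are hypothetical, and only the formulas AC=round(AF*202), AN=202 and the GNOMAD_NFE_AF / RSID field names come from this commit message.

```python
from typing import Optional, Tuple

N_SAMPLES = 101  # cohort size stated in the commit message


def synthesize_counts(af: float, n_samples: int = N_SAMPLES) -> Tuple[int, int]:
    """Return (AC, AN) derived from an allele frequency alone.

    AN assumes every sample is diploid at every site (2 * 101 = 202).
    Note: Python's round() rounds exact halves to the nearest even
    integer; the real script may round differently.
    """
    an = 2 * n_samples
    ac = round(af * an)
    return ac, an


def format_info(af: float, gnomad_nfe_af: Optional[float], rsid: Optional[str]) -> str:
    """Build a VCF INFO string (hypothetical layout and field order)."""
    ac, an = synthesize_counts(af)
    fields = [f"AC={ac}", f"AN={an}", f"AF={af:g}"]
    if gnomad_nfe_af is not None:
        fields.append(f"GNOMAD_NFE_AF={gnomad_nfe_af:g}")
    if rsid:
        fields.append(f"RSID={rsid}")
    return ";".join(fields)
```

Because AN is fixed at 202, the synthesized AC does not reflect per-site missingness (and on chrX assumes diploid calls for all samples), so it is an approximation of the true allele count.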

Files:
- scripts/varFreqs/svatalogFreqToVcf.py (new): per-chrom allele-freq
TSV -> single VCF with hg38 ##contig header
- trackDb/human/varFreqs.ra: new svatalogSnv vcfTabix subtrack
- trackDb/human/svatalogSnv.html (new): doc page
- trackDb/human/varFreqs.html: new row in Available Datasets table
- doc/hg38/varFreqs.txt: build block with no wget step (input files
were downloaded manually from Zenodo 13367574)

Note: the All Databases Combined varFreqs bigBed has NOT been rebuilt
to include this new source yet; a subsequent merge pass will add it.

refs #36258

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

diff --git src/hg/makeDb/trackDb/human/varFreqs.ra src/hg/makeDb/trackDb/human/varFreqs.ra
index e336ad0a36c..62550f60d7b 100644
--- src/hg/makeDb/trackDb/human/varFreqs.ra
+++ src/hg/makeDb/trackDb/human/varFreqs.ra
@@ -478,15 +478,24 @@
         parent varFreqs on
         bigDataUrl /gbdb/$D/varFreqs/ga4k/ga4kSnv.vcf.gz
         visibility pack
         priority 9
 
         track colorsDbSnv
         shortLabel CoLoRSdb 1,027 LR SNVs/indels
         longLabel Variant Frequencies: CoLoRSdb v1.2.0 - 1,027 long-read PacBio HiFi WGS, DeepVariant+GLnexus joint-called SNVs and small indels
         type vcfTabix
         parent varFreqs on
         bigDataUrl /gbdb/$D/varFreqs/colorsDb/colorsDbSnv.vcf.gz
         visibility pack
         dataVersion v1.2.0
         priority 9.5
 
+        track svatalogSnv
+        shortLabel SVatalog 101 WGS
+        longLabel Variant Frequencies: GWAS SVatalog SNVs/indels from 101 Samples (Chirmade 2026, 10X Genomics linked short-reads)
+        type vcfTabix
+        parent varFreqs on
+        bigDataUrl /gbdb/$D/varFreqs/svatalog/svatalog.vcf.gz
+        visibility pack
+        priority 10
+