86744c40b7e7f18792d287aedf9cf5da543e2d5a
max
  Fri Apr 17 07:22:27 2026 -0700
Add GA4K (Genomic Answers for Kids) small-variant subtrack to the
Variant Frequencies supertrack for hg38.
#Preview2 week - bugs introduced now will need a build patch to fix

Children's Mercy pediatric rare-disease cohort: ~36.2M SNVs and short
indels from 552 PacBio HiFi long-read samples (DeepVariant/GLnexus),
filtered to variants replicated in >=2 unrelated GA4K individuals or
matched to an HPRC variant. Ref: Cohen et al. 2022, Genet Med,
PMID 35305867.
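
A possible pre-flight check for the concat step in the makedoc, as a
minimal sketch only. The helper names ga4k_inputs and ga4k_check_inputs
are hypothetical and not part of this commit; the directory and file-name
pattern are copied from the bcftools concat command in varFreqs.txt. The
sketch lists the 24 expected per-chromosome inputs in concat order and
reports any that are absent:

```shell
# Hypothetical helper (not part of the commit): print the per-chromosome
# input paths in the same order they are passed to bcftools concat.
ga4k_inputs() {
    dir="${1:-pacbio_snv_vcf}"
    for c in $(seq 1 22) X Y; do
        printf '%s/pb_joint_merged.snv.chr%s.vcf.gz\n' "$dir" "$c"
    done
}

# Hypothetical helper: report each expected input that is missing on stderr
# and return non-zero if any file is absent.
ga4k_check_inputs() {
    missing=0
    for f in $(ga4k_inputs "${1:-}"); do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Run from the GA4K directory, `ga4k_check_inputs` could gate the concat,
so an incomplete download fails before any output is written.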

refs #36642

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

diff --git src/hg/makeDb/doc/hg38/varFreqs.txt src/hg/makeDb/doc/hg38/varFreqs.txt
index 7de0e6a41e4..3051e5f2e12 100644
--- src/hg/makeDb/doc/hg38/varFreqs.txt
+++ src/hg/makeDb/doc/hg38/varFreqs.txt
@@ -1,15 +1,29 @@
+# Genomic Answers for Kids (GA4K), Children's Mercy - 2026-04-16 Claude max
+# GA4K is a pediatric rare-disease PacBio HiFi long-read cohort (Cohen et al.
+# 2022, Genet Med, PMID 35305867). The release ships 24 per-chromosome VCFs of
+# site-only small variants (SNVs and short indels), filtered to variants
+# replicated in >=2 unrelated GA4K individuals or matched to an HPRC variant.
+# Upstream data lives under /hive/data/genomes/hg38/bed/lrSv/GA4K (co-located
+# with the matched GA4K structural-variant release; see the lrSv makedoc).
+cd /hive/data/genomes/hg38/bed/lrSv/GA4K
+bcftools concat -Oz -o ga4kSnv.vcf.gz \
+    pacbio_snv_vcf/pb_joint_merged.snv.chr{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y}.vcf.gz
+tabix -p vcf ga4kSnv.vcf.gz
+# Symlinks to the files built above are placed under /gbdb/hg38/varFreqs/ga4k/
+# for the ga4kSnv stanza in trackDb/human/varFreqs.ra.
+
 # Mexico Biobank, Max, Nov 8 2025
 CrossMap.py vcf /gbdb/hg19/liftOver/hg19ToHg38.over.chain.gz /hive
 /data/genomes/hg19/bed/varFreqs/mexbb/MXBv2.vcf.gz /hive/data/genomes/hg38/p14Clean/hg38.p14.fa MXBv2.lift.hg19ToHg38.vcf && bgzip MXBv2.lift.hg19ToHg38.vcf && bcftools sort MXBv2.lift.hg19ToHg38.vcf -Oz -m 200G -T /data/tmp/ -o MXBv2.lift.hg19ToHg38.vcf.gz && tabix -p vcf MXBv2.lift.hg19ToHg38.vcf.gz
 
 # Mexico City Prospective study, Max Oct 28 2025
 cd /hive/data/genomes/hg38/bed/varFreqs/mcps/
 for i in `seq 1 22` X; do wget https://rgc-mcps.regeneron.com/downloads/20230130/chr$i.freq.vcf.gz; done
 for i in `seq 1 22` X; do wget https://rgc-mcps.regeneron.com/downloads/20230130/chr$i.freq.vcf.gz.tbi; done
 mv *vcf* vcf/
 bcftools concat  --threads 16  -Oz -o mcps.freq.vcf.gz vcf/chr{1..22}.freq.vcf.gz vcf/chrX.freq.vcf.gz
 # make normal AC and AF and AN fields for mouseovers
 zcat mcps.freq.vcf.gz | sed -e 's/_RAW//g' > mcps.fix.freq.vcf
 mv -f mcps.fix.freq.vcf mcps.freq.vcf
 bgzip mcps.freq.vcf
 tabix -p vcf mcps.freq.vcf.gz