64a3f9e7813e823cf724ea188c3928a911578286
max
  Thu Jun 4 00:32:22 2026 -0700
varFreqs: replace All Databases Combined with two phenotype-split tracks

Replace the single varFreqsAll combined track (and drop the varFreqsDisease
track) with two matched tracks for visual case-vs-background comparison:
  varFreqsAffected   - variants seen in the affected/case arms of disease
                       cohorts (SFARI SPARK WES/WGS ASD probands, SCHEMA cases,
                       GREGoR affected, GA4K); ~130,000 individuals
  varFreqsBackground - population reference cohorts + the unaffected/control
                       arms of disease cohorts ("all other variants");
                       ~1.5 million individuals
A variant seen in both groups appears in both tracks. Genotyping-array cohorts
stay out of both (varFreqsArray unchanged).

vcfToBigBed.py gains --split-affected to emit both tracks in one pass; it reads
phenotype tags (affected/unaffected/unknown) from populations.tsv and
is_disease/disease_role from databases.tsv, and derives the length-filter
ranges from the observed data. TOPMed reclassified as a population cohort.
SPARK WGS display name changed to SFARI SPARK WGS for consistency with the
standalone subtracks. Fixed the trackDb mouseOver $-substitution prefix
collision by wrapping fields in ${}. New description pages for both tracks.
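
For reviewers, a minimal sketch of how the two new databases.tsv columns could
drive the affected/background split. This is not the vcfToBigBed.py code: the
names Database, parse_databases and target_track are hypothetical, and leaving
the "unknown" phenotype of a disease cohort unrouted is an assumption, since
this message does not say where those counts go.

```python
from dataclasses import dataclass


@dataclass
class Database:
    key: str
    name: str
    vcf: str
    ac_field: str
    af_field: str
    is_disease: bool
    disease_role: str  # "affected", or "" when the column is blank/absent


def parse_databases(text):
    """Parse databases.tsv-style text; '#' lines and blank lines are skipped."""
    dbs = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # disease_role (and, in older files, is_disease) may be missing.
        cols += [""] * (7 - len(cols))
        key, name, vcf, ac, af, is_disease, role = cols[:7]
        dbs.append(Database(key, name, vcf, ac, af,
                            is_disease.strip() == "1", role.strip()))
    return dbs


def target_track(db, phenotype=""):
    """Pick a track for one cohort or one population of a cohort.

    phenotype stands in for the populations.tsv tag
    (affected/unaffected/unknown).
    """
    if not db.is_disease:
        return "varFreqsBackground"
    if db.disease_role == "affected" or phenotype == "affected":
        return "varFreqsAffected"
    if phenotype == "unaffected":
        return "varFreqsBackground"
    return None  # assumption: "unknown" in a disease cohort is left unrouted


# Three rows taken from the diff below, joined with tabs.
SAMPLE = "\n".join([
    "# key\tname\tvcf\tac_field\taf_field\tis_disease\tdisease_role",
    "\t".join(["TOPMed", "TOPMed",
               "/gbdb/hg38/varFreqs/_topmed/topmed10.vcf.gz", "AC", "AF", "0"]),
    "\t".join(["GREGoR", "GREGoR",
               "/gbdb/hg38/varFreqs/gregor/gregor.vcf.gz", "AC", "AF", "1"]),
    "\t".join(["GA4K", "GA4K PacBio LR",
               "/gbdb/hg38/varFreqs/ga4k/ga4kSnv.vcf.gz", "AC", "AF", "1",
               "affected"]),
])

if __name__ == "__main__":
    for db in parse_databases(SAMPLE):
        print(db.key, target_track(db, phenotype="unaffected"))
```

Under these assumptions GA4K always lands in varFreqsAffected because of its
disease_role, GREGoR follows the per-population tag, and TOPMed always lands in
varFreqsBackground.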

refs #36642

diff --git src/hg/makeDb/scripts/varFreqs/databases.tsv src/hg/makeDb/scripts/varFreqs/databases.tsv
index a0fccdd9da7..dccc7af0731 100644
--- src/hg/makeDb/scripts/varFreqs/databases.tsv
+++ src/hg/makeDb/scripts/varFreqs/databases.tsv
@@ -1,36 +1,42 @@
 # Database configuration for varFreqsAll combined track
-# key	name	vcf	ac_field	af_field
+# key	name	vcf	ac_field	af_field	is_disease	disease_role
 # Use "." for fields that don't exist in the VCF
-AllOfUs	AllOfUs	/gbdb/hg38/varFreqs/_allofus/allOfUs.locAncFreq.vcf.gz	.	.
-SPARK	SPARK WES	/gbdb/hg38/varFreqs/_sfari/SPARK.iWES_v3.2024_08.deepvariant.norm.vcf.gz	AC	AF
-SFARI_WGS	SFARI WGS	/gbdb/hg38/varFreqs/_sfari/wgs_12519_genome.deepvariant.norm.vcf.gz	AC	AF
-GenomeAsia	GenomeAsia SNVs	/gbdb/hg38/varFreqs/ga100k/ga100k.subst.vcf.gz	AC	AF
-GenomeAsiaIndel	GenomeAsia Indels	/gbdb/hg38/varFreqs/ga100k/ga100k.indels.vcf.gz	AC	AF
-NPM	NPM Singapore	/gbdb/hg38/varFreqs/_npm/SG10K_Health_r5.3.2.sites.vcf.bgz	AC	AF
-KOVA	KOVA Korea	/gbdb/hg38/varFreqs/_kova/kova.v7.vcf.gz	AC	AF
-ToMMo	ToMMo Japan	/gbdb/hg38/varFreqs/tommo61kjpn/tommo-61kjpn-20250616-GRCh38-snvindel-af-autosome.vcf.gz	AC	AF
+# is_disease=1: cohort assembled to study a disease (autism, schizophrenia, rare disease).
+# disease_role: for a disease cohort with NO affected/unaffected population split, what is
+#   the whole cohort? "affected" (e.g. GA4K rare-disease probands) feeds the affected
+#   summary; blank means use the per-population phenotype tags in populations.tsv instead.
+# TOPMed is is_disease=0: it is an NHLBI population/biobank reference (used like gnomAD),
+#   not an affected-disease case cohort, and ships no affected/unaffected label.
+AllOfUs	AllOfUs	/gbdb/hg38/varFreqs/_allofus/allOfUs.locAncFreq.vcf.gz	.	.	0
+SPARK	SFARI SPARK WES	/gbdb/hg38/varFreqs/_sfari/SPARK.iWES_v3.2024_08.deepvariant.norm.vcf.gz	AC	AF	1
+SFARI_WGS	SFARI SPARK WGS	/gbdb/hg38/varFreqs/_sfari/wgs_12519_genome.deepvariant.norm.vcf.gz	AC	AF	1
+GenomeAsia	GenomeAsia SNVs	/gbdb/hg38/varFreqs/ga100k/ga100k.subst.vcf.gz	AC	AF	0
+GenomeAsiaIndel	GenomeAsia Indels	/gbdb/hg38/varFreqs/ga100k/ga100k.indels.vcf.gz	AC	AF	0
+NPM	NPM Singapore	/gbdb/hg38/varFreqs/_npm/SG10K_Health_r5.3.2.sites.vcf.bgz	AC	AF	0
+KOVA	KOVA Korea	/gbdb/hg38/varFreqs/_kova/kova.v7.vcf.gz	AC	AF	0
+ToMMo	ToMMo Japan	/gbdb/hg38/varFreqs/tommo61kjpn/tommo-61kjpn-20250616-GRCh38-snvindel-af-autosome.vcf.gz	AC	AF	0
 # IndiGen dropped: the IGIB IndiGenomes release ships only a VRT variation-type
 # bit per record (no AC, AF, or AN in INFO), so it cannot contribute counts to
 # the combined track. Re-add only if a future release exposes allele counts.
-FinnGen	FinnGen Finland	/gbdb/hg38/varFreqs/_finngen/finnge_R12_annotated_variants_v1.vcf.gz	AC	AF
-Saudi	Saudi	/gbdb/hg38/varFreqs/saudi/saudi.vcf.gz	AC	AF
-SweGen	SweGen Sweden	/gbdb/hg38/varFreqs/_swefreq/swegen_frequencies_fixploidy_GRCh38_20190204.vcf.gz	AC	AF
-TOPMed	TOPMed	/gbdb/hg38/varFreqs/_topmed/topmed10.vcf.gz	AC	AF
-ABraOM	ABraOM Brazil	/gbdb/hg38/varFreqs/abraom/abraom.vcf.gz	.	AF
-ALFA	ALFA	/gbdb/hg38/varFreqs/alfa/ALFA.vcf.gz	.	AF_GLB
-MGRB	MGRB Australia	/gbdb/hg38/varFreqs/_mgrb/MGRB.phase3.GRCh38.norm.vcf.gz	AC	.
-HRC	HRC	/gbdb/hg38/varFreqs/hrc/hrc.vcf.gz	AC	AF
+FinnGen	FinnGen Finland	/gbdb/hg38/varFreqs/_finngen/finnge_R12_annotated_variants_v1.vcf.gz	AC	AF	0
+Saudi	Saudi	/gbdb/hg38/varFreqs/saudi/saudi.vcf.gz	AC	AF	0
+SweGen	SweGen Sweden	/gbdb/hg38/varFreqs/_swefreq/swegen_frequencies_fixploidy_GRCh38_20190204.vcf.gz	AC	AF	0
+TOPMed	TOPMed	/gbdb/hg38/varFreqs/_topmed/topmed10.vcf.gz	AC	AF	0
+ABraOM	ABraOM Brazil	/gbdb/hg38/varFreqs/abraom/abraom.vcf.gz	.	AF	0
+ALFA	ALFA	/gbdb/hg38/varFreqs/alfa/ALFA.vcf.gz	.	AF_GLB	0
+MGRB	MGRB Australia	/gbdb/hg38/varFreqs/_mgrb/MGRB.phase3.GRCh38.norm.vcf.gz	AC	.	0
+HRC	HRC	/gbdb/hg38/varFreqs/hrc/hrc.vcf.gz	AC	AF	0
 # MexBB and TPMI moved to the array-based track (databases_array.tsv): both are
 # genotyping-array cohorts and are kept out of the WGS/WES varFreqsAll track.
-SGDP	SGDP	/gbdb/hg38/varFreqs/sgdpFreq/sgdp.freq.vcf.gz	AC	AF
-HGDP1kG	gnomAD HGDP+1kG	/gbdb/hg38/varFreqs/hgdp1kFreq/hgdp1k.freq.vcf.gz	AC	AF
-GREGoR	GREGoR	/gbdb/hg38/varFreqs/gregor/gregor.vcf.gz	AC	AF
-SCHEMA	SCHEMA	/gbdb/hg38/varFreqs/schema/SCHEMA_variant_results_withAF.vcf.gz	AC	AF
-GA4K	GA4K PacBio LR	/gbdb/hg38/varFreqs/ga4k/ga4kSnv.vcf.gz	AC	AF
-CoLoRSdb	CoLoRSdb PacBio LR	/gbdb/hg38/varFreqs/colorsDb/colorsDbSnv.vcf.gz	AC	AF
-SVatalog	SVatalog 101 10XG SR	/gbdb/hg38/varFreqs/svatalog/svatalog.vcf.gz	AC	AF
-Tishkoff180	Tishkoff 180 African WGS	/gbdb/hg38/varFreqs/_tishkoff/tishkoff180.vcf.gz	AC	AF
-WBBC	WBBC China	/gbdb/hg38/varFreqs/wbbc/wbbc.vcf.gz	AC	AF
-ChinaMAP	China ChinaMAP	/gbdb/hg38/varFreqs/_chinamap/chinamap.vcf.gz	AC	AF
-GenomeIndia	GenomeIndia 9.7k WGS	/gbdb/hg38/varFreqs/_genomeindia/genomeindia.vcf.gz	AC	AF
-GoNL	GoNL Netherlands ~13x SR	/gbdb/hg38/varFreqs/gonl/gonl.vcf.gz	AC	AF
+SGDP	SGDP	/gbdb/hg38/varFreqs/sgdpFreq/sgdp.freq.vcf.gz	AC	AF	0
+HGDP1kG	gnomAD HGDP+1kG	/gbdb/hg38/varFreqs/hgdp1kFreq/hgdp1k.freq.vcf.gz	AC	AF	0
+GREGoR	GREGoR	/gbdb/hg38/varFreqs/gregor/gregor.vcf.gz	AC	AF	1
+SCHEMA	SCHEMA	/gbdb/hg38/varFreqs/schema/SCHEMA_variant_results_withAF.vcf.gz	AC	AF	1
+GA4K	GA4K PacBio LR	/gbdb/hg38/varFreqs/ga4k/ga4kSnv.vcf.gz	AC	AF	1	affected
+CoLoRSdb	CoLoRSdb PacBio LR	/gbdb/hg38/varFreqs/colorsDb/colorsDbSnv.vcf.gz	AC	AF	0
+SVatalog	SVatalog 101 10XG SR	/gbdb/hg38/varFreqs/svatalog/svatalog.vcf.gz	AC	AF	0
+Tishkoff180	Tishkoff 180 African WGS	/gbdb/hg38/varFreqs/_tishkoff/tishkoff180.vcf.gz	AC	AF	0
+WBBC	WBBC China	/gbdb/hg38/varFreqs/wbbc/wbbc.vcf.gz	AC	AF	0
+ChinaMAP	China ChinaMAP	/gbdb/hg38/varFreqs/_chinamap/chinamap.vcf.gz	AC	AF	0
+GenomeIndia	GenomeIndia 9.7k WGS	/gbdb/hg38/varFreqs/_genomeindia/genomeindia.vcf.gz	AC	AF	0
+GoNL	GoNL Netherlands ~13x SR	/gbdb/hg38/varFreqs/gonl/gonl.vcf.gz	AC	AF	0