68c5b3b5dfc4053ff78a6b1d236bd1ac90251cfa lrnassar Mon Jun 1 14:40:45 2026 -0700

varFreqs: description pages for the three combined tracks and "SNV" rename sweep.

Add varFreqsDisease.html and varFreqsArray.html so the two new combined tracks have full Description/Display/Methods/Data Access/References. Add a Caveats section on varFreqsArray about chip-data quality vs sequencing. Update varFreqsAll.html and the supertrack varFreqs.html to reflect the three-combined-track family (cross-links between siblings, new "Combined Tracks" section, new table rows, and updated source/variant counts). Add a GoNL row to the supertrack table. Sweep 37 subtrack longLabels and four cross-referencing description pages (colorsDbSnv.html, mei.html, meiSwegen.html, phasedVars.html) from "Variant Frequencies:" to "SNV Frequencies:" to match the supertrack shortLabel.

refs #36642

diff --git src/hg/makeDb/trackDb/human/varFreqs.html src/hg/makeDb/trackDb/human/varFreqs.html
index fa9d6dbb231..bb8288f2744 100644
--- src/hg/makeDb/trackDb/human/varFreqs.html
+++ src/hg/makeDb/trackDb/human/varFreqs.html
@@ -1,76 +1,99 @@
This supertrack collects variant allele frequencies from population-scale sequencing and genotyping projects worldwide, from a total of ~1.7 million genomes/exomes/arrays.
-The data was not reprocessed in a harmonized way; the variant VCFs were collected from the projects as-is.
-The goal is a single place to compare how common
-a variant is across different populations, ancestries, and cohorts, for
-projects that cannot be recomputed by gnomAD soon. The main
-combined track merges all databases into one summary track,
-with filters, summed population frequencies and recalculated protein-effect annotations.
-There is also one subtrack per project with the original VCF data and all the annotations that the project provides.
-The different projects use different pipelines and sequencing technologies. Click any of the projects
-above or below for a summary of their sample selection, sequencing assay and software pipeline.
-Many projects do not allow us to distribute the data, but we document how to request it
-and provide all converters.
+The data was not reprocessed in a harmonized way; the variant VCFs were collected from the
+projects as-is. The goal is a single place to compare how common a variant is across
+different populations, ancestries, and cohorts, for projects that cannot be recomputed by
+gnomAD soon. Three combined tracks aggregate the source data along different lines, and
+there is also one subtrack per project with the original VCF data and all the annotations
+that the project provides. The different projects use different pipelines and sequencing
+technologies. Click any of the projects above or below for a summary of their sample
+selection, sequencing assay and software pipeline. Many projects do not allow us to
+distribute the data, but we document how to request it and provide all converters.
+Data from projects that provide haplotype-phased genotypes can also be found elsewhere: 1000 Genomes is also a separate track, and the phased genotypes of HGDP, SGDP, HGDP+1000 Genomes and Mexico Biobank can also be found in the "Phased Variants" track. Their VCF versions below show only the isolate frequency per variant.
Please contact us (genome@soe.ucsc.edu) if you know of a project that we should add. So far, we have asked for Regeneron's Million Exomes and Mexico City Studies (request rejected) and Taiwan Biobank (pending).
-
-The "All Databases Combined" track merges variants from all individual databases into a single
-bigBed file with consequence annotations, totaling 1.17 billion variants from ~1.7 million individuals.
-The track supports filtering by variant type
-(SNV, insertion, deletion, MNV), predicted consequence (missense, synonymous, stop gained,
-frameshift, splice, intron, intergenic), source database, allele frequency (overall maximum
-and per-database), and allele count (total or per-database). The track is useful in dense mode
-to get a quick overview of variant density across all projects, or with filters to find
-variants present in specific databases or within certain frequency ranges. With the "clone track"
-feature you can clone this track and keep multiple versions, each with different filters activated.
-The "Density mode" checkbox on the track configuration page shows a plot of the
-density of variants passing a filter, one per track clone.
+Three combined tracks merge variants from the individual subtracks into single bigBed files
+with predicted protein consequences and cross-database filtering. All three use the same
+filter conventions (variant type, consequence, source database, allele frequency, allele
+count, and per-database AF/AC).
+| Database | Region | N | Data Type | Cohort | Sub-populations | Downloadable from UCSC | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All Databases combined | -All below | -1.7mil | -WGS/WES/imputed | -- | + | All Databases Combined | +Sequencing-based, all below | +~1.7mil | +WGS/WES/long-read | +1.34B variants | +Phenotype splits for SPARK, SFARI WGS, GREGoR | +No | +
| Disease-related Databases Combined | +SPARK, SFARI WGS, TOPMed, SCHEMA, GREGoR, GA4K | +~300k | +WGS/WES/long-read | +932M variants | +SPARK ASD/Non-ASD, SFARI WGS ASD/Non-ASD, SCHEMA case/control, GREGoR aff/unaff/unknown | +No | +||||||
| Genotyping Array Databases Combined | +TPMI, MexBB, UKBB | +~530k | +Array / imputed | +14.7M variants | +— | No | ||||||
| AllOfUs v7 | USA | 245k | WGS | General population, diverse | African, Indigenous American, East Asian, European, Oceanian, South Asian (local ancestry; see Notes below) | Yes | ||||||
| TOPMED Freeze 10 | USA | @@ -122,30 +145,39 @@361k | Imputed array (HRC+UK10K+1KGp3 ref panel) | White British subset of UK Biobank, Neale Lab Round 2 GWAS | — | Yes | ||||||
| SweGen | Sweden | 1k | WGS | Cross-section of Swedish population | — | No | ||||||
| GoNL | +Netherlands | +498 | +WGS (~13x) | +250 unrelated Dutch trios (parents only) | +— | +Yes | +||||||
| SCHEMA | Multi-national | 121k | WES | Schizophrenia: 24k cases, 97k controls | — | Yes | ||||||
| Japan ToMMO 61k | Japan | 61k | WGS | General population |