d4951d6de0335238ce124b3fb9703d82d329b1ab max Sat Jun 13 06:35:27 2026 -0700
html updates to varFreqs, refs #36642
diff --git src/hg/makeDb/trackDb/human/varFreqs.html src/hg/makeDb/trackDb/human/varFreqs.html
index 3c715c1b35f..44840eb1a7c 100644
--- src/hg/makeDb/trackDb/human/varFreqs.html
+++ src/hg/makeDb/trackDb/human/varFreqs.html
@@ -1,40 +1,43 @@

Description

This track collection gathers variant allele frequencies from population-scale sequencing and genotyping projects worldwide, from a total of ~1.7 million genomes/exomes/arrays.
-The data was not reprocessed in a harmonized way; the variant VCFs were collected from the
+Unlike gnomAD, the data was not reprocessed in a harmonized way; the variant VCFs were collected from the
projects as-is. The goal is a single place to compare how common a variant is across
-different populations, ancestries, and cohorts, for projects that cannot be recomputed by
-gnomAD soon. Three combined tracks aggregate the source data along different lines, and
+different populations, ancestries, and cohorts, for projects that gnomAD is unlikely to
+reprocess soon. Three combined tracks aggregate the source data along different lines, and
there is also one subtrack per project with the original VCF data and all the annotations that the project provides. The different projects use different pipelines and sequencing technologies. Click any of the projects above or below for a summary of their sample selection, sequencing assay and software pipeline. Many projects do not allow us to
-distribute the data, but we document how to request it and provide all converters.
+distribute the data, but we document how to request it and provide all converters, see Data Download below.

-Data from projects that provide haplotype-phased genotypes can also be found
-elsewhere: 1000 Genomes is also a separate track, and the phased genotypes HGDP, SGDP,
-HGDP+1000 Genomes and Mexico Biobank can also be found in the "Phased Variants" track.
-Their VCF versions below show only the isolate frequency per variant.
+The browser has other tracks with variant frequencies. We have of course the data
+from gnomAD in separate tracks. Two projects that
+provide haplotype-phased genotypes can also be found in their own tracks:
+1000 Genomes is a separate track, and the phased
+genotypes HGDP, SGDP, HGDP+1000 Genomes and Mexico Biobank are in the
+Phased Variants track. Their VCF versions below show
+only the allele frequency per variant, not the phased genotypes.

Please contact us (genome@soe.ucsc.edu) if you know of a project that we should add. So far,
-Regeneron's Million Exomes and Mexico City Studies (request rejected) and Taiwan Biobank (pending).
-
+we have requested data from Regeneron's Million Exomes and the Mexico City studies (both requests rejected);
+Taiwan Biobank and the full UK Biobank WGS data requests are pending.

Combined Tracks

Three combined tracks merge variants from the individual subtracks into single bigBed files with predicted protein consequences and cross-database filtering. All three use the same filter conventions (variant type, consequence, source database, allele frequency, allele count, and per-database AF/AC).