aa61ebc800429515f9ced7e28f669c6042219f43
max
Wed Mar 18 09:09:13 2026 -0700

varFreqs supertrack: add GREGoR track, update all HTML docs, move scripts to varFreqs/, refs #36642

Add GREGoR R04 WGS track to varFreqs superTrack. Update Data Access and
Methods sections for all 20+ subtrack HTML files with consistent formatting,
sequencing methods from source papers, and links to makeDoc and Github scripts.
Move all varFreqs conversion scripts into scripts/varFreqs/ subdirectory and
update makeDoc paths accordingly.

Co-Authored-By: Claude Opus 4.6

diff --git src/hg/makeDb/trackDb/human/hrc.html src/hg/makeDb/trackDb/human/hrc.html
new file mode 100644
index 00000000000..1d42232fd08
--- /dev/null
+++ src/hg/makeDb/trackDb/human/hrc.html
@@ -0,0 +1,62 @@
+

Description

+

+The Haplotype Reference Consortium (HRC) is a collaboration among several +large sequencing projects to create a reference panel for genotype imputation. +Release 1.1 contains 64,976 haplotypes from 32,488 whole-genome sequenced samples at +low coverage (average 7x), with 40 million variant sites (minimum allele count of 5). +

+

+The contributing studies include the 1000 Genomes Project, UK10K, and many other cohorts. +Since 1000 Genomes data is already available as a separate track, this track shows only +the frequencies from the non-1000 Genomes samples (~30,000 individuals), resulting in +38.3 million variants after lifting from GRCh37 to GRCh38. +

+ +

Data Access

+

+The data can be explored interactively with the +Table Browser or the +Data Integrator. +For programmatic access, our REST API can be used; the +track name is hrc. +For bulk download, the VCF file can be obtained from +our download server. +
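As a rough illustration of the programmatic route, the sketch below only builds a request URL for the track named hrc and makes no network call. It assumes the public UCSC REST API base address and its getData/track endpoint with genome, track, chrom, start and end parameters. The helper name and the example region are hypothetical, and whether the endpoint returns records for this particular track type is not confirmed here.

```python
from urllib.parse import urlencode

# Assumed base address of the UCSC REST API.
API_BASE = "https://api.genome.ucsc.edu"


def hrc_region_url(chrom, start, end, genome="hg38", track="hrc"):
    """Build a getData/track URL for one region (hypothetical helper)."""
    params = {
        "genome": genome,
        "track": track,  # track name given in the text above
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}/getData/track?{urlencode(params)}"


# Example region chosen arbitrarily for illustration.
url = hrc_region_url("chr1", 1_000_000, 1_010_000)
print(url)
```

The resulting URL can be passed to any HTTP client; the response is expected to be JSON.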

+

+The original site list file can also be downloaded from the +HRC website. +Our GitHub repository contains a +script that converts the tab-separated file to VCF and lifts it to hg38. +
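The conversion step can be pictured roughly as follows. This is not the script from the repository, only a minimal sketch of turning one site-list row into a VCF data line using the non-1000 Genomes counts. The column names (`#CHROM`, `POS`, `ID`, `REF`, `ALT`, `AC_EXCLUDING_1000G`, `AN_EXCLUDING_1000G`) are assumptions about the site list layout, the function name is hypothetical, and the liftOver step to hg38 is not shown.

```python
def site_row_to_vcf(row):
    """Convert one site-list row (a dict keyed by column name) to a VCF line.

    Returns None when the variant has no non-1000 Genomes allele count,
    i.e. it would be dropped from the track.
    Column names are assumed, not taken from the real script.
    """
    ac = int(row["AC_EXCLUDING_1000G"])
    an = int(row["AN_EXCLUDING_1000G"])
    if ac == 0 or an == 0:
        return None  # seen only in 1000 Genomes samples
    info = f"AC={ac};AN={an};AF={ac / an:.6g}"
    # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO
    fields = [row["#CHROM"], row["POS"], row["ID"],
              row["REF"], row["ALT"], ".", ".", info]
    return "\t".join(fields)


# Made-up example rows, not real HRC data.
kept = site_row_to_vcf({
    "#CHROM": "1", "POS": "12345", "ID": ".", "REF": "A", "ALT": "G",
    "AC_EXCLUDING_1000G": "6", "AN_EXCLUDING_1000G": "60000",
})
dropped = site_row_to_vcf({
    "#CHROM": "1", "POS": "12400", "ID": ".", "REF": "C", "ALT": "T",
    "AC_EXCLUDING_1000G": "0", "AN_EXCLUDING_1000G": "60000",
})
print(kept)
print(dropped)
```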

+ +

Methods

+

+The HRC r1.1 site list was downloaded from the +HRC website +as a tab-separated file on GRCh37, converted to VCF and lifted to GRCh38 with UCSC liftOver. +Only frequencies from the non-1000 Genomes samples (~30,000 of the 32,488 total) are included, +since 1000 Genomes data is available separately. Of 40.4M input variants, 8,052 could not be +mapped by liftOver and 2.1M were present only in 1000 Genomes samples; both sets were dropped, leaving +38.3M variants. +The makeDoc file of the varFreqs track documents how all of the track's source files were converted. +For some tracks, Python scripts were necessary; these are also available from GitHub. +
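The variant counts above can be checked with simple arithmetic. The sketch below uses the rounded figures from the text (40.4M and 2.1M are not exact counts), so the result is only approximate; the function and constant names are chosen here for illustration.

```python
def remaining_variants(total_input, unmapped, only_1000g):
    """Variants left after dropping liftOver failures and 1000G-only sites."""
    return total_input - unmapped - only_1000g


TOTAL_INPUT = 40_400_000  # "40.4M" in the text, rounded
UNMAPPED = 8_052          # not mapped by liftOver
ONLY_1000G = 2_100_000    # "2.1M" in the text, rounded

remaining = remaining_variants(TOTAL_INPUT, UNMAPPED, ONLY_1000G)
print(f"{remaining / 1e6:.1f}M variants remain")
```

With these rounded inputs the result agrees with the 38.3M reported for the track.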

+ +

Credits

+

+Thanks to the Haplotype Reference Consortium and all contributing studies for making this +reference panel publicly available. +

+ +

References

+

+McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, Kang HM, Fuchsberger C, +Danecek P, Sharp K et al. + +A reference panel of 64,976 haplotypes for genotype imputation. +Nat Genet. 2016 Oct;48(10):1279-83. +PMID: 27548312; PMC: PMC5388176 +