38bafc856320cf5360e0482faeee72b78f2ea963 lrnassar Tue May 5 14:13:30 2026 -0700
QA pass on varFreqs per-subtrack description pages: encode 3 plain emails, add target=_blank to 15 boilerplate REST API links, and add missing References sections (and Data Access on varFreqsAll). refs #36642

Mechanical fixes across 18 per-subtrack description pages:
- Encoded 3 plain author/contact emails: pfeliciano@simonsfoundation.org (sfariSparkExomes), m.hobbs@garvan.org.au (mgrb), contact_npco@a-star.edu.sg (npm).
- Added target="_blank" to 15 occurrences of the boilerplate "REST API" link across allofus, topmed, sfariSparkExomes, tommo60kjpn, alfaVcf, gasp, abraom, indigenomes, hrc, saudi, schema, sgdpFreq, gregor, hgdp1kFreq, colorsDbSnv.

Added missing References sections:
- allofus.html: All of Us Research Program 2024 Nature.
- topmed.html: Taliun 2021 Nature.
- alfaVcf.html: NCBI ALFA documentation citation (no peer-reviewed paper yet).
- gregor.html: GREGoR R04 Methods document + consortium website (no flagship publication yet).
- varFreqsAll.html: pointer to the supertrack's References section, plus tool citations (bcftools csq, Ensembl VEP).

Added missing Data Access section on varFreqsAll.html explaining that the merged callset is not downloadable due to mixed source-data licensing, but can be reconstructed from the per-subtrack VCFs using the conversion scripts on GitHub.

All 25 unique varFreqs description pages now have Description, Methods, Data Access, References. No non-ASCII characters and no inline event handlers across the set.

diff --git src/hg/makeDb/trackDb/human/hrc.html src/hg/makeDb/trackDb/human/hrc.html
index 6a9ac4a0a50..4dd48e8b58b 100644
--- src/hg/makeDb/trackDb/human/hrc.html
+++ src/hg/makeDb/trackDb/human/hrc.html
@@ -1,62 +1,62 @@

Description

The Haplotype Reference Consortium (HRC) is a collaboration among several large sequencing projects to create a reference panel for genotype imputation. Release 1.1 contains 64,976 haplotypes from 32,488 whole-genome sequenced samples at low coverage (average 7x), with 40 million variant sites (minimum allele count of 5).

The contributing studies include the 1000 Genomes Project, UK10K, and many other cohorts. Since 1000 Genomes data is already available as a separate track, this track shows only the frequencies from the non-1000 Genomes samples (~30,000 individuals), resulting in 38.3 million variants after lifting from GRCh37 to GRCh38.

Data Access

The data can be explored interactively with the Table Browser or the Data Integrator.
-For programmatic access, our REST API can be used; the
+For programmatic access, our REST API can be used; the
track name is hrc. For bulk download, the VCF file can be obtained from our download server.
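
As an illustration of the programmatic access mentioned above, the following is a minimal sketch that builds a request URL for the public UCSC REST API `getData/track` endpoint. The helper name `hrc_track_url` and the example region are hypothetical and not taken from the page; only the track name `hrc` and the hg38 assembly come from the text. The sketch only constructs the URL and does not issue the request or make any claim about the shape of the response for this track.

```python
from urllib.parse import quote

# Public UCSC REST API host; getData/track is the endpoint for track data.
API_BASE = "https://api.genome.ucsc.edu"


def hrc_track_url(chrom, start, end, genome="hg38", track="hrc"):
    """Build a getData/track URL for one region (hypothetical helper).

    start is 0-based and end is exclusive, following the usual UCSC
    coordinate convention.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    params = [
        ("genome", genome),
        ("track", track),
        ("chrom", chrom),
        ("start", start),
        ("end", end),
    ]
    # The UCSC API documentation separates arguments with semicolons.
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params)
    return f"{API_BASE}/getData/track?{query}"


# Example region chosen arbitrarily for illustration.
url = hrc_track_url("chr1", 1_000_000, 1_001_000)
print(url)
```

The resulting URL can be passed to any HTTP client, for example `urllib.request.urlopen(url)`, to retrieve the JSON response.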

The original site list file can also be downloaded from the HRC website. Our GitHub repo contains a script that converts the tab-separated file to VCF and lifts it to hg38.

Methods

The HRC r1.1 site list was downloaded from the HRC website as a tab-separated file on GRCh37, converted to VCF, and lifted to GRCh38 with UCSC liftOver. Only frequencies from the non-1000 Genomes samples (~30,000 of the 32,488 total) are included, since 1000 Genomes data is available separately. Of 40.4M input variants, 8,052 were unmapped by liftOver and 2.1M were present only in 1000 Genomes samples and were dropped, leaving 38.3M variants. The makeDoc file of the track documents how all source files of the varFreqs track were converted. For some tracks, Python scripts were necessary; these are also available from GitHub.
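
The conversion step described above can be sketched roughly as follows. This is not the script from the GitHub repo: the column names (`AC_EXCLUDING_1000G`, `AN_EXCLUDING_1000G`), the `chr` prefixing, the INFO layout, and the two sample rows are all assumptions made up for illustration. The sketch covers only the tab-separated-to-VCF step and the dropping of variants seen only in 1000 Genomes samples; the liftOver step to GRCh38 is not shown.

```python
import csv
import io

# Made-up rows with an assumed header; not real HRC data.
SAMPLE_TSV = (
    "#CHROM\tPOS\tID\tREF\tALT\tAC_EXCLUDING_1000G\tAN_EXCLUDING_1000G\n"
    "1\t1000\t.\tC\tG\t0\t60000\n"   # seen only in 1000 Genomes samples
    "1\t2000\t.\tG\tA\t6\t60000\n"
)

VCF_HEADER = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count">',
    '##INFO=<ID=AN,Number=1,Type=Integer,Description="Allele number">',
    '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]


def site_list_to_vcf(handle):
    """Convert a tab-separated site list to VCF lines (hypothetical helper)."""
    lines = list(VCF_HEADER)
    for row in csv.DictReader(handle, delimiter="\t"):
        ac = int(row["AC_EXCLUDING_1000G"])
        an = int(row["AN_EXCLUDING_1000G"])
        if ac == 0 or an == 0:
            # No alleles outside 1000 Genomes: drop the variant.
            continue
        info = f"AC={ac};AN={an};AF={ac / an:.6g}"
        # Assumption: UCSC-style "chr" names are wanted in the output.
        chrom = "chr" + row["#CHROM"]
        lines.append("\t".join(
            [chrom, row["POS"], row["ID"], row["REF"], row["ALT"], ".", ".", info]
        ))
    return lines


vcf_lines = site_list_to_vcf(io.StringIO(SAMPLE_TSV))
records = [line for line in vcf_lines if not line.startswith("#")]
print("\n".join(vcf_lines))
```

Recomputing the frequency from the allele count and allele number, rather than copying a precomputed column, keeps the output consistent with the subset of samples actually retained.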

Credits

Thanks to the Haplotype Reference Consortium and all contributing studies for making this reference panel publicly available.

References

McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, Kang HM, Fuchsberger C, Danecek P, Sharp K et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet. 2016 Oct;48(10):1279-83. PMID: 27548312; PMC: PMC5388176