9bfd58221b1539193cb7f0a317b4e959c1c7e49a
max
Thu May 21 01:00:45 2026 -0700
varFreqs: remove typical AI language, which sounds bad and is hard to read. "Humanizer" pass on all 31 varFreqs description pages: cut em dashes, copula avoidance ("serves as", "stands as"), "-ing" puffery, and boilerplate filler ("We provide documentation that indicates how..."). Title-case headings and meaningful <b> emphasis preserved. No facts/URLs/counts/versions changed. tpmi.html added as a new file (was previously uncommitted). refs #36642
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
diff --git src/hg/makeDb/trackDb/human/hrc.html src/hg/makeDb/trackDb/human/hrc.html
index 4dd48e8b58b..1ac7e103599 100644
--- src/hg/makeDb/trackDb/human/hrc.html
+++ src/hg/makeDb/trackDb/human/hrc.html
@@ -1,62 +1,62 @@
<h2>Description</h2>
<p>
The <a href="http://www.haplotype-reference-consortium.org/"
target="_blank">Haplotype Reference Consortium (HRC)</a> is a collaboration among several
large sequencing projects to create a reference panel for genotype imputation.
Release 1.1 contains 64,976 haplotypes from 32,488 whole-genome sequenced samples at
low coverage (average 7x), with 40 million variant sites (minimum allele count of 5).
</p>
<p>
The contributing studies include the 1000 Genomes Project, UK10K, and many other cohorts.
Since 1000 Genomes data is already available as a separate track, this track shows only
-the frequencies from the non-1000 Genomes samples (~30,000 individuals), resulting in
-38.3 million variants after lifting from GRCh37 to GRCh38.
+the frequencies from the non-1000 Genomes samples (~30,000 individuals). After the lift
+from GRCh37 to GRCh38, 38.3 million variants remain.
</p>
<h2>Data Access</h2>
<p>
The data can be explored interactively with the
<a href="../cgi-bin/hgTables">Table Browser</a> or the
<a href="../cgi-bin/hgIntegrator">Data Integrator</a>.
For programmatic access, our <a href="https://api.genome.ucsc.edu" target="_blank">REST API</a> can be used; the
track name is <em>hrc</em>.
For bulk download, the VCF file can be obtained from
<a href="http://hgdownload.soe.ucsc.edu/gbdb/hg38/varFreqs/" target="_blank">our download server</a>.
</p>
<p>
The original site list file can also be downloaded from the
<a href="http://www.haplotype-reference-consortium.org/site" target="_blank">HRC website</a>.
Our GitHub repo contains a
<a href="https://github.com/ucscGenomeBrowser/kent/tree/master/src/hg/makeDb/scripts/hrcToVcf.py"
target="_blank">script</a> that converts the tab-separated file to VCF and lifts it to hg38.
</p>
<h2>Methods</h2>
<p>
The HRC r1.1 site list was downloaded from the
<a href="http://www.haplotype-reference-consortium.org/site" target="_blank">HRC website</a>
as a tab-separated file on GRCh37, converted to VCF and lifted to GRCh38 with UCSC liftOver.
Only frequencies from the non-1000 Genomes samples (~30,000 of the 32,488 total) are included,
since 1000 Genomes data is available separately. Of 40.4M input variants, 8,052 were unmapped
by liftOver and 2.1M were present only in 1000 Genomes samples and were dropped, leaving
38.3M variants.
-We provide documentation that indicates how all source files of the varFreqs track were converted in the <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> of the track.
-For some tracks, python scripts were necessary and are also available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
+The conversion steps for all source files of the varFreqs track are documented in the <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> of the track.
+Some tracks required Python scripts, which are also available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
</p>
<h2>Credits</h2>
<p>
Thanks to the Haplotype Reference Consortium and all contributing studies for making this
reference panel publicly available.
</p>
<h2>References</h2>
<p>
McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, Kang HM, Fuchsberger C,
Danecek P, Sharp K <em>et al</em>.
<a href="https://doi.org/10.1038/ng.3643" target="_blank">
A reference panel of 64,976 haplotypes for genotype imputation</a>.
<em>Nat Genet</em>. 2016 Oct;48(10):1279-83.
PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/27548312" target="_blank">27548312</a>; PMC: <a
href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5388176/" target="_blank">PMC5388176</a>
</p>