9bfd58221b1539193cb7f0a317b4e959c1c7e49a
max
Thu May 21 01:00:45 2026 -0700
varFreqs: AI-generated text sounds bad and is hard to read, so remove typical AI language. "humanizer" pass on all 31 varFreqs description pages: cut em dashes, copula avoidance ("serves as", "stands as"), "-ing" puffery, and boilerplate filler ("We provide documentation that indicates how..."). Title-case headings and meaningful <b> emphasis preserved. No facts/URLs/counts/versions changed. tpmi.html added as a new file (was previously uncommitted). refs #36642
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
diff --git src/hg/makeDb/trackDb/human/hgdp1kFreq.html src/hg/makeDb/trackDb/human/hgdp1kFreq.html
index 1dd30adba3d..b6ccf1fa476 100644
--- src/hg/makeDb/trackDb/human/hgdp1kFreq.html
+++ src/hg/makeDb/trackDb/human/hgdp1kFreq.html
@@ -26,40 +26,40 @@
For bulk download, the VCF file can be obtained from
<a href="http://hgdownload.soe.ucsc.edu/gbdb/hg38/varFreqs/" target="_blank">our download server</a>.
</p>
<p>
The original VCFs with full genotypes can also be downloaded from
<a href="https://gnomad.broadinstitute.org/downloads#v3-hgdp-1kg"
target="_blank">gnomAD Downloads</a>.
</p>
<h2>Methods</h2>
<p>
The gnomAD project reprocessed 4,094 whole genomes from the 1000 Genomes Project and the Human
Genome Diversity Project (HGDP) through a unified pipeline. Sequencing was performed on Illumina
platforms at a mean coverage of 32–34x. Reads were aligned to GRCh38 (hs38DH reference with
decoy and HLA sequences) using BWA-MEM 0.7.15. Variant calling followed GATK best practices:
-per-sample calling with GATK 3.5 HaplotypeCaller followed by joint genotyping with GATK4 using
-the Hail VCF combiner for scalable merging. Allele-specific variant quality score recalibration
-(AS-VQSR) was applied for both SNPs and indels. Sample QC included contamination estimation
-(verifyBamID), sex concordance, relatedness filtering (PC-Relate), and population assignment
-using PCA against gnomAD reference panels. Per-population allele frequencies were computed for
-80 fine-grained populations as well as broad continental groupings.
+per-sample calls with GATK 3.5 HaplotypeCaller, then joint genotyping with GATK4 through
+the Hail VCF combiner, which scales the merge step. Allele-specific variant quality score recalibration
+(AS-VQSR) was applied for both SNPs and indels. Sample QC included contamination estimates
+(verifyBamID), sex concordance, relatedness filters (PC-Relate), and population assignment
+with PCA against gnomAD reference panels. Per-population allele frequencies were computed for
+80 fine-grained populations and for broad continental groups.
</p>
<p>
-We provide documentation that indicates how all source files of the varFreqs track were converted in the <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> of the track.
-For some tracks, python scripts were necessary and are also available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
+The <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> documents how all source files of the varFreqs track were converted.
+For some tracks, Python scripts were also needed and are available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
</p>
<h2>Credits</h2>
<p>
Thanks to the gnomAD team at the Broad Institute for harmonizing and making this dataset
publicly available, and to all participants of the 1000 Genomes Project and the Human Genome
Diversity Project.
</p>
<h2>References</h2>
<p>
Koenig Z, Yohannes MT, Nkambule LL, Zhao X, Goodrich JK, Kim HA, Wilson MW, Tiao G, Hao SP, Sahakian
N <em>et al</em>.
<a href="https://pmc.ncbi.nlm.nih.gov/articles/pmid/38749656/" target="_blank">
A harmonized public resource of deeply sequenced diverse human genomes</a>.