9bfd58221b1539193cb7f0a317b4e959c1c7e49a
max
  Thu May 21 01:00:45 2026 -0700
varFreqs: AI-generated text sounds bad and is hard to read, so remove typical AI language. "Humanizer" pass on all 31 varFreqs description pages: cut em dashes, copula avoidance ("serves as", "stands as"), "-ing" puffery, and boilerplate filler ("We provide documentation that indicates how..."). Title-case headings and meaningful <b> emphasis preserved. No facts/URLs/counts/versions changed. tpmi.html added as a new file (was previously uncommitted). refs #36642

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

diff --git src/hg/makeDb/trackDb/human/topmed.html src/hg/makeDb/trackDb/human/topmed.html
index d67f8a75e61..cf4f6ebda7c 100644
--- src/hg/makeDb/trackDb/human/topmed.html
+++ src/hg/makeDb/trackDb/human/topmed.html
@@ -1,60 +1,60 @@
 <h2>Description</h2>
 <p>
 <a href="https://topmed.nhlbi.nih.gov/" target="_blank">NHLBI TOPMed</a> (Trans-Omics for Precision
 Medicine) is a program launched by the U.S. National Heart, Lung, and Blood Institute that
 integrates whole-genome sequencing with molecular, clinical, and environmental data from large,
 well-phenotyped cohorts. Its goal is to uncover the biological mechanisms underlying heart, lung,
 blood, and sleep disorders to advance precision medicine and improve population health. Freeze 10
 contains 868,581,653 variants from 150,899 whole genomes.
 </p>
 
 <h2>Data Access</h2>
 <p>
 Due to license restrictions, the data for this track cannot be downloaded from the UCSC
 Genome Browser. The Table Browser, Data Integrator, and download server are not available
 for this track.
 </p>
 <p>
 VCFs with summarized allele frequencies are available from
 the <a href="https://bravo.sph.umich.edu/" target="_blank">TOPMED BRAVO website</a>. They require a
 login. The VCFs were downloaded from
 <a href="https://bravo.sph.umich.edu/terms.html" target="_blank">BRAVO</a>.
 </p>
 
 <h2>Methods</h2>
 <p>
 TOPMed whole genome sequencing was performed at multiple NHLBI-funded sequencing centers
 using PCR-free library preparation with 150 bp paired-end reads on Illumina short-read
 platforms, targeting &ge;30x mean coverage. Reads were aligned to the GRCh38 reference genome
 (hs38DH, including decoy sequences) using BWA-MEM, followed by duplicate marking with
 Picard MarkDuplicates and base quality score recalibration (BQSR) with GATK. Variant calling
 was performed using the TOPMed GotCloud pipeline (developed at the Center for Statistical
 Genetics, University of Michigan), comprising: (1) per-sample candidate variant detection with
 <code>vt discover2</code> and normalization with <code>vt normalize</code>; (2) cross-sample variant site
 consolidation using <code>cramore vcf-merge-candidate-variants</code>; (3) joint genotyping across all
 samples; and (4) variant filtering using a Support Vector Machine (SVM) classifier
 (libsvm) trained on positive labels derived from HapMap 3.3 and 1000 Genomes Omni2.5
 array sites, and negative labels derived from Mendelian-inconsistent variants identified
 within the cohort's pedigree structure using <code>vt milk-filter</code>. Sample-level quality
 control included estimation of DNA contamination, genetic ancestry, and biological sex
 using <code>cramore cram-verify-bam</code> (verifyBamID2) and relative X/Y chromosomal depth. Full
 methods for TOPMed freeze 10 are available on the
 <a href="https://topmed.nhlbi.nih.gov/topmed-whole-genome-sequencing-methods-freeze-10"
    target="_blank">TOPMed WGS Methods page</a>.
 </p>
 
 <p>
-We provide documentation that indicates how all source files of the varFreqs track were converted in the <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> of the track.
+Documentation on how all source files of the varFreqs track were converted is in the <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> of the track.
 For some tracks, python scripts were necessary and are also available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
 </p>
 
 <h2>References</h2>
 <p>
 Taliun D, Harris DN, Kessler MD, Carlson J, Szpiech ZA, Torres R, Taliun SAG, Corvelo A, Gogarten SM,
 Kang HM <em>et al</em>.
 <a href="https://doi.org/10.1038/s41586-021-03205-y" target="_blank">
 Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program</a>.
 <em>Nature</em>. 2021 Feb;590(7845):290-299.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/33568819" target="_blank">33568819</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875770/" target="_blank">PMC7875770</a>
 </p>