src/hg/makeDb/trackDb/human/topmed.html 38bafc856320cf5360e0482faeee72b78f2ea963

38bafc856320cf5360e0482faeee72b78f2ea963
lrnassar
  Tue May 5 14:13:30 2026 -0700
QA pass on varFreqs per-subtrack description pages: encode 3 plain emails, add target=_blank to 15 boilerplate REST API links, and add missing References sections (and Data Access on varFreqsAll). refs #36642

Mechanical fixes across 18 per-subtrack description pages:
- Encoded 3 plain author/contact emails: pfeliciano@simonsfoundation.org (sfariSparkExomes), m.hobbs@garvan.org.au (mgrb), contact_npco@a-star.edu.sg (npm).
- Added target="_blank" to 15 occurrences of the boilerplate <a href="https://api.genome.ucsc.edu">REST API</a> link across allofus, topmed, sfariSparkExomes, tommo60kjpn, alfaVcf, gasp, abraom, indigenomes, hrc, saudi, schema, sgdpFreq, gregor, hgdp1kFreq, colorsDbSnv.

Added missing References sections:
- allofus.html: All of Us Research Program 2024 Nature.
- topmed.html: Taliun 2021 Nature.
- alfaVcf.html: NCBI ALFA documentation citation (no peer-reviewed paper yet).
- gregor.html: GREGoR R04 Methods document + consortium website (no flagship publication yet).
- varFreqsAll.html: pointer to the supertrack's References section, plus tool citations (bcftools csq, Ensembl VEP).

Added missing Data Access section on varFreqsAll.html explaining that the merged callset is not downloadable due to mixed source-data licensing, but can be reconstructed from the per-subtrack VCFs using the conversion scripts on GitHub.

All 25 unique varFreqs description pages now have Description, Methods, Data Access, and References sections. The set contains no non-ASCII characters and no inline event handlers.
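
The criteria above could be re-checked with a small script. The following is a minimal sketch and not the tooling actually used for this QA pass: the function name check_page, the regular expressions, and the assumption that every section heading is a plain <h2> element are all hypothetical.

```python
import re

# Sections every varFreqs description page is expected to contain,
# taken from the commit message. Everything else here is an assumption.
REQUIRED_SECTIONS = ("Description", "Methods", "Data Access", "References")

# Opening <a> tags whose href is an absolute http(s) URL; relative links
# such as ../cgi-bin/hgTables are deliberately not matched.
EXTERNAL_LINK = re.compile(r'<a\s[^>]*href="?https?://[^>]*>', re.IGNORECASE)

# Any attribute that looks like an inline event handler (onclick=, onload=, ...).
EVENT_HANDLER = re.compile(r'<[^>]*\son[a-z]+\s*=', re.IGNORECASE)

# Text of each <h2> heading.
HEADING = re.compile(r'<h2>\s*(.*?)\s*</h2>', re.IGNORECASE | re.DOTALL)


def check_page(html):
    """Return a list of human-readable problems found in one page's HTML."""
    problems = []

    headings = set(HEADING.findall(html))
    for name in REQUIRED_SECTIONS:
        if name not in headings:
            problems.append(f"missing section: {name}")

    if not html.isascii():
        problems.append("non-ASCII character")

    if EVENT_HANDLER.search(html):
        problems.append("inline event handler")

    for tag in EXTERNAL_LINK.findall(html):
        if "_blank" not in tag:
            problems.append(f"external link without target=_blank: {tag}")

    return problems
```

Under these assumptions, calling check_page on the contents of each description page (for example, the text read from topmed.html) returns an empty list when the page meets all four criteria, and one message per problem otherwise.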

diff --git src/hg/makeDb/trackDb/human/topmed.html src/hg/makeDb/trackDb/human/topmed.html
index 68a1865ef8e..3b8048cbaf7 100644
--- src/hg/makeDb/trackDb/human/topmed.html
+++ src/hg/makeDb/trackDb/human/topmed.html
@@ -1,31 +1,31 @@
 <h2>Description</h2>
 <p>
 <a href="https://topmed.nhlbi.nih.gov/" target="_blank">NHLBI TOPMed</a> (Trans-Omics for Precision
 Medicine) is a program launched by the U.S. National Heart, Lung, and Blood Institute that
 integrates whole-genome sequencing with molecular, clinical, and environmental data from large,
 well-phenotyped cohorts. Its goal is to uncover the biological mechanisms underlying heart, lung,
 blood, and sleep disorders to advance precision medicine and improve population health. Freeze 10
 contains 868,581,653 variants from 150,899 whole genomes.
 </p>
 
 <h2>Data Access</h2>
 <p>
 The data can be explored interactively with the
 <a href="../cgi-bin/hgTables">Table Browser</a> or the
 <a href="../cgi-bin/hgIntegrator">Data Integrator</a>.
-For programmatic access, our <a href="https://api.genome.ucsc.edu">REST API</a> can be used; the
+For programmatic access, our <a href="https://api.genome.ucsc.edu" target="_blank">REST API</a> can be used; the
 track name is <em>topmed</em>.
 For bulk download, the VCF file can be obtained from
 <a href="http://hgdownload.soe.ucsc.edu/gbdb/hg38/varFreqs/" target="_blank">our download server</a>.
 </p>
 <p>
 VCFs with summarized allele frequencies are also available from
 the <a href="https://bravo.sph.umich.edu/" target="_blank">TOPMED BRAVO website</a>. They require a
 login. The VCFs were downloaded from
 <a href="https://bravo.sph.umich.edu/terms.html" target="_blank">BRAVO</a>.
 </p>
 
 <h2>Methods</h2>
 <p>
 TOPMed whole genome sequencing was performed at multiple NHLBI-funded sequencing centers
 using PCR-free library preparation with 150 bp paired-end reads on Illumina short-read
@@ -39,15 +39,26 @@
 samples; and (4) variant filtering using a Support Vector Machine (SVM) classifier
 (libsvm) trained on positive labels derived from HapMap 3.3 and 1000 Genomes Omni2.5
 array sites, and negative labels derived from Mendelian-inconsistent variants identified
 within the cohort's pedigree structure using <code>vt milk-filter</code>. Sample-level quality
 control included estimation of DNA contamination, genetic ancestry, and biological sex
 using <code>cramore cram-verify-bam</code> (verifyBamID2) and relative X/Y chromosomal depth. Full
 methods for TOPMed freeze 10 are available on the
 <a href="https://topmed.nhlbi.nih.gov/topmed-whole-genome-sequencing-methods-freeze-10"
    target="_blank">TOPMed WGS Methods page</a>.
 </p>
 
 <p>
 We provide documentation that indicates how all source files of the varFreqs track were converted in the <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> of the track.
 For some tracks, python scripts were necessary and are also available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
 </p>
+
+<h2>References</h2>
+<p>
+Taliun D, Harris DN, Kessler MD, Carlson J, Szpiech ZA, Torres R, Taliun SAG, Corvelo A, Gogarten SM,
+Kang HM <em>et al</em>.
+<a href="https://doi.org/10.1038/s41586-021-03205-y" target="_blank">
+Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program</a>.
+<em>Nature</em>. 2021 Feb;590(7845):290-299.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/33568819" target="_blank">33568819</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875770/" target="_blank">PMC7875770</a>
+</p>