9bfd58221b1539193cb7f0a317b4e959c1c7e49a
max
Thu May 21 01:00:45 2026 -0700
varFreqs: AI generated text sounds bad, hard to read, so remove typical AI language. "humanizer" pass on all 31 varFreqs description pages — cut em dashes, copula avoidance ("serves as", "stands as"), "-ing" puffery, and boilerplate filler ("We provide documentation that indicates how..."). Title-case headings and meaningful <b> emphasis preserved. No facts/URLs/counts/versions changed. tpmi.html added as a new file (was previously uncommitted). refs #36642
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
diff --git src/hg/makeDb/trackDb/human/mxbFreq.html src/hg/makeDb/trackDb/human/mxbFreq.html
index 62cc6748103..7d31b1cad62 100644
--- src/hg/makeDb/trackDb/human/mxbFreq.html
+++ src/hg/makeDb/trackDb/human/mxbFreq.html
@@ -1,82 +1,82 @@
<h2>Description</h2>
<p>
The <a href="https://www.mxbiobank.org/" target="_blank">Mexico Biobank (MXB)</a> project
genotyped 6,011 individuals sampled across all 32 states of Mexico during the 2000 National
Health Survey (ENSA 2000) conducted by the National Institute of Public Health (INSP).
-Genotyping was performed with the Illumina Multi-Ethnic Global Array (MEGA, ~1.8M SNPs),
+Genotyping used the Illumina Multi-Ethnic Global Array (MEGA, ~1.8M SNPs), which is
optimized for admixed populations and enriched for ancestry-informative and medically relevant
-variants. Only autosomal, biallelic SNPs passing quality control are included. Samples were
-selected from 898 recruitment sites, with prioritization of indigenous language speakers.
+variants. Only autosomal, biallelic SNPs that passed quality control are included. Samples
+came from 898 recruitment sites, and indigenous language speakers were prioritized.
</p>
<p>
This track shows allele frequencies computed from the phased genotypes. The full
phased genotype data with haplotype clustering display is available in the
<a href="hgTrackUi?g=mexbb">Mexico Biobank track</a> under Phased Variants.
Frequencies can also be plotted onto a map on the
<a href="https://morenolab.shinyapps.io/mexvar/" target="_blank">MexVar platform</a>.
The hg38 data was lifted from hg19 by UCSC (see below).
</p>
<h2>Data Access</h2>
<p>
Due to license restrictions, the data for this track cannot be downloaded from the UCSC
Genome Browser. The Table Browser, Data Integrator, and download server are not available
for this track.
</p>
<p>
Allele frequencies by geographical state and ancestry are available via
the <a href="https://morenolab.shinyapps.io/mexvar/" target="_blank">MexVar platform</a>.
Raw genotype data are available under controlled access at the
EGA (Study: EGAS00001005797; Dataset: EGAD00010002361). For the VCFs, email
andres.moreno@cinvestav.mx to obtain the data.
</p>
<h2>Methods</h2>
<p>
Data processing included GenomeStudio → PLINK conversion, strand alignment, removal
of duplicates and low-quality variants/individuals, update of map positions using
dbSNP Build 151, and relatedness filtering.
At UCSC, the phased VCF was lifted from hg19 to hg38 with CrossMap. Allele counts and
frequencies (AC, AN, AF) were then computed with the bcftools fill-tags plugin, and
genotypes were stripped to produce a sites-only frequency VCF.
</p>
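<p>
As a rough illustration of the frequency step, the sketch below recomputes AC, AN and AF
for one biallelic site from phased genotype strings and writes a genotype-free VCF line.
It is not the UCSC pipeline, which used bcftools. The function names
<code>allele_counts</code> and <code>sites_only_record</code> are hypothetical, missing
alleles are assumed to be coded as ".", and reporting AF as 0.0 when no allele is called
is an arbitrary choice made for this example.
</p>

```python
from typing import List, Tuple


def allele_counts(genotypes: List[str]) -> Tuple[int, int, float]:
    """Return (AC, AN, AF) for one biallelic site.

    AC = number of alternate alleles, AN = number of called alleles,
    AF = AC / AN. Missing alleles (".") are skipped.
    """
    ac = 0
    an = 0
    for gt in genotypes:
        # Accept both phased ("0|1") and unphased ("0/1") separators.
        for allele in gt.replace("/", "|").split("|"):
            if allele == ".":
                continue
            an += 1
            if allele == "1":
                ac += 1
    # Assumption for this sketch: report 0.0 when no allele was called.
    af = ac / an if an else 0.0
    return ac, an, af


def sites_only_record(chrom: str, pos: int, ref: str, alt: str,
                      genotypes: List[str]) -> str:
    """Build a tab-separated VCF data line without FORMAT or sample columns."""
    ac, an, af = allele_counts(genotypes)
    info = f"AC={ac};AN={an};AF={af:.6g}"
    # Columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
    return "\t".join([chrom, str(pos), ".", ref, alt, ".", ".", info])


if __name__ == "__main__":
    # Four made-up samples, one of them with a missing genotype.
    print(sites_only_record("chr1", 12345, "A", "G",
                            ["0|1", "1|1", ".|.", "0|0"]))
```

<p>
Only called alleles contribute to AN, so a sample with a missing genotype lowers the
denominator instead of counting as homozygous reference.
</p>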
<p>
-We provide documentation that indicates how all source files of the varFreqs track were converted in the <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> of the track.
-For some tracks, python scripts were necessary and are also available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
+The <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> documents how the source files of the varFreqs track were converted.
+For some tracks, Python scripts were needed and are also available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
</p>
<h2>Credits</h2>
<p>
We thank the Center for Research and Advanced Studies (Cinvestav) of Mexico for
generating and providing the frequency data, the National Institute of Medical
Sciences and Nutrition (INCMNSZ) for DNA extraction, and the Ministry of Health
together with the National Institute of Public Health (INSP) for the design and
implementation of the National Health Survey 2000 (ENSA 2000). We also thank
the ENSA-Genomics Consortium for their contributions to sample collection and
data processing that made possible the construction of the MXB genomic resource.
</p>
<h2>References</h2>
<p>
Barberena-Jonas C, Medina-Muñoz SG, Cedillo-Castelán V, Sepúlveda-Morales T,
Gonzaga-Jáuregui C, ENSA Genomics Consortium, García-García L, Ioannidis AG,
Moreno-Estrada A.
<a href="https://doi.org/10.1038/s41591-025-04100-z" target="_blank">
Clinical genetic variation across Hispanic populations in the Mexican Biobank</a>.
<em>Nat Med</em>. 2026 Jan 21;.
DOI: <a href="https://doi.org/10.1038/s41591-025-04100-z"
target="_blank">10.1038/s41591-025-04100-z</a>; PMID: <a
href="https://www.ncbi.nlm.nih.gov/pubmed/41566040" target="_blank">41566040</a>
</p>
<p>
Sohail M, Palma-Martínez MJ, Chong AY, Quinto-Corés CD, Barberena-Jonas C, Medina-Muñoz SG,
Ragsdale A, Delgado-Sánchez G, Cruz-Hervert LP, Ferreyra-Reyes L <em>et al</em>.
<a href="https://doi.org/10.1038/s41586-023-06560-0" target="_blank">
Mexican Biobank advances population and medical genomics of diverse ancestries</a>.
<em>Nature</em>. 2023 Oct;622(7984):775-783.
PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/37821706" target="_blank">37821706</a>; PMC: <a
href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10600006/" target="_blank">PMC10600006</a>
</p>