9bfd58221b1539193cb7f0a317b4e959c1c7e49a
max
Thu May 21 01:00:45 2026 -0700
varFreqs: AI generated text sounds bad, hard to read, so remove typical AI language. "humanizer" pass on all 31 varFreqs description pages — cut em dashes, copula avoidance ("serves as", "stands as"), "-ing" puffery, and boilerplate filler ("We provide documentation that indicates how..."). Title-case headings and meaningful <b> emphasis preserved. No facts/URLs/counts/versions changed. tpmi.html added as a new file (was previously uncommitted). refs #36642
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
diff --git src/hg/makeDb/trackDb/human/swefreq.html src/hg/makeDb/trackDb/human/swefreq.html
index 1fdb2686bd9..b0129ffbc58 100644
--- src/hg/makeDb/trackDb/human/swefreq.html
+++ src/hg/makeDb/trackDb/human/swefreq.html
@@ -1,74 +1,74 @@
<h2>Description</h2>
<p>
<a href="https://swefreq.nbis.se/dataset/SweGen" target="_blank">SweGen</a> provides
whole-genome sequencing variant frequencies for 1,000 Swedish individuals.
The 1,000 individuals represent a cross-section of the Swedish population and no disease
information was used for the selection. The frequency data may therefore include genetic variants
that are associated with, or causative of, disease. SweGen also provides SV calls, TEs, MELT
results for TEs, HLAs and a FASTA file with new sequence not in hg38. There is
also a version for the T2T CHM13 assembly. The full dataset can be browsed at
the
<a href="https://swefreq.nbis.se/dataset/SweGen/browser" target="_blank">SweGen Browser</a>.
</p>
<p>
The mobile element insertions called by MELT on the same 1,000 SweGen
samples are loaded as a separate track,
<a href="hgTrackUi?g=meiSwegen">SweGen 1000 MEIs</a>, in the
<a href="hgTrackUi?g=mei">Mobile Element Insertions</a> collection.
</p>
<h2>Data Access</h2>
<p>
Due to license restrictions, the data for this track cannot be downloaded from the UCSC
Genome Browser. The Table Browser, Data Integrator, and download server are not available
for this track.
</p>
<p>
VCF files can be requested at
<a href="https://swefreq.nbis.se/dataset/SweGen" target="_blank">SweGen</a> via a form. The request
-needs manual approval, which usually is quick. If there is no reply, email SweGen directly.
+needs manual approval, which is usually quick. If there is no reply, email SweGen directly.
</p>
<h2>Methods</h2>
<p>
DNA was fragmented to 350bp on a Covaris E220. Paired-end sequencing with 150bp read length was performed
on Illumina HiSeq X (HiSeq Control Software 3.3.39/RTA 2.7.1) with v2.5 sequencing chemistry.
Raw whole-genome reads were aligned to the GRCh37 reference using BWA-MEM v0.7.12, then sorted and
indexed with samtools v0.1.19 and assessed with qualimap v2.2.20; per-sample alignments from
multiple lanes and flow cells were merged using Picard MergeSamFiles v1.120. Processing followed
GATK best practices with GATK v3.3, including indel realignment (RealignerTargetCreator,
IndelRealigner), duplicate marking (Picard MarkDuplicates v1.120), and base quality score
recalibration (BaseRecalibrator), producing one finalized BAM per sample. Per-sample gVCFs were
generated with GATK HaplotypeCaller v3.3 using reference files from the GATK v2.8 resource bundle,
with all steps coordinated via Piper v1.4.0. Joint genotyping of 1,000 samples was performed by
merging gVCFs in five batches of 200 using GATK CombineGVCFs, followed by cohort genotyping with
GATK GenotypeGVCFs and variant quality score recalibration for SNVs and indels using
VariantRecalibrator and ApplyRecalibration.
</p>
<p>
At UCSC, the hg38 VCF was downloaded from
<a href="https://swefreq.nbis.se/dataset/SweGen/download" target="_blank">SweFreq</a> and loaded as-is.
The file that we use is swegen_frequencies_fixploidy_GRCh38_20190204.vcf.gz.
-We provide documentation that indicates how all source files of the varFreqs track were converted in the <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a> of the track.
-For some tracks, python scripts were necessary and are also available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
+The conversion steps for all source files of the varFreqs track are documented in the track's <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/varFreqs.txt" target="_blank">makeDoc file</a>.
+For some tracks, Python scripts were needed; these are also available from <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/scripts/varFreqs" target="_blank">GitHub</a>.
</p>
<h2>Credits</h2>
<p>
The SweGen allele frequency data was generated by Science for Life Laboratory.
Any redistributed data derived from the SweGen data set must follow the SweGen terms and conditions.
The data may not be used to attempt to identify any individual in this or other studies.
Thanks to the SweGen patients and SciLifeLab for making the data available.
</p>
<h2>References</h2>
<p>
Ameur A, Dahlberg J, Olason P, Vezzi F, Karlsson R, Martin M, Viklund J, Kähäri AK,
Lundin P, Che H <em>et al</em>.
<a href="https://doi.org/10.1038/ejhg.2017.130" target="_blank">
SweGen: a whole-genome data resource of genetic variability in a cross-section of the Swedish
population</a>.
<em>Eur J Hum Genet</em>. 2017 Nov;25(11):1253-1260.
PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/28832569" target="_blank">28832569</a>; PMC: <a
href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5765326/" target="_blank">PMC5765326</a>
</p>