9bfd58221b1539193cb7f0a317b4e959c1c7e49a max Thu May 21 01:00:45 2026 -0700
varFreqs: AI generated text sounds bad, hard to read, so remove typical AI language.

"humanizer" pass on all 31 varFreqs description pages: cut em dashes, copula avoidance ("serves as", "stands as"), "-ing" puffery, and boilerplate filler ("We provide documentation that indicates how...").
Title-case headings and meaningful emphasis preserved. No facts/URLs/counts/versions changed.
tpmi.html added as a new file (was previously uncommitted).
refs #36642

Co-Authored-By: Claude Sonnet 4.6

diff --git src/hg/makeDb/trackDb/human/indigenomes.html src/hg/makeDb/trackDb/human/indigenomes.html
index 156cfc412b9..9f6738ea8a0 100644
--- src/hg/makeDb/trackDb/human/indigenomes.html
+++ src/hg/makeDb/trackDb/human/indigenomes.html
@@ -1,28 +1,28 @@
Description

IndiGenomes provides whole genome sequencing data of 1,029 healthy Indian individuals under the pilot phase of the "IndiGen" program. The IndiGenomes website also provides SV call and Alu insertion VCFs.

The deployed VCF shown in this track is the public release subset distributed by the IndiGenomes project (18,016,257 records). The full Jain 2021 callset reports 55.8 million variants from the 1,029-genome cohort; the public release is a curated subset of those sites. The deployed VCF is sites-only and carries a per-variant VRT (variant type) INFO field. Per-variant allele counts and allele frequencies are not distributed with the
-public release and therefore are not shown in this track.
+public release and are not shown in this track.

Data Access

The data can be explored interactively with the Table Browser or the Data Integrator. For programmatic access, our REST API can be used; the track name is indigenomes. For bulk download, the VCF file can be obtained from our download server.

The original data can also be downloaded from the IndiGen website.
@@ -38,29 +38,29 @@
mean coverage. Alignment to the GRCh38 reference genome, post-processing, and default quality-filtered variant calling were performed end-to-end on the Illumina DRAGEN v3.4 Bio-IT platform, which uses field-programmable gate array (FPGA) logic for high-throughput processing. The full Jain 2021 callset comprises 55,898,122 single-allelic genetic variants (SNVs and indels), of which 32.23% were unique to the Indian samples and absent from global reference databases. Variants were annotated using ANNOVAR with RefGene, and allele frequencies were cross-referenced against gnomAD v3, 1000 Genomes, ExAC, ESP6500, and the Greater Middle East Variome Project. The IndiGenomes database distributes a public-release subset of these variants (18,016,257 records); that subset is the file used in this track. (Jain, Bhoyar, Scaria, Sivasubbu & the IndiGen Consortium, Nucleic Acids Research 2021).

-We provide documentation that indicates how all source files of the varFreqs track were converted in the makeDoc file of the track.
-For some tracks, python scripts were necessary and are also available from GitHub.
+The makeDoc file of the track documents how all source files of the varFreqs track were converted.
+For some tracks, python scripts were also used and are available from GitHub.

References

Jain A, Bhoyar RC, Pandhare K, Mishra A, Sharma D, Imran M, Senthivel V, Divakar MK, Rophina M, Jolly B et al. IndiGenomes: a comprehensive resource of genetic variants from over 1000 Indian genomes. Nucleic Acids Res. 2021 Jan 8;49(D1):D1225-D1232. PMID: 33095885; PMC: PMC7778947