aa61ebc800429515f9ced7e28f669c6042219f43 max Wed Mar 18 09:09:13 2026 -0700 varFreqs supertrack: add GREGoR track, update all HTML docs, move scripts to varFreqs/, refs #36642 Add GREGoR R04 WGS track to varFreqs superTrack. Update Data Access and Methods sections for all 20+ subtrack HTML files with consistent formatting, sequencing methods from source papers, and links to makeDoc and Github scripts. Move all varFreqs conversion scripts into scripts/varFreqs/ subdirectory and update makeDoc paths accordingly. Co-Authored-By: Claude Opus 4.6 diff --git src/hg/makeDb/trackDb/human/mxbFreq.html src/hg/makeDb/trackDb/human/mxbFreq.html new file mode 100644 index 00000000000..26d93951cb2 --- /dev/null +++ src/hg/makeDb/trackDb/human/mxbFreq.html @@ -0,0 +1,78 @@ +

Description

+

+The Mexico Biobank (MXB) project +genotyped 6,011 individuals sampled across all 32 states of Mexico during the 2000 National +Health Survey (ENSA 2000) conducted by the National Institute of Public Health (INSP). +Genotyping was performed with the Illumina Multi-Ethnic Global Array (MEGA, ~1.8M SNPs), +optimized for admixed populations and enriched for ancestry-informative and medically relevant +variants. Only autosomal, biallelic SNPs passing quality control are included. Samples were +selected from 898 recruitment sites, with prioritization of indigenous language speakers. +

+ +

+This track shows allele frequencies computed from the phased genotypes. The full +phased genotype data with haplotype clustering display is available in the +Mexico Biobank track under Phased Variants. +Frequencies can also be plotted onto a map on the +MexVar platform. +The hg38 data was lifted from hg19 by UCSC (see below). +

+ +

Data Access

+

+We are not allowed to redistribute the VCF file. +Allele frequencies by geographical state and ancestry are available via +the MexVar platform. +Raw genotype data are available under controlled access at the +EGA (Study: EGAS00001005797; Dataset: EGAD00010002361). To obtain the VCFs, email +andres.moreno@cinvestav.mx. +

+ +

Methods

+

+Data processing included GenomeStudio → PLINK conversion, strand alignment, removal +of duplicates, update of map positions using dbSNP Build 151, removal of low-quality +variants/individuals, and relatedness filtering. +At UCSC, the phased VCF was lifted from hg19 to hg38 with CrossMap, then allele counts, +frequencies, and allele numbers (AC, AF, AN) were computed using bcftools fill-tags and genotypes were stripped to produce +a sites-only frequency VCF. +
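The tag computation and genotype stripping described above can be illustrated with a minimal Python sketch. It is not the UCSC conversion script and does not call bcftools; the function name `fill_tags_sites_only` is hypothetical, the input is assumed to be a single tab-separated, biallelic VCF record with a GT field first in each sample column, and writing `.` for AF when no alleles are called is a choice made for this sketch only.

```python
def fill_tags_sites_only(vcf_line: str) -> str:
    """Compute AC/AN/AF for one biallelic VCF record and drop genotypes.

    Hypothetical helper for illustration; assumes GT is the first
    FORMAT key and that the only ALT allele is coded as "1".
    """
    fields = vcf_line.rstrip("\n").split("\t")
    fixed, samples = fields[:8], fields[9:]  # fields[8] is FORMAT

    an = 0  # number of called alleles
    ac = 0  # number of ALT alleles
    for sample in samples:
        gt = sample.split(":")[0]
        # Treat phased ("|") and unphased ("/") separators alike.
        for allele in gt.replace("/", "|").split("|"):
            if allele == ".":
                continue  # missing calls do not count towards AN
            an += 1
            if allele == "1":
                ac += 1

    # Assumption of this sketch: AF is "." when nothing was called.
    af = f"{ac / an:.6g}" if an else "."
    tags = f"AC={ac};AN={an};AF={af}"

    info = fixed[7]
    fixed[7] = tags if info in (".", "") else f"{info};{tags}"
    # Returning only the first eight columns yields a sites-only record.
    return "\t".join(fixed)


record = "chr1\t1000\trs1\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\t0|0\t.|."
print(fill_tags_sites_only(record))
```

For the example record, six alleles are called and three of them are ALT, so the INFO column becomes `AC=3;AN=6;AF=0.5` and the FORMAT and sample columns are dropped.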

+ +

+The makeDoc file of the track documents how all source files of the varFreqs track were converted. +For some tracks, Python scripts were necessary; these are also available from GitHub. +

+ +

Credits

+

+We thank the Center for Research and Advanced Studies (Cinvestav) of Mexico for +generating and providing the frequency data, the National Institute of Medical +Sciences and Nutrition (INCMNSZ) for DNA extraction, and the Ministry of Health +together with the National Institute of Public Health (INSP) for the design and +implementation of the National Health Survey 2000 (ENSA 2000). We also thank +the ENSA-Genomics Consortium for their contributions to sample collection and +data processing that made possible the construction of the MXB genomic resource. +

+ +

References

+

+Barberena-Jonas C, Medina-Muñoz SG, Cedillo-Castelán V, Sepúlveda-Morales T, +Gonzaga-Jáuregui C, ENSA Genomics Consortium, García-García L, Ioannidis AG, +Moreno-Estrada A. + +Clinical genetic variation across Hispanic populations in the Mexican Biobank. +Nat Med. 2026 Jan 21;. +DOI: 10.1038/s41591-025-04100-z; PMID: 41566040 +

+ +

+Sohail M, Palma-Martínez MJ, Chong AY, Quinto-Corés CD, Barberena-Jonas C, Medina-Muñoz SG, +Ragsdale A, Delgado-Sánchez G, Cruz-Hervert LP, Ferreyra-Reyes L et al. + +Mexican Biobank advances population and medical genomics of diverse ancestries. +Nature. 2023 Oct;622(7984):775-783. +PMID: 37821706; PMC: PMC10600006 +