676b58d841804f049f720cc9ba3fddec216dae61 max Tue Dec 2 06:22:46 2025 -0800
adding saudi arabia to variant frequencies track

diff --git src/hg/makeDb/trackDb/human/varFreqs.html src/hg/makeDb/trackDb/human/varFreqs.html
index 8dc145fd00a..f47b5d89e6b 100644
--- src/hg/makeDb/trackDb/human/varFreqs.html
+++ src/hg/makeDb/trackDb/human/varFreqs.html
@@ -114,30 +114,37 @@
1,896 whole genome sequencing and 3,409 whole exome sequencing data from healthy individuals of Korean ethnicity. Most of the samples were originated from normal tissue of cancer patients (40.16 %), healthy parents of rare disease patients (28.4 %), or healthy volunteers (31.44 %). Japanese ancestry is broken down in the INFO field. Coverage 100x for WES, 30x for WGS. For details see (Lee et al, Exp Mol Med 2022).
Most tracks only show the variant and allele frequencies on mouseover or clicks. When zoomed in, tracks display alleles with base-specific coloring. Homozygote data are shown as one letter, while heterozygotes will be displayed with both letters.
For NCBI ALFA: This track has no single VCF with INFO fields, but uses multiple subtracks instead, one per ancestry.
@@ -224,30 +231,35 @@
NPM Singapore: Whole Genome Sequencing (WGS) data processing followed GATK4 best practices. GATK4 germline variant analysis workflow written in WDL was adapted to use Nextflow and deployed at the National Supercomputing Centre, Singapore (NSCC). In short, WGS reads were aligned against GRCh38 using the BWA-MEM algorithm and used as input to GATK HaplotypeCaller to produce single sample gVCFs. The gVCF files were joint-called then loaded in Hail, an open-source python-based data analysis library suited to work with population-scale with genomic data collections. Low-quality WGS libraries and low-quality variants were removed. QC-ed variants were functionally annotated using Ensembl Variant Effect Predictor (VEP) (version 95). Functional annotations for variant impacting protein-coding were also complemented with information on the potential alteration to their cognate protein's 3D structure and drug binding ability.
+Saudi Genome Program: Data was downloaded
+from Figshare,
+and converted to VCF.
+
+MXB: We thank the Center for Research and Advanced Studies (Cinvestav) of Mexico for generating and providing the frequency data, the National Institute of Medical Sciences and Nutrition (INCMNSZ) for DNA extraction, and the Ministry of Health together with the National Institute of Public Health (INSP) for the design and implementation of the National Health Survey 2000 (ENSA 2000). We also thank the ENSA-Genomics Consortium for their contributions to sample collection and data processing that made possible the construction of the MXB genomic resource.
MCPS: Data produced by Regeneron RGC and collaborators, which are the University of Oxford, Universidad Nacional Autónoma de México (UNAM) and National Institute of Genomic Medicine in Mexico.
@@ -434,15 +446,26 @@
FinnGen provides genetic insights from a well-phenotyped isolated population. Nature. 2023 Jan;613(7944):508-518. PMID: 36653562; PMC: PMC9849126
Wong E, Bertin N, Hebrard M, Tirado-Magallanes R, Bellis C, Lim WK, Chua CY, Tong PML, Chua R, Mak K et al. The Singapore National Precision Medicine Strategy. Nat Genet. 2023 Feb;55(2):178-186. PMID: 36658435
+
+
+
+Malomane DK, Williams MP, Huber CD, Mangul S, Abedalthagafi M, Chiang CWK.
+
+Patterns of population structure and genetic variation within the Saudi Arabian population.
+bioRxiv. 2025 Jan 13;.
+PMID: 39868174; PMC: PMC11761371
+
+