aa61ebc800429515f9ced7e28f669c6042219f43 max Wed Mar 18 09:09:13 2026 -0700
varFreqs supertrack: add GREGoR track, update all HTML docs, move scripts to varFreqs/, refs #36642

Add GREGoR R04 WGS track to varFreqs superTrack. Update Data Access and Methods sections for all 20+ subtrack HTML files with consistent formatting, sequencing methods from source papers, and links to makeDoc and Github scripts. Move all varFreqs conversion scripts into scripts/varFreqs/ subdirectory and update makeDoc paths accordingly.

Co-Authored-By: Claude Opus 4.6

diff --git src/hg/makeDb/trackDb/human/saudi.html src/hg/makeDb/trackDb/human/saudi.html
new file mode 100644
index 00000000000..b35cd9993ba
--- /dev/null
+++ src/hg/makeDb/trackDb/human/saudi.html
@@ -0,0 +1,51 @@
+

Description

+

+Variant frequencies from 302 whole genomes at 30x coverage from the +Saudi Genome Program. The genotyping and imputation data from 3,352 +individuals do not appear to be publicly available. +
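As a rough illustration of what a frequency in a 302-genome cohort can look like, the sketch below assumes diploid autosomal sites and a frequency defined as alternate allele count divided by the number of called alleles. The function name and the constants are hypothetical and are not taken from the track's conversion scripts.

```python
# Hypothetical sketch: allele frequency arithmetic for a cohort of 302 genomes.
# Assumes diploid autosomal sites; names here are illustrative only.

SAMPLES = 302
MAX_ALLELES = 2 * SAMPLES  # at most 604 called alleles per autosomal site


def allele_frequency(alt_count, called_alleles):
    """Return alt_count / called_alleles after basic sanity checks."""
    if called_alleles <= 0:
        raise ValueError("called_alleles must be positive")
    if not 0 <= alt_count <= called_alleles:
        raise ValueError("alt_count must lie between 0 and called_alleles")
    return alt_count / called_alleles


if __name__ == "__main__":
    # A variant seen once among all 604 alleles (a singleton).
    print(allele_frequency(1, MAX_ALLELES))
```

Under these assumptions, the smallest non-zero frequency is 1/604, so frequencies below roughly 0.0017 would indicate either missing genotypes at that site or a different definition of the denominator.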

+ +

Data Access

+

+The data can be explored interactively with the +Table Browser or the +Data Integrator. +For programmatic access, our REST API can be used; the +track name is saudi. +For bulk download, the VCF file can be obtained from +our download server. +
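A minimal sketch of the programmatic access mentioned above, assuming the public UCSC REST API at api.genome.ucsc.edu and its /getData/track endpoint. The assembly name hg38 is an assumption based on the GRCh38 alignment described in Methods, the region is arbitrary, and the helper names are hypothetical. The structure of the returned JSON is not assumed here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu"


def build_track_url(track, chrom, start, end, genome="hg38"):
    """Build a /getData/track URL for one region.

    genome="hg38" is an assumption; parameters are joined with semicolons.
    """
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params.items())
    return f"{API_BASE}/getData/track?{query}"


def fetch_track(track, chrom, start, end, genome="hg38"):
    """Fetch the region and return the decoded JSON response (needs network)."""
    url = build_track_url(track, chrom, start, end, genome)
    with urlopen(url, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call fetch_track(...) to perform the request.
    print(build_track_url("saudi", "chr1", 100000, 101000))
```

Keeping URL construction separate from the network call makes the request easy to inspect or paste into a browser before running it in bulk.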

+

+The original data were downloaded from +Figshare and converted to VCF. +

+ +

Methods

+

+Whole-genome sequencing of 302 Saudi Arabian individuals was performed on the Illumina HiSeq +X Ten platform using TruSeq Nano DNA library preparation at 30x target coverage. Sequencing and +initial bioinformatics processing were carried out by deCODE Genetics (Reykjavík, Iceland). +Reads were aligned to the GRCh38 reference genome using BWA 0.7.10. Per-sample variant calling +was performed with GATK HaplotypeCaller, followed by joint genotyping using CombineGVCFs and +GenotypeGVCFs. Variant quality score recalibration (VQSR) was applied for both SNPs and indels. +The final autosomal callset contains 25.5 million variants across the 302 individuals. +

+

+The variant data were downloaded from +Figshare and converted to VCF format using a custom script. +The makeDoc file of the track documents how all source files of the varFreqs track were converted. +For some tracks, Python scripts were necessary; these are also available from GitHub. +

+ +

References

+

+Malomane DK, Williams MP, Huber CD, Mangul S, Abedalthagafi M, Chiang CWK. + +Patterns of population structure and genetic variation within the Saudi Arabian population. +bioRxiv. 2025 Jan 13. +PMID: 39868174; PMC: PMC11761371 +