aa61ebc800429515f9ced7e28f669c6042219f43 max Wed Mar 18 09:09:13 2026 -0700

varFreqs supertrack: add GREGoR track, update all HTML docs, move scripts to varFreqs/, refs #36642

Add GREGoR R04 WGS track to varFreqs superTrack. Update Data Access and
Methods sections for all 20+ subtrack HTML files with consistent formatting,
sequencing methods from source papers, and links to makeDoc and Github
scripts. Move all varFreqs conversion scripts into scripts/varFreqs/
subdirectory and update makeDoc paths accordingly.

Co-Authored-By: Claude Opus 4.6

diff --git src/hg/makeDb/trackDb/human/npm.html src/hg/makeDb/trackDb/human/npm.html
new file mode 100644
index 00000000000..276ff733c10
--- /dev/null
+++ src/hg/makeDb/trackDb/human/npm.html
@@ -0,0 +1,63 @@
+

Description

+

+The National Precision Medicine (NPM) program +in Singapore sequenced 9,770 whole genomes, mostly of Chinese, Indian and Malay ancestry. +A minimum allele count cutoff of >5 was applied. CNV data is also available. +
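
The allele count cutoff mentioned above can be illustrated with a minimal sketch. It assumes a standard VCF whose INFO column carries an `AC` field (one count per ALT allele) and keeps a record when at least one ALT allele has AC > 5. The function names `parse_info` and `filter_by_allele_count` are hypothetical and are not taken from the varFreqs conversion scripts; how the cutoff was actually applied to the NPM data is not described here.

```python
def parse_info(info):
    """Parse a VCF INFO column into a dict; flag-style keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def filter_by_allele_count(lines, min_ac_exclusive=5):
    """Yield header lines unchanged, plus records where any ALT allele has
    AC strictly greater than min_ac_exclusive (assumption: AC is in INFO)."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        ac = parse_info(cols[7]).get("AC")
        if ac is None or ac is True:
            # No usable AC value: skip the record in this sketch.
            continue
        counts = [int(x) for x in ac.split(",") if x != "."]
        if any(c > min_ac_exclusive for c in counts):
            yield line
```

A record with `AC=5` is dropped and one with `AC=6` is kept, since the cutoff is strict. For a multi-allelic record such as `AC=2,9`, this sketch keeps the whole line; splitting it per allele would be a separate design choice.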

+ +

Data Access

+

+Due to license restrictions, the data for this track cannot be downloaded from the UCSC +Genome Browser. The Table Browser, Data Integrator, and download server are not available +for this track. +

+

+A VCF download can be requested on the Chorus Browser website, which requires an +account and a data access request. +

+ +

Methods

+

+Whole Genome Sequencing (WGS) data processing followed GATK4 best practices. The GATK4 germline variant +analysis workflow, written in WDL, was adapted to use Nextflow and deployed at the National +Supercomputing Centre, Singapore (NSCC). WGS reads were aligned against GRCh38 using the BWA-MEM +algorithm and used as input to GATK HaplotypeCaller to produce single-sample gVCFs. The gVCF files +were joint-called and then loaded into Hail. Low-quality WGS libraries and low-quality variants were +removed. QC-ed variants were functionally annotated using Ensembl Variant Effect Predictor (VEP) +(version 95). Functional annotations for variants impacting protein-coding regions were also +complemented with information on the potential alteration to their cognate protein's 3D structure +and drug binding ability. +

+

+Our data access request was approved by the NPM data access committee, which can be contacted at contact_npco@a-star.edu.sg. +We downloaded the data from the NPM Chorus browser download section. +The makeDoc file of the track documents how all source files of the varFreqs track were converted. +For some tracks, Python scripts were necessary; these are also available from GitHub. +

+ +

Credits

+

+Thanks to the NPM Data Access Committee and Eleanor for granting our data request. +By browsing the data, you agree to use the data only for academic, non-commercial +research to improve human health (biology/disease). We request that all data users +agree to protect the confidentiality of the data subjects in any research papers or publications +that they may prepare, by taking all reasonable care to limit the possibility +of identification. In particular, data users shall not use, or attempt +to use, the data to deliberately compromise or otherwise infringe the +confidentiality of information on data subjects and their right to privacy. +If you use any of the data obtained from the CHORUS variant browser, we request +that you cite the NPM flagship paper (Wong et al., 2023). All data users +must take note that the data provider and relevant SG10K_Health cohort +owners bear no responsibility for the further analysis or interpretation of the data. +

+ +

References

+

+Wong E, Bertin N, Hebrard M, Tirado-Magallanes R, Bellis C, Lim WK, Chua CY, Tong PML, Chua R, Mak K +et al. + +The Singapore National Precision Medicine Strategy. +Nat Genet. 2023 Feb;55(2):178-186. +PMID: 36658435 +