38bafc856320cf5360e0482faeee72b78f2ea963 lrnassar Tue May 5 14:13:30 2026 -0700

QA pass on varFreqs per-subtrack description pages: encode 3 plain emails, add target=_blank to 15 boilerplate REST API links, and add missing References sections (and Data Access on varFreqsAll). refs #36642

Mechanical fixes across 18 per-subtrack description pages:
- Encoded 3 plain author/contact emails: pfeliciano@simonsfoundation.org (sfariSparkExomes), m.hobbs@garvan.org.au (mgrb), contact_npco@a-star.edu.sg (npm).
- Added target="_blank" to 15 occurrences of the boilerplate "REST API" link across allofus, topmed, sfariSparkExomes, tommo60kjpn, alfaVcf, gasp, abraom, indigenomes, hrc, saudi, schema, sgdpFreq, gregor, hgdp1kFreq, colorsDbSnv.

Added missing References sections:
- allofus.html: All of Us Research Program 2024 Nature.
- topmed.html: Taliun 2021 Nature.
- alfaVcf.html: NCBI ALFA documentation citation (no peer-reviewed paper yet).
- gregor.html: GREGoR R04 Methods document + consortium website (no flagship publication yet).
- varFreqsAll.html: pointer to the supertrack's References section, plus tool citations (bcftools csq, Ensembl VEP).

Added missing Data Access section on varFreqsAll.html explaining that the merged callset is not downloadable due to mixed source-data licensing, but can be reconstructed from the per-subtrack VCFs using the conversion scripts on GitHub.

All 25 unique varFreqs description pages now have Description, Methods, Data Access, References. No non-ASCII characters and no inline event handlers across the set.

diff --git src/hg/makeDb/trackDb/human/npm.html src/hg/makeDb/trackDb/human/npm.html
index d404811c7c5..26e9850021b 100644
--- src/hg/makeDb/trackDb/human/npm.html
+++ src/hg/makeDb/trackDb/human/npm.html
@@ -1,63 +1,63 @@
The National Precision Medicine (NPM) program in Singapore sequenced 9,770 whole genomes, mostly of Chinese, Indian and Malay ancestry. A minimum allele count cutoff of >5 was applied. CNV data is also available.
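The allele count cutoff mentioned in the description can be illustrated with a minimal sketch. It assumes that ">5" means strictly greater than five alternate-allele observations; the page does not say how the NPM pipeline applied the filter, and the `Variant` record and `filter_by_allele_count` function are hypothetical names used only for illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not the NPM data model)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    ac: int  # allele count: number of observed copies of the alt allele


# Assumption: the cutoff ">5" is exclusive, so AC == 5 is dropped.
MIN_AC_EXCLUSIVE = 5


def filter_by_allele_count(
    variants: Iterable[Variant], threshold: int = MIN_AC_EXCLUSIVE
) -> Iterator[Variant]:
    """Yield only the variants whose allele count exceeds the threshold."""
    for variant in variants:
        if variant.ac > threshold:
            yield variant


example = [
    Variant("chr1", 10000, "A", "G", ac=5),
    Variant("chr1", 20000, "C", "T", ac=6),
]
kept = list(filter_by_allele_count(example))
```

Under this reading, a variant seen exactly five times is excluded and one seen six times is retained.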
Due to license restrictions, the data for this track cannot be downloaded from the UCSC Genome Browser. The Table Browser, Data Integrator, and download server are not available for this track.
VCF download can be requested on the Chorus Browser website, which requires an account and data access request.
Whole Genome Sequencing (WGS) data processing followed GATK4 best practices. GATK4 germline variant analysis workflow written in WDL was adapted to use Nextflow and deployed at the National Supercomputing Centre, Singapore (NSCC). WGS reads were aligned against GRCh38 using the BWA-MEM algorithm and used as input to GATK HaplotypeCaller to produce single sample gVCFs. The gVCF files were joint-called then loaded in Hail. Low-quality WGS libraries and low-quality variants were removed. QC-ed variants were functionally annotated using Ensembl Variant Effect Predictor (VEP) (version 95). Functional annotations for variants impacting protein-coding regions were also complemented with information on the potential alteration to their cognate protein's 3D structure and drug binding ability.
-Our data access request was approved by the NPM data access committee. It can be contacted at contact_npco@a-star.edu.sg.
+Our data access request was approved by the NPM data access committee. It can be contacted at contact_npco@a-star.edu.sg.
We downloaded the data from the NPM Chorus browser download section. We provide documentation that indicates how all source files of the varFreqs track were converted in the makeDoc file of the track. For some tracks, python scripts were necessary and are also available from GitHub.
Thanks to the NPM Data Access Committee and Eleanor for granting our data request. By browsing the data, you agree to use the data only for academic, non-commercial research to improve human health (biology/disease). We request all data users agree to protect the confidentiality of the data subjects in any research papers or publications that they may prepare, by taking all reasonable care to limit the possibility of identification. In particular, the data users shall not use, or attempt to use, the data to deliberately compromise or otherwise infringe the confidentiality of information on data subjects and their right to privacy. If you use any of the data obtained from the CHORUS variant browser, we request that you cite the NPM flagship paper (Wong et al, 2023). All data users of the data must take note that the data provider and relevant SG10K_Health cohort owners bear no responsibility for the further analysis or interpretation of the data.
Wong E, Bertin N, Hebrard M, Tirado-Magallanes R, Bellis C, Lim WK, Chua CY, Tong PML, Chua R, Mak K et al. The Singapore National Precision Medicine Strategy. Nat Genet. 2023 Feb;55(2):178-186. PMID: 36658435
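The two mechanical fixes named in the commit message (encoding plain e-mail addresses and adding target="_blank" to links) could be sketched roughly as follows. This is an illustration under assumptions, not the script used for the commit: the commit does not state which encoding scheme was applied, so decimal HTML character references are assumed here, and the helper names `encode_email` and `add_target_blank` are hypothetical. The sketch also adds the attribute to every anchor that lacks one, whereas the commit touched only the boilerplate "REST API" link.

```python
import re


def encode_email(address: str) -> str:
    """Return the address written as decimal HTML character references.

    Assumption: numeric character references are an acceptable encoding;
    the commit message does not say which scheme was actually used.
    """
    return "".join(f"&#{ord(ch)};" for ch in address)


# Matches an opening <a ...> tag and captures its attribute text.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)


def add_target_blank(html: str) -> str:
    """Add target="_blank" to every <a> tag that has no target attribute."""

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        if re.search(r"\btarget\s*=", attrs, re.IGNORECASE):
            return match.group(0)  # leave tags that already set a target
        return f'<a{attrs} target="_blank">'

    return A_TAG.sub(fix, html)


encoded = encode_email("a@b")
linked = add_target_blank('<a href="x">REST API</a>')
```

A browser renders the encoded address as ordinary text, while the raw page source no longer contains the plain address.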