src/hg/makeDb/trackDb/human/varFreqs.html 85a3ec13e80a0e61f16e691afb878956e0483892

85a3ec13e80a0e61f16e691afb878956e0483892
max
  Fri Nov 28 08:53:18 2025 -0800
adding Finnland to var freqs track, refs #36642

diff --git src/hg/makeDb/trackDb/human/varFreqs.html src/hg/makeDb/trackDb/human/varFreqs.html
index 053f7ef112c..8dc145fd00a 100644
--- src/hg/makeDb/trackDb/human/varFreqs.html
+++ src/hg/makeDb/trackDb/human/varFreqs.html
@@ -1,428 +1,448 @@
 <h2>Description</h2>
 <p>
 This container shows results from projects
 where the variant frequencies, aka allele frequencies, are publicly available. The tracks were collected from the 
 projects listed below. Projects that provide haplotype-phased genotypes/variants can be found
 elsewhere: 1000 Genomes is a separate track, and the projects HGDP, SGDP,
 HGDP+1000 Genomes and Mexico Biobank can be found in the "Phased Variants" track.
 </p>
 <p>If you want us to add other projects, please contact us. We asked and were
 unable to obtain variant frequencies from the following projects: UK Biobank (request pending), All of us (granted),
 SFARI SPARK (in process).
 </p>
 
 <p>
 The following projects were added:
 <ul>
     <li>
         <b><a href="https://rgc-mcps.regeneron.com/home"
         target="_blank">Mexico City Prospective Study (MCPS)</a></b>:
         9,950 whole genome sequenced individuals and 141,046 exome sequenced and genotyped
         individuals from the Mexico City Prospective Study (MCPS), a collaboration between the
         Regeneron Genetics Center, University of Oxford, Universidad Nacional Aut&oacute;noma de
         M&eacute;xico (UNAM), National Institute of Genomic Medicine in Mexico, Abbvie Inc. and
         AstraZeneca UK. For details see (Ziyatdinov A, Nature 2023), the reference section.
     </li>
 
     <li>
         <b><a href="https://rgc-research.regeneron.com/me/home"
         target="_blank">Regeneron Million Exomes Project (ME)</a></b>:
         Whole-exomes of 983,578 individuals sequenced by the Regeneron Genetics Center (RGC).
         These data span dozens of collaborations including large biobanks and
         health systems. All data were generated by the RGC on a single, harmonized
         sequencing and informatics protocol. The dataset includes individuals across
         diverse ancestral populations, encompassing outbred and founder populations and
         cohorts with high rates of consanguinity. See (Sun et al, Nature 2024) for details.
     </li>
 
     <li>
         <b><a href="https://topmed.nhlbi.nih.gov/" target="_blank">NHLBI TOPMED Freeze 10</a></b>:
         NHLBI TOPMed (Trans-Omics for Precision
         Medicine) program, launched by the U.S. National Heart, Lung, and Blood
         Institute, integrates whole-genome sequencing with molecular, clinical,
         and environmental data from large, well-phenotyped cohorts. Its goal is to
         uncover the biological mechanisms underlying heart, lung, blood, and sleep
         disorders to advance precision medicine and improve population health. Freeze
         10 contains 868,581,653 variants from 150,899 whole genomes. VCFs were
         downloaded from <a href="https://bravo.sph.umich.edu/terms.html"
         target="_blank">BRAVO</a>.
     </li>
 
     <li>
         <b><a href="https://www.genomeasia100k.org/"
         target="_blank">GenomeAsia Pilot (GAsP)</a></b>:
         Whole-genome sequencing data of 1,739 individuals from 219 population groups across Asia.
         See (GenomeAsia Consortium, Nature 2019) for details.
     </li>
 
     <li>
         <b><a href="https://www.ncbi.nlm.nih.gov/snp/docs/gsr/alfa/" target="_blank">ALFA</a></b>:
         The NCBI ALlele Frequency Aggregator pipeline computes allele frequencies from
         approved, unrestricted dbGaP studies and makes them publicly available through
         dbSNP. Its goal is to release frequency data from over one million dbGaP
         subjects to aid discoveries involving common and rare variants with biological
         or disease relevance. The R4 release includes 408,709 subjects and allele
         frequencies for 15.5 million rs sites, including nearly one million ClinVar
         variants. Genotype and associated individual-level data are accessible through dbGaP
         <a href="https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=login"
         target="_blank">authorized access</a>.
     </li>
 
     <li>
-        <b><a href="https://jmorp.megabank.tohoku.ac.jp/downloads"
-        target="_blank">JPN To61k Japan Tohoku University Tohoku Medical Megabank Organization
-        61k Allele frequency panel (JPN 61k)</a></b>:
+        <b><a href="https://www.finngen.fi/en" target="_blank">FinnGen</a></b>:
+        Imputed variants from 500,348 biobank samples obtained using genotyping arrays
+        in Finland, about 10% of the population. The imputation used phased variants obtained from 8,554
+        high-quality whole genome sequences, also from Finland. For details, see (Kurki et al, Nature 2023).
+        Phenotype associations can be browsed at <a href="https://r12.finngen.fi/" target="_blank">FinnGen PheWeb</a>.
+    </li>
+    <li>
+        <b><a href="https://jmorp.megabank.tohoku.ac.jp/downloads" target="_blank">JPN To61k Japan Tohoku University Tohoku Medical Megabank Organization 61k Allele frequency panel (JPN 61k)</a></b>:
         An allele frequency panel based on short-read WGS analysis of 61,000 Japanese individuals.
         The project includes other datatypes, such as STRs, long-read SVs and short-read CNVs.
         Data can be downloaded from the <a href="https://jmorp.megabank.tohoku.ac.jp"
         target="_blank">jMorp Website</a>, specifically the
         <a href="https://jmorp.megabank.tohoku.ac.jp/downloads" target="_blank">Downloads</a>
         section. For details, see (Tadaka et al, NAR 2023).
     </li>
 
     <li>
         <b><a href="https://abraom.ib.usp.br/"
         target="_blank">Brazil Arquivo Brasileiro Online de Muta&ccedil;&otilde;es (ABraOM)</a></b>:
         Genomic variants obtained with whole-genome sequencing from SABE, a
         census-based sample of elderly individuals from S&atilde;o Paulo, Brazil's
         largest city. Brazilian population is constituted by ~500 years of
         admixture between Africans, Europeans, and Native Americans.
         Additionally, the cohort presents ~3% of individuals with non-admixed
         Japanese ancestry (early 20th century migration). Coverage 38.6.  Data
         can be downloaded from the <a href="https://abraom.ib.usp.br/download/"
         target="_blank">AbraOM Website</a>. For details see (Naslavsky et al, Nat Comm 2022).
     </li>
 
     <li>
         <b><a href="https://clingen.igib.res.in/indigen/" target="_blank">IndiGenomes</a></b>:
         Whole genome sequencing of 1,029 healthy Indian individuals under the pilot phase of the
         &quot;IndiGen&quot; program.
         Data can be downloaded from the <a href="https://clingen.igib.res.in/indigen/"
         target="_blank">IndiGen Website</a>. For details see (Jain et al, NAR 2020). Only
         the allele frequency is available from this project. The website also provides SV call
         and Alu insertion VCFs.
     </li>
 
     <li>
         <b><a href="https://www.kobic.re.kr/kova/"
         target="_blank">Korean Variant Archive (KOVA)</a></b>:
         1,896 whole genome sequencing and 3,409 whole exome sequencing data from healthy individuals
         of Korean ethnicity.
         Most of the samples originated from normal tissue of cancer
         patients (40.16 %), healthy parents of rare disease patients (28.4 %),
         or healthy volunteers (31.44 %). Japanese ancestry is broken down
-        in the INFO field.
-        TSV data can be requested on the <a href="https://www.kobic.re.kr/kova/downloads"
-        target="_blank">KOVA Downloads</a> website. Coverage 100x for WES, 30x for WGS.
-        For details see (Lee et al, Exp Mol Med 2022).
-    </li>
+        in the INFO field. Coverage 100x for WES, 30x for WGS.
+        For details see (Lee et al, Exp Mol Med 2022).</li>
     <li>
-        <b><a href=""
+        <b><a href="https://www.npm.sg/"
         target="_blank">NPM Singapore</a></b>:
         9,770 whole genomes, mostly of Chinese, Indian and Malay ancestry. 
-        VCF access can be requested on the <a href="https://chorus.grids-platform.io/"
-        target="_blank">Chorus Browser</a> website, which requires an account and access request. 
-        For details see (Wong et al, Nat Genetics 2023).
+        A minimum allele count cutoff of &gt; 5 was applied.
+        Data is available for download from the CHORUS browser, see "Data access" below.
+        For details see (Wong et al, Nat Genetics 2023). CNV data is also available there.
     </li>
 </ul>
 </p>
 
 <h2>Display Conventions</h2>
 
 <p>Most tracks only show the variant and allele frequencies on mouseover or clicks.
 When zoomed in, tracks display alleles with base-specific coloring. Homozygote
 data are shown as one letter, while heterozygotes will be displayed with both
 letters.
 </p>
 
 <p>
 For <b>NCBI ALFA:</b> This track has no single VCF with INFO fields, but uses multiple subtracks
 instead, one per ancestry.
 </p>
 
 
 <h2>Data Access</h2>
 <p>Most of the data in these tracks are not available for download from UCSC.
 Data can be browsed on our website.
 But the data can be downloaded
 for free from the original projects. Accessing the 
 data usually requires a click-through license on the respective websites; links are either
 provided above in the project description or with more details here:
 </p>
 
 <p>
 <b>MXB:</b> Allele frequencies by geographical state and ancestry are available via
 the <a target="_blank" href="https://morenolab.shinyapps.io/mexvar/">MexVar platform</a>.
 Raw genotype data are available under controlled access at the
 EGA (Study: EGAS00001005797; Dataset: EGAD00010002361). For the VCFs, email
 andres.moreno@cinvestav.mx.
 </p>
 <p>
 <b>MCPS:</b> VCFs with summarized allele frequencies are available from
 the <a target="_blank" href="https://rgc-mcps.regeneron.com/">MCPS website</a>.
 </p>
 <p>
 <b>Regeneron one million exomes:</b> VCFs with summarized allele frequencies are available from
 the <a target="_blank" href="https://rgc-research.regeneron.com/me/resources">RGC ME website</a>.
 </p>
 <p>
 <b>TOPMED:</b> VCFs with summarized allele frequencies are available from
 the <a target="_blank" href="https://bravo.sph.umich.edu/">TOPMED BRAVO website</a>. They require a
 login.
 </p>
 <p>
 <b>GenomeAsia Pilot:</b> VCFs are available from UCSC and also from
 the <a target="_blank"
 href="https://browser.genomeasia100k.org/#tid=download">GenomeAsia 100K website</a>.
 No license nor login.
 </p>
 
+<p><b>KOVA:</b> 
+        TSV data can be requested on the <a href="https://www.kobic.re.kr/kova/downloads"
+        target="_blank">KOVA Downloads</a> website. 
+</p>
+
+<p><b>FinnGen:</b> TSV data can be requested via the form at <a href="https://finngen.gitbook.io/documentation/data-download" target="_blank">https://finngen.gitbook.io/documentation/data-download</a>, which triggers an email with the download link.</p>
+
+<p><b>NPM:</b> 
+        VCF access can be requested on the 
+        <a href="https://chorus.grids-platform.io/" target="_blank">Chorus Browser</a> website, which requires an 
+        <a href="https://npm.a-star.edu.sg/" target="_blank">account and data access request</a>. 
+</p>
+
 <h2>Methods</h2>
 <p>
 <b>MXB:</b> Genotyping was performed with the Illumina Multi-Ethnic Global Array
 (MEGA, ~1.8M SNPs), optimized for admixed populations and enriched for
 ancestry-informative and medically relevant variants. Only autosomal, biallelic
 SNPs passing quality control are included. Samples were selected from 898
 recruitment sites, with prioritization of indigenous language speakers. Data
 processing included GenomeStudio &rarr; PLINK conversion, strand alignment, removal
 of duplicates, update of map positions using dbSNP Build 151 and low-quality
 variants/individuals, and relatedness filtering.
 </p>
 <p>
 <b>SGDP:</b> The version used was
 <a target="_blank" href="https://sharehost.hms.harvard.edu/genetics/reich_lab/sgdp/vcf_variants/"
 >https://sharehost.hms.harvard.edu/genetics/reich_lab/sgdp/vcf_variants/</a>,
 merged with bcftools and lifted to hg38 with CrossMap. 
 </p>
 <p>
 <b>KOVA:</b> V7 of the TSV.gz was obtained from the KOVA staff and converted to VCF. It is not
 available for download from our site but can be requested from the KOVA website.
 </p>
 
-<p><b>Finngen:</b> R12 was downloaded from https://finngen.gitbook.io/documentation/data-download and converted to VCF with a Python script. </p>
+<p><b>FinnGen:</b> R12 annotated variants were downloaded from the Google Cloud
+bucket link received through an email after filling out the form linked from
+https://finngen.gitbook.io/documentation/data-download and converted to VCF
+with a <a
+href="https://github.com/ucscGenomeBrowser/kent/tree/master/src/hg/makeDb/scripts/finngen_to_vcf.py"
+target="_blank">custom Python script</a>. </p>
 
 <p><b>NPM Singapore:</b> Whole Genome Sequencing (WGS) data processing followed
 GATK4 best practices. GATK4 germline variant analysis workflow written in WDL
 was adapted to use Nextflow and deployed at the National Supercomputing Centre,
 Singapore (NSCC). In short, WGS reads were aligned against GRCh38 using the
 BWA-MEM algorithm and used as input to GATK HaplotypeCaller to produce single
 sample gVCFs. The gVCF files were joint-called then loaded in Hail, an
 open-source Python-based data analysis library suited to working with
 population-scale genomic data collections. Low-quality WGS libraries and
 low-quality variants were removed.  QC-ed variants were functionally annotated
 using Ensembl Variant Effect Predictor (VEP) (version 95). Functional
 annotations for variants impacting protein-coding regions were also complemented with
 information on the potential alteration to their cognate protein's 3D structure
 and drug binding ability.
 </p>
 
 <h2>Credits</h2>
 <p>
 <b>MXB:</b> We thank the Center for Research and Advanced Studies (Cinvestav) of Mexico for
 generating and providing the frequency data, the National Institute of Medical
 Sciences and Nutrition (INCMNSZ) for DNA extraction, and the Ministry of Health
 together with the National Institute of Public Health (INSP) for the design and
 implementation of the National Health Survey 2000 (ENSA 2000). We also thank
 the ENSA-Genomics Consortium for their contributions to sample collection and
 data processing that made possible the construction of the MXB genomic
 resource.
 </p>
 <p>
 <b>MCPS:</b> Data produced by Regeneron RGC and collaborators, which are the
 University of Oxford, Universidad Nacional Aut&oacute;noma de M&eacute;xico (UNAM) and
 National Institute of Genomic Medicine in Mexico.
 The Regeneron Genetics Center, University of Oxford, Universidad Nacional
 Aut&oacute;noma de M&eacute;xico (UNAM), National Institute of Genomic Medicine in Mexico,
 Abbvie Inc. and AstraZeneca UK Limited (collectively, the &quot;Collaborators&quot;) bear
 no responsibility for the analyses or interpretations of the data presented
 here. Any opinions, insights, or conclusions presented herein are those of the
 authors and not of the Collaborators. </p>
 </p>
 <p>
 <b>Regeneron Million Exomes:</b> The Regeneron Genetics Center, and its collaborators
 (collectively, the &quot;Collaborators&quot;) bear no responsibility for the analyses or
 interpretations of the data presented here. Any opinions, insights, or
 conclusions presented herein are those of the authors and not of the
 Collaborators. This research has been conducted using the UK Biobank Resource
 under application number 26041.
 </p>
 <p>
 <b>SGDP:</b> This project was funded by the Simons Foundation. Thanks to David Reich and Swapan 
 Mallick for help with importing the data.
 </p>
 <p>
 <b>KOVA:</b> Thanks to Insu Jang and the KOVA director for providing variant frequencies in TSV
 format.
 </p>
 <p>
 <b>FinnGen:</b> We want to acknowledge the participants and investigators of the FinnGen study.
 </p>
 
 <p>
 <b>NPM Singapore:</b> Thanks to the NPM Data Access Committee and Eleanor for granting our data request. 
 By browsing the data, you agree to use the data only for academic, non-commercial
 research to improve human health (biology/disease).  We request all data users
 agree to protect the
 confidentiality of the data subjects in any research papers or publications
 that they may prepare, by taking all reasonable care to limit the possibility
 of identification. In particular, the data users shall not use, or attempt
 to use, the data to deliberately compromise or otherwise infringe the
 confidentiality of information on data subjects and their right to privacy.
 If you use any of the data obtained from the CHORUS variant browser, we request
 that you cite the NPM flagship paper (Wong et al, 2023). All data users of the
 data must take note that the data provider and relevant SG10K_Health cohort
 owners bear no responsibility for the further analysis or interpretation of the
 data.  </p>
 
 <p>Thanks to Alex Ioannidis, UCSC, and Andreas Lahner, MGZ, for feedback on this track.</p>
 
 <h2>References</h2>
 <p>
 Barberena-Jonas, C. et al. (2025). MexVar database: Clinical genetic variation beyond the
 Hispanic label in the Mexican Biobank. <em>Nature Medicine (in press)</em>.
 </p>
 
 <p>
 Sohail M, Moreno-Estrada A.
 <a href="https://journals.biologists.com/dmm/article-lookup/doi/10.1242/dmm.050522" target="_blank">
 The Mexican Biobank Project promotes genetic discovery, inclusive science and local capacity
 building</a>.
 <em>Dis Model Mech</em>. 2024 Jan 1;17(1).
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/38299665" target="_blank">38299665</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10855211/" target="_blank">PMC10855211</a>
 </p>
 
 <p>
 Sohail M, Palma-Mart&iacute;nez MJ, Chong AY, Quinto-Cor&eacute;s CD, Barberena-Jonas C, Medina-Mu&ntilde;oz SG,
 Ragsdale A, Delgado-S&aacute;nchez G, Cruz-Hervert LP, Ferreyra-Reyes L <em>et al</em>.
 <a href="https://doi.org/10.1038/s41586-023-06560-0" target="_blank">
 Mexican Biobank advances population and medical genomics of diverse ancestries</a>.
 <em>Nature</em>. 2023 Oct;622(7984):775-783.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/37821706" target="_blank">37821706</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10600006/" target="_blank">PMC10600006</a>
 </p>
 
 <p>
 Ziyatdinov A, Torres J, Alegre-D&iacute;az J, Backman J, Mbatchou J, Turner M, Gaynor SM, Joseph T, Zou Y,
 Liu D <em>et al</em>.
 <a href="https://doi.org/10.1038/s41586-023-06595-3" target="_blank">
 Genotyping, sequencing and analysis of 140,000 adults from Mexico City</a>.
 <em>Nature</em>. 2023 Oct;622(7984):784-793.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/37821707" target="_blank">37821707</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10600010/" target="_blank">PMC10600010</a>
 </p>
 
 <p>
 GenomeAsia100K Consortium.
 <a href="https://doi.org/10.1038/s41586-019-1793-z" target="_blank">
 The GenomeAsia 100K Project enables genetic discoveries across Asia</a>.
 <em>Nature</em>. 2019 Dec;576(7785):106-111.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/31802016" target="_blank">31802016</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7054211/" target="_blank">PMC7054211</a>
 </p>
 
 <p>
 Sun KY, Bai X, Chen S, Bao S, Zhang C, Kapoor M, Backman J, Joseph T, Maxwell E, Mitra G <em>et
 al</em>.
 <a href="https://doi.org/10.1038/s41586-024-07556-0" target="_blank">
 A deep catalogue of protein-coding variation in 983,578 individuals</a>.
 <em>Nature</em>. 2024 Jul;631(8021):583-592.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/38768635" target="_blank">38768635</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11254753/" target="_blank">PMC11254753</a>
 </p>
 
 <p>
 Tadaka S, Kawashima J, Hishinuma E, Saito S, Okamura Y, Otsuki A, Kojima K, Komaki S, Aoki Y, Kanno
 T <em>et al</em>.
 <a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkad978" target="_blank">
 jMorp: Japanese Multi-Omics Reference Panel update report 2023</a>.
 <em>Nucleic Acids Res</em>. 2024 Jan 5;52(D1):D622-D632.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/37930845" target="_blank">37930845</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10767895/" target="_blank">PMC10767895</a>
 </p>
 
 
 
 <p>
 Naslavsky MS, Scliar MO, Yamamoto GL, Wang JYT, Zverinova S, Karp T, Nunes K, Ceroni JRM, de
 Carvalho DL, da Silva Sim&otilde;es CE <em>et al</em>.
 <a href="https://doi.org/10.1038/s41467-022-28648-3" target="_blank">
 Whole-genome sequencing of 1,171 elderly admixed individuals from S&atilde;o Paulo, Brazil</a>.
 <em>Nat Commun</em>. 2022 Mar 4;13(1):1004.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/35246524" target="_blank">35246524</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8897431/" target="_blank">PMC8897431</a>
 </p>
 
 
 
 <p>
 Jain A, Bhoyar RC, Pandhare K, Mishra A, Sharma D, Imran M, Senthivel V, Divakar MK, Rophina M,
 Jolly B <em>et al</em>.
 <a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkaa923" target="_blank">
 IndiGenomes: a comprehensive resource of genetic variants from over 1000 Indian genomes</a>.
 <em>Nucleic Acids Res</em>. 2021 Jan 8;49(D1):D1225-D1232.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/33095885" target="_blank">33095885</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7778947/" target="_blank">PMC7778947</a>
 </p>
 
 
 
 <p>
 Bergstr&ouml;m A, McCarthy SA, Hui R, Almarri MA, Ayub Q, Danecek P, Chen Y, Felkel S, Hallast P, Kamm J
 <em>et al</em>.
 <a href="https://www.science.org/doi/10.1126/science.aay5012" target="_blank">
 Insights into human genetic variation and population history from 929 diverse genomes</a>.
 <em>Science</em>. 2020 Mar 20;367(6484).
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/32193295" target="_blank">32193295</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7115999/" target="_blank">PMC7115999</a>
 </p>
 
 <p>
 Koenig Z, Yohannes MT, Nkambule LL, Zhao X, Goodrich JK, Kim HA, Wilson MW, Tiao G, Hao SP, Sahakian
 N <em>et al</em>.
 <a href="https://pmc.ncbi.nlm.nih.gov/articles/pmid/38749656/" target="_blank">
 A harmonized public resource of deeply sequenced diverse human genomes</a>.
 <em>Genome Res</em>. 2024 Jun 25;34(5):796-809.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/38749656" target="_blank">38749656</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11216312/" target="_blank">PMC11216312</a>
 </p>
 
 <p>
 Mallick S, Li H, Lipson M, Mathieson I, Gymrek M, Racimo F, Zhao M, Chennagiri N, Nordenfelt S,
 Tandon A <em>et al</em>.
 <a href="https://doi.org/10.1038/nature18964" target="_blank">
 The Simons Genome Diversity Project: 300 genomes from 142 diverse populations</a>.
 <em>Nature</em>. 2016 Oct 13;538(7624):201-206.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/27654912" target="_blank">27654912</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5161557/" target="_blank">PMC5161557</a>
 </p>
 
 <p>
 Lee J, Lee J, Jeon S, Lee J, Jang I, Yang JO, Park S, Lee B, Choi J, Choi BO <em>et al</em>.
 <a href="https://doi.org/10.1038/s12276-022-00871-4" target="_blank">
 A database of 5305 healthy Korean individuals reveals genetic and clinical implications for an East
 Asian population</a>.
 <em>Exp Mol Med</em>. 2022 Nov;54(11):1862-1871.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/36323850" target="_blank">36323850</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9628380/" target="_blank">PMC9628380</a>
 </p>
 
 <p>
 Kurki MI, Karjalainen J, Palta P, Sipilä TP, Kristiansson K, Donner KM, Reeve MP, Laivuori H,
 Aavikko M, Kaunisto MA <em>et al</em>.
 <a href="https://doi.org/10.1038/s41586-022-05473-8" target="_blank">
 FinnGen provides genetic insights from a well-phenotyped isolated population</a>.
 <em>Nature</em>. 2023 Jan;613(7944):508-518.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/36653562" target="_blank">36653562</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9849126/" target="_blank">PMC9849126</a>
 </p>
 
 <p>
 Wong E, Bertin N, Hebrard M, Tirado-Magallanes R, Bellis C, Lim WK, Chua CY, Tong PML, Chua R, Mak K
 <em>et al</em>.
 <a href="https://doi.org/10.1038/s41588-022-01274-x" target="_blank">
 The Singapore National Precision Medicine Strategy</a>.
 <em>Nat Genet</em>. 2023 Feb;55(2):178-186.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/36658435" target="_blank">36658435</a>
 </p>