f7758c83e2610580346c95832cb4624aa9fae0fd jnavarr5 Mon Nov 10 17:18:01 2025 -0800 Making each list item less than 100 characters. Fixing special characters using HTML entities. Adding a missing URL. Removing duplicate entries, refs #36642 diff --git src/hg/makeDb/trackDb/human/varFreqs.html src/hg/makeDb/trackDb/human/varFreqs.html index 857e558cf49..8442fcdd6a1 100644 --- src/hg/makeDb/trackDb/human/varFreqs.html +++ src/hg/makeDb/trackDb/human/varFreqs.html @@ -1,184 +1,186 @@ <h2>Description</h2> <p> This container track contains annotation tracks with individual level genotypes, usually phased, and tracks where only the variant frequencies, aka allele frequencies, are shown. The tracks were collected from the following projects. Only the projects 1000 Genomes (its own track), HGDP, SGDP, HGDP+1k and MXB provide individual-level genotypes. All others provide only allele frequencies, their genotypes require signing a data access agreement. </p> <ul> <li> - <b><a href="https://www.mxbiobank.org/" target=_blank>Mexico Biobank (MXB)</a></b>: This track displays - phased alleles from the Mexico Biobank Project - (MXB), based on array genotyping of 6,011 individuals sampled across all 32 states of Mexico during - the 2000 National Health Survey (ENSA 2000) conducted by the National Institute of Public Health (INSP). - Frequencies can be plotted onto a map on <a href="https://morenolab.shinyapps.io/mexvar/" target=_blank>MexVar</a>. + <b><a href="https://www.mxbiobank.org/" target="_blank">Mexico Biobank (MXB)</a></b>: + This track displays phased alleles from the Mexico Biobank Project (MXB), based on array + genotyping of 6,011 individuals sampled across all 32 states of Mexico during the 2000 + National Health Survey (ENSA 2000) conducted by the National Institute of Public Health + (INSP). Frequencies can be plotted onto a map on + <a href="https://morenolab.shinyapps.io/mexvar/" target="_blank">MexVar</a>. The hg38 track was lifted from hg19. (Publication?) 
</li> - <li><b><a href="https://www.simonsfoundation.org/simons-genome-diversity-project/" - target="_blank">Simons Genome Diversity Project (SGDP):</a></b> + <li> + <b><a href="https://www.simonsfoundation.org/simons-genome-diversity-project/" + target="_blank">Simons Genome Diversity Project (SGDP)</a></b>: Funded by the Simons Foundation, the Simons Genome Diversity Project is a large-scale effort that sequenced high-coverage genomes from 300 individuals (279 in this track) representing 142 diverse and often indigenous populations worldwide. Its goal was to capture the full range of human genetic diversity to better understand population history, migration, and adaptation. It is sampling populations in a way that represents as much anthropological, linguistic and cultural diversity as possible, and thus includes many deeply divergent human populations that are not well represented in other datasets. SGDP emphasizes breadth of global representation and population history, whereas HGDP emphasizes continuity and comparability across major population groups. Not all iits data is public, so this track contains only 279 genomes. For details, see (Mallick et al, Nature 2016). The hg38 track was lifted from hg19. </li> - <li><b><a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC7115999/" target=_blank></a>Human Genome Diversity Project (HGDP)</b>: + <li> + <b><a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC7115999/" + target="_blank"></a>Human Genome Diversity Project (HGDP)</b>: 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. The Human Genome Diversity Project (HGDP) was launched in the early 1990s to study the genetic variation and evolutionary history of modern humans across global populations. 
Its goal was to document the full spectrum of human genetic diversity, particularly in indigenous and geographically isolated groups, to better understand population structure, migration, adaptation, and disease susceptibility.The project collected samples from ~1,000 individuals representing over 50 populations worldwide, including groups from Africa, Europe, Asia, Oceania, and the Americas. These data have become a foundational reference for population genetics and human evolution studies. - Data can be downloaded from the <a href="https://ngs.sanger.ac.uk/production/hgdp/hgdp_wgs.20190516/" target=_blank>Sanger Website</a>. For details, see (Bergström et al, Science 2020). + Data can be downloaded from the + <a href="https://ngs.sanger.ac.uk/production/hgdp/hgdp_wgs.20190516/" + target="_blank">Sanger Website</a>. For details, see (Bergström et al, Science 2020). </li> - <li><b><a href="https://gnomad.broadinstitute.org/news/2021-10-gnomad-v3-1-2-minor-release/" target=_blank>gnomAD HGDP and 1000 Genomes callset:</a></b> + <li> + <b><a href="https://gnomad.broadinstitute.org/news/2021-10-gnomad-v3-1-2-minor-release/" + target="_blank">gnomAD HGDP and 1000 Genomes callset</a></b>: A reprocessed version by the gnomAD project for the 1000 Genomes and Human Genome Diversity Project (HGDP) data, with 4094 genomes from 80 populations. We already have separate, older tracks for 1000 Genomes on the main hg38 - browser and for HGDP, just above. This - track combines both datasets, with harmonized data quality. For details, see (Koenig et al, 2024). + browser and for HGDP, just above. This track combines both datasets, with harmonized data + quality. For details, see (Koenig et al, 2024). 
</li> <li> - <b><a href="https://rgc-mcps.regeneron.com/home" target=_blank>Mexico City Prospective Study (MCPS)</a></b>: - 9,950 whole genome sequenced individuals - and 141,046 exome sequenced and genotyped individuals from the Mexico - City Prospective Study (MCPS), a collaboration between the Regeneron Genetics - Center, University of Oxford, Universidad Nacional Autónoma de México (UNAM), - National Institute of Genomic Medicine in Mexico, Abbvie Inc. and AstraZeneca - UK. For details see (Ziyatdinov A, Nature 2023), the reference section. + <b><a href="https://rgc-mcps.regeneron.com/home" + target="_blank">Mexico City Prospective Study (MCPS)</a></b>: + 9,950 whole genome sequenced individuals and 141,046 exome sequenced and genotyped + individuals from the Mexico City Prospective Study (MCPS), a collaboration between the + Regeneron Genetics Center, University of Oxford, Universidad Nacional Autónoma de + México (UNAM), National Institute of Genomic Medicine in Mexico, Abbvie Inc. and + AstraZeneca UK. For details see (Ziyatdinov A, Nature 2023), the reference section. </li> - <li> - <b><a href="https://rgc-mcps.regeneron.com/home" target=_blank>Mexico City Prospective Study (MCPS)</a></b>: - 9,950 whole genome sequenced individuals - and 141,046 exome sequenced and genotyped individuals from the Mexico - City Prospective Study (MCPS), a collaboration between the Regeneron Genetics - Center, University of Oxford, Universidad Nacional Autónoma de México (UNAM), - National Institute of Genomic Medicine in Mexico, Abbvie Inc. and AstraZeneca - UK. For details see (Ziyatdinov A, Nature 2023), the reference section. - </li> <li> <b><a href="https://rgc-research.regeneron.com/me/home" - target=_blank>Regeneron Million Exomes Project (ME)</a></b>: - Whole-exomes of - 983,578 individuals sequenced by the Regeneron Genetics Center (RGC). 
+ target="_blank">Regeneron Million Exomes Project (ME)</a></b>: + Whole-exomes of 983,578 individuals sequenced by the Regeneron Genetics Center (RGC). These data span dozens of collaborations including large biobanks and health systems. All data were generated by the RGC on a single, harmonized sequencing and informatics protocol. The dataset includes individuals across diverse ancestral populations, encompassing outbred and founder populations and -cohorts with high rates of consanguinity. See (Sun et al, Nature 2024) for details. </li> + cohorts with high rates of consanguinity. See (Sun et al, Nature 2024) for details. + </li> + <li> - <b><a href="https://topmed.nhlbi.nih.gov/" target=_blank>NHLBI TOPMED - Freeze 10</a></b>: NHLBI TOPMed (Trans-Omics for Precision + <b><a href="https://topmed.nhlbi.nih.gov/" target="_blank">NHLBI TOPMED Freeze 10</a></b>: + NHLBI TOPMed (Trans-Omics for Precision Medicine) program, launched by the U.S. National Heart, Lung, and Blood Institute, integrates whole-genome sequencing with molecular, clinical, and environmental data from large, well-phenotyped cohorts. Its goal is to uncover the biological mechanisms underlying heart, lung, blood, and sleep disorders to advance precision medicine and improve population health. Freeze 10 contains 868,581,653 variants from 150,899 whole genomes. VCFs were downloaded from <a href="https://bravo.sph.umich.edu/terms.html" - target=_blank>BRAVO</a>. </li> - - <li><b><a href="https://www.genomeasia100k.org/" target=_blank>GenomeAsia - Pilot (GAsP)</a></b>: Whole-genome sequencing data of - 1,739 individuals from 219 population groups across Asia. See - (GenomeAsia Consortium, Nature 2019) for details. </li> - <li><b><a href="https://www.genomeasia100k.org/" target=_blank>GenomeAsia - Pilot (GAsP)</a></b>: Whole-genome sequencing data of - 1,739 individuals from 219 population groups across Asia. See - (GenomeAsia Consortium, Nature 2019) for details. 
</li> - - <li><b><a href="https://www.ncbi.nlm.nih.gov/snp/docs/gsr/alfa/" target=_blank>ALFA</a></b>: + target="_blank">BRAVO</a>. + </li> + + <li> + <b><a href="https://www.genomeasia100k.org/" + target="_blank">GenomeAsia Pilot (GAsP)</a></b>: + Whole-genome sequencing data of 1,739 individuals from 219 population groups across Asia. + See (GenomeAsia Consortium, Nature 2019) for details. + </li> + + <li> + <b><a href="https://www.ncbi.nlm.nih.gov/snp/docs/gsr/alfa/" target="_blank">ALFA</a></b>: The NCBI ALlele Frequency Aggregator pipeline computes allele frequencies from approved, unrestricted dbGaP studies and makes them publicly available through dbSNP. Its goal is to release frequency data from over one million dbGaP subjects to aid discoveries involving common and rare variants with biological or disease relevance. The R4 release includes 408,709 subjects and allele frequencies for 15.5 million rs sites, including nearly one million ClinVar -variants. Genotype and associated -individual-level data are accessible through dbGaP <a href="https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=login">authorized access</a>. + variants. Genotype and associated individual-level data are accessible through dbGaP + <a href="https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=login" + target="_blank">authorized access</a>. </li> - <li><b><a href="https://jmorp.megabank.tohoku.ac.jp/downloads" target=_blank>JPN To61k Japan Tohoku University Tohoku Medical Megabank Organization 61k Allele frequency panel (JPN 61k) </a></b>: + <li> + <b><a href="https://jmorp.megabank.tohoku.ac.jp/downloads" + target="_blank">JPN To61k Japan Tohoku University Tohoku Medical Megabank Organization + 61k Allele frequency panel (JPN 61k)</a></b>: An allele frequency panel based on short-read WGS analysis of 61,000 Japanese individuals. The project includes other datatypes, such as STRs, long-read SVs and short-read CNVs. 
- Data can be downloaded from the <a href="https://jmorp.megabank.tohoku.ac.jp" target=_blank>jMorp Website</a>, specifically the <a href="https://jmorp.megabank.tohoku.ac.jp/downloads">Downloads</a> section. - For details, see (Tadaka et al, NAR 2023). + Data can be downloaded from the <a href="https://jmorp.megabank.tohoku.ac.jp" + target="_blank">jMorp Website</a>, specifically the + <a href="https://jmorp.megabank.tohoku.ac.jp/downloads" target="_blank">Downloads</a> + section. For details, see (Tadaka et al, NAR 2023). </li> - <li><b><a href="https://abraom.ib.usp.br/" target=_blank>Brazil Arquivo Brasileiro Online de Mutações (ABraOM)</a></b>: + <li> + <b><a href="https://abraom.ib.usp.br/" + target="_blank">Brazil Arquivo Brasileiro Online de Mutaçõ (ABraOM)</a></b>: Genomic variants obtained with whole-genome sequencing from SABE, a - census-based sample of elderly individuals from São Paulo, Brazil's + census-based sample of elderly individuals from São Paulo, Brazil's largest city. Brazilian population is constituted by ~500 years of admixture between Africans, Europeans, and Native Americans. Additionally, the cohort presents ~3% of individuals with non-admixed Japanese ancestry (early 20th century migration). Coverage 38.6. Data can be downloaded from the <a href="https://abraom.ib.usp.br/download/" - target=_blank>AbraOM Website</a>. For details see (Naslavsky et al, Nat - Comm 2022). - </li> - - <li><b><a href="https://abraom.ib.usp.br/" target=_blank>Brazil Arquivo Brasileiro Online de Mutações (ABraOM)</a></b>: - Genomic variants obtained with whole-genome sequencing from SABE, a - census-based sample of 1,117 elderly individuals from São Paulo, Brazil's - largest city. The Brazilian population is constituted by ~500 years of - admixture between Africans, Europeans, and Native Americans. - Additionally, the cohort presents ~3% of individuals with non-admixed - Japanese ancestry (early 20th century migration). Coverage 38.6. 
Data - can be downloaded from the <a href="https://abraom.ib.usp.br/download/" - target=_blank>AbraOM Website</a>. For details see (Naslavsky et al, Nat Comm 2022). + target="_blank">AbraOM Website</a>. For details see (Naslavsky et al, Nat Comm 2022). </li> - <li><b><a href="https://clingen.igib.res.in/indigen/" target=_blank>IndiGenomes</a></b>: - Whole genome sequencing of 1,029 healthy Indian individuals under the pilot phase of the 'IndiGen' program. + <li> + <b><a href="https://clingen.igib.res.in/indigen/" target="_blank">IndiGenomes</a></b>: + Whole genome sequencing of 1,029 healthy Indian individuals under the pilot phase of the + "IndiGen" program. Data can be downloaded from the <a href="https://clingen.igib.res.in/indigen/" - target=_blank>IndiGen Website</a>. For details see (Jain et al, NAR 2020). Only + target="_blank">IndiGen Website</a>. For details see (Jain et al, NAR 2020). Only the allele frequency is available from this project. The website also provides SV call and Alu insertion VCFs. </li> - <li><b><a href="https://www.kobic.re.kr/kova/" target=_blank>Korean Variant Archive (KOVA)</a></b>: - 1,896 whole genome sequencing and 3,409 whole exome sequencing data from healthy individuals of Korean ethnicity. + <li> + <b><a href="https://www.kobic.re.kr/kova/" + target="_blank">Korean Variant Archive (KOVA)</a></b>: + 1,896 whole genome sequencing and 3,409 whole exome sequencing data from healthy individuals + of Korean ethnicity. Most of the samples were originated from normal tissue of cancer patients (40.16 %), healthy parents of rare disease patients (28.4 %), or healthy volunteers (31.44 %). Japanese ancestry is broken down in the INFO field. - TSV data can be requested on the <a target=_blank>KOVA Downloads</a> website. - Coverage 100x for WES, 30x for WGS. - For details see (Lee et al, Exp Mol Med 2022).</li> + TSV data can be requested on the <a href="https://www.kobic.re.kr/kova/downloads" + target="_blank">KOVA Downloads</a> website. 
Coverage 100x for WES, 30x for WGS. + For details see (Lee et al, Exp Mol Med 2022). + </li> </ul> <h2>Display Conventions</h2> <p>Most tracks only show the variant and allele frequencies on mouseover or clicks. When zoomed in, tracks display alleles with base-specific coloring. Homozygote data are shown as one letter, while heterozygotes will be displayed with both letters. </p> <p> Full haplotype display - only for the MXB and HGDP tracks: In "pack" mode, this track sorts the haplotypes. This can be useful for determining the similarity between the samples and inferring inheritance at a particular locus. For a full description of how the display works, please see our @@ -366,31 +368,31 @@ <p> Jain A, Bhoyar RC, Pandhare K, Mishra A, Sharma D, Imran M, Senthivel V, Divakar MK, Rophina M, Jolly B <em>et al</em>. <a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkaa923" target="_blank"> IndiGenomes: a comprehensive resource of genetic variants from over 1000 Indian genomes</a>. <em>Nucleic Acids Res</em>. 2021 Jan 8;49(D1):D1225-D1232. PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/33095885" target="_blank">33095885</a>; PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7778947/" target="_blank">PMC7778947</a> </p> <p> -Bergström A, McCarthy SA, Hui R, Almarri MA, Ayub Q, Danecek P, Chen Y, Felkel S, Hallast P, Kamm J +Bergström A, McCarthy SA, Hui R, Almarri MA, Ayub Q, Danecek P, Chen Y, Felkel S, Hallast P, Kamm J <em>et al</em>. <a href="https:///www.science.org/doi/10.1126/science.aay5012" target="_blank"> Insights into human genetic variation and population history from 929 diverse genomes</a>. <em>Science</em>. 2020 Mar 20;367(6484). 
PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/32193295" target="_blank">32193295</a>; PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7115999/" target="_blank">PMC7115999</a> </p> <p> Koenig Z, Yohannes MT, Nkambule LL, Zhao X, Goodrich JK, Kim HA, Wilson MW, Tiao G, Hao SP, Sahakian N <em>et al</em>. <a href="https://pmc.ncbi.nlm.nih.gov/articles/pmid/38749656/" target="_blank"> A harmonized public resource of deeply sequenced diverse human genomes</a>. <em>Genome Res</em>. 2024 Jun 25;34(5):796-809. PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/38749656" target="_blank">38749656</a>; PMC: <a