7fd4df639618f431e81244f1c0ad911e4dcb0bd5
max
  Wed Jan 28 04:56:31 2026 -0800
adding Australia Biobank and fixing up NCBI Alfa docs, refs #36642

diff --git src/hg/makeDb/trackDb/human/varFreqs.html src/hg/makeDb/trackDb/human/varFreqs.html
index 39d2482b47a..8a267679c99 100644
--- src/hg/makeDb/trackDb/human/varFreqs.html
+++ src/hg/makeDb/trackDb/human/varFreqs.html
@@ -55,41 +55,53 @@
         were sequenced, a total of 142,357 individuals with whole-exome (WES)
         and 12,519 with whole-genome sequencing (WGS).  The data contains
         32,559 trios and 8,895 quads (one sibling without autism), and 824
         twins. The same frequencies shown here
         are also available publicly on the <a href="https://genomes.sfari.org/" target=_blank>SFARI Genome Browser</a>. 
        See (SPARK et al, Neuron 2018) for details or the methods below on this page.
     </li>
 
     <li>
         <b><a href="https://www.genomeasia100k.org/"
         target="_blank">GenomeAsia Pilot (GAsP)</a></b>:
         Whole-genome sequencing data of 1,739 individuals from 219 population groups across Asia.
         See (GenomeAsia Consortium, Nature 2019) for details.
     </li>
 
+    <li>
+        <b><a href="https://sgc.garvan.org.au/initiatives/mgrb/index.html"
+        target="_blank">Australia MGRB</a></b>:
+        The Australian Medical Genome Reference Bank collected
+        whole-genome sequencing data of 4,011 healthy elderly individuals, so that
+        the dataset is depleted of damaging genetic variants.
+        Age and sex summary graphs are available from 
+        <a href="https://sgc.garvan.org.au/initiatives/mgrb/index.html">the MGRB website</a>.
+        See (Lacaze et al, Eur J Hum Genet 2019) for details.
+    </li>
+
     <li>
         <b><a href="https://www.ncbi.nlm.nih.gov/snp/docs/gsr/alfa/" target="_blank">ALFA</a></b>:
         The NCBI ALlele Frequency Aggregator pipeline computes allele frequencies from
         approved, unrestricted dbGaP studies and makes them publicly available through
         dbSNP. Its goal is to release frequency data from over one million dbGaP
         subjects to aid discoveries involving common and rare variants with biological
         or disease relevance. The R4 release includes 408,709 subjects and allele
         frequencies for 15.5 million rs sites, including nearly one million ClinVar
-        variants. Genotype and associated individual-level data are accessible through dbGaP
+        variants. We converted the NCBI track hub to VCF format; the data are freely available.
+        Genotype and associated individual-level data are accessible through the dbGaP
         <a href="https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=login"
-        target="_blank">authorized access</a>.
+        target="_blank">authorized access request</a> system.
     </li>
 
     <li>
         <b><a href="https://www.finngen.fi/en" target="_blank">FinnGen</a></b>:
         Imputed variants from 500,348 Biobank samples obtained using genotyping arrays
         in Finnland, 10% of the population. The imputation used phased variants obtained from 8,554
         high-quality whole genome sequences, also from Finnland. For details, see (Kurki et al, Nature 2023).
         Phenotype links can be shown at <a href="https://r12.finngen.fi/">FinnGen PheWeb</a>.
     </li>
 
     <li>
         <b><a href="https://swefreq.nbis.se/dataset/SweGen" target="_blank">SweGen</a></b>:
         Whole-genome sequencing variant frequencies for 1000 Swedish individuals generated within the SweGen project.
         The 1000 individuals included in the SweGen project represent a
         cross-section of the Swedish population and that no disease information
@@ -155,74 +167,74 @@
     <li>
         <b><a href="https://www.vision2030.gov.sa/en/explore/projects/the-saudi-genome-program"
         target="_blank">Saudi Genome Program</a></b>:
         Variant frequencies from 302 whole genomes at 30x coverage, on Saudi Genome Program Samples.
         The genotyping data and imputations from 3,352 individuals do not seem to be available publicly.
         For details see (Malomane et al 2025). 
     </li>
 </ul>
 </p>
 
 <h2>Display Conventions</h2>
 
 <p>Most tracks only show the variant and allele frequencies on mouseover or clicks.
 When zoomed in, tracks display alleles with base-specific coloring. Homozygote
 data are shown as one letter, while heterozygotes will be displayed with both
-letters.
-</p>
-
-<p>
-For <b>NCBI ALFA:</b> This track has no single VCF with INFO fields, but uses multiple subtracks
-instead, one per ancestry.
+letters. All VCF files are normalized, with a single allele per annotation (no multi-allelic lines).
 </p>
 
 
 <h2>Data Access</h2>
 <p>Most of the data in these tracks are not available for download from UCSC.
 Data can be browsed on our website.
 But the data can be downloaded for free from the original projects. Accessing the 
-data usually requires a click-through license or access request on the respective websites, links are either
-provided above in the project description or with more details here:
+data usually requires a click-through license or filling out an access request form on the respective websites; links are provided either above in the project description or, with more details, here:
 </p>
 
 <p>
 <b>MXB:</b> Allele frequencies by geographical state and ancestry are available via
 the <a target="_blank" href="https://morenolab.shinyapps.io/mexvar/">MexVar platform</a>.
 Raw genotype data are available under controlled access at the
 EGA (Study: EGAS00001005797; Dataset: EGAD00010002361). For the VCFs, email
 andres.moreno@cinvestav.mx.
 </p>
 <p>
 <b>MCPS:</b> VCFs with summarized allele frequencies are available from
 the <a target="_blank" href="https://rgc-mcps.regeneron.com/">MCPS website</a>.
 </p>
 <p>
 <b>Regeneron one million exomes:</b> VCFs with summarized allele frequencies are available from
 the <a target="_blank" href="https://rgc-research.regeneron.com/me/resources">RGC ME website</a>.
 </p>
 <p>
 <b>TOPMED:</b> VCFs with summarized allele frequencies are available from
 the <a target="_blank" href="https://bravo.sph.umich.edu/">TOPMED BRAVO website</a>. They require a
 login.
 </p>
 <p>
 <b>SFARI SPARK:</b> Allele frequencies can be displayed on the
         <a href="https://genomes.sfari.org/" target=_blank>SFARI Genome Browser</a>.
         Full CRAMs and VCFs with genotypes are available from <a target="_blank" href="https://base.sfari.org/">SFARI Base</a>. 
 They require a data access request, which is usually reviewed quickly. More information is available in the 
 <a href="https://cohorts-cdn.simonsfoundation.org/spark/researcher_packets/SPARK_SFARI_Researcher_Welcome_Packet.pdf" target=_blank>SPARK Welcome Packet</a>.
 </p>
+
+<p>
+<b>Australia MGRB:</b> VCF access can be requested via a form from 
+<a target="_blank" href="https://sgc.garvan.org.au/terms/mgrb/index.html">Sydney Genomics</a>.
+</p>
+
 <p>
 <b>GenomeAsia Pilot:</b> VCFs are available from UCSC and also from
 the <a target="_blank"
 href="https://browser.genomeasia100k.org/#tid=download">GenomeAsia 100K website</a>.
 No license nor login.
 </p>
 
 <p><b>KOVA:</b> 
         TSV data can be requested on the <a href="https://www.kobic.re.kr/kova/downloads"
         target="_blank">KOVA Downloads</a> website. 
 </p>
 
 <p><b>Finngen:</b> TSV data can be requested via the form at https://finngen.gitbook.io/documentation/data-download which triggers an email with the download link.</p>
 
 <p><b>SweGen:</b> We are not allowed to redistribute the VCF, you can request it at <a target=_blank href="https://swefreq.nbis.se/dataset/SweGen">SweGen</a>, alongside the VCF file. </p>