7594507ca126d5242346787e42e13c52ea7709b1
max
  Fri Apr 17 08:40:31 2026 -0700
Add lrSv supertrack: long-read structural variants from 9 studies (hg38).

#Preview2 week - bugs introduced now will need a build patch to fix
Sub-tracks (all bigBed 9+):
han945Sv     - 945 Han Chinese, ONT (Gong 2025, PMID 39929826)
lrSv1kgOnt   - 1019 1000 Genomes, ONT, SVAN-annotated (Schloissnig 2025,
PMID 40702182; lifted from hs1)
tommoJpSv    - 333 Japanese (111 trios), ONT (Otsuki 2022, PMID 36127505)
aou1kSv      - 1027 All of Us, PacBio HiFi (Garimella 2025, PMID 41256123)
ga4kSv       - 502 GA4K pediatric rare disease, PacBio HiFi
(Cohen 2022, PMID 35305867)
decodeSv     - 3622 Icelanders, ONT (Beyter 2021, PMID 33972781)
hgsvc3Sv     - 65 HGSVC3 diverse haplotype-resolved assemblies, HiFi+ONT
(Logsdon 2025, PMID 40702183; merges insdel+inv tables)
kwanhoSv     - 100 post-mortem brains (PD/ILBD/HC), PacBio HiFi
(Kim 2026, PMID 41929179)
chirmade101Sv - 101 long-read WGS GWAS SVatalog cohort
(Chirmade 2026, PMID 41203876)

Includes per-track conversion scripts and autoSql under
scripts/lrSv/, the supertrack summary table in lrSv.html, and a
consolidated makeDoc at doc/hg38/lrSv.txt.

refs #36258

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

diff --git src/hg/makeDb/trackDb/human/aou1kSv.html src/hg/makeDb/trackDb/human/aou1kSv.html
new file mode 100644
index 00000000000..bafede33425
--- /dev/null
+++ src/hg/makeDb/trackDb/human/aou1kSv.html
@@ -0,0 +1,95 @@
+<h2>Description</h2>
+<p>
+This track shows structural variants (SVs) identified by PacBio HiFi long-read
+sequencing of 1,027 individuals from the All of Us (AoU) Research Program.
+Participants self-identified as Black or African American and were sequenced
+to a median coverage of ~8x. The dataset contains 541,049 autosomal SVs
+(444,524 insertions and 96,525 deletions).
+</p>
+<p>
+SVs are annotated with population-specific allele frequencies across five
+ancestry groups (African, Admixed American, East Asian, European, South Asian),
+gene intersections from curated disease gene lists (OMIM, ACMG, cancer genes),
+regulatory element overlaps, and associations with eQTLs, GWAS loci, and
+clinical phenotypes from the AoU electronic health records.
+</p>
+
+<h2>Display Conventions and Configuration</h2>
+<p>
+Items are colored by SV type:
+</p>
+<ul>
+<li><span style="color: rgb(200,0,0);">Deletions (DEL)</span> - red</li>
+<li><span style="color: rgb(0,0,200);">Insertions (INS)</span> - blue</li>
+</ul>
+<p>
+Filters are available for SV type, SV length, and population-specific allele
+frequencies. For insertions, the item is placed at the insertion site with a
+width of 1 bp; for deletions, the item spans the deleted region.
+</p>
+<p>
+The detail page shows the following annotations when available:
+</p>
+<ul>
+<li><b>Population Allele Frequencies</b>: separate frequencies for AFR, AMR,
+EAS, EUR, and SAS ancestry groups</li>
+<li><b>Fst</b>: fixation index between African and non-African populations</li>
+<li><b>Gene Intersections</b>: overlapping OMIM, disease, cancer, and ACMG
+genes with constraint scores (pLI and LOEUF)</li>
+<li><b>Regulatory Elements</b>: intersected regulatory elements (e.g. enhancer,
+promoter)</li>
+<li><b>Other LR Datasets</b>: whether the SV was also detected in HPRC, HGSVC,
+or 1KG-ONT long-read datasets</li>
+<li><b>eQTLs</b>: expression QTL associations with q-values</li>
+<li><b>GWAS Associations</b>: overlapping GWAS loci with trait, gene, rsID,
+and LD information</li>
+<li><b>SV-Trait Associations</b>: associations with clinical phenotypes from
+AoU electronic health records, including odds ratios and confidence
+intervals</li>
+</ul>
+
+<h2>Methods</h2>
+<p>
+PacBio HiFi long-read sequencing was performed on 1,027 AoU participants
+self-identifying as Black or African American, at a median coverage of ~8x.
+SV calling was performed using a cohort-level pipeline, producing calls for
+insertions and deletions. Allele frequencies were computed separately for
+five ancestry groups. SVs were annotated with gene intersections from OMIM,
+disease gene panels, cancer gene lists, and ACMG actionable genes, along
+with regulatory element overlaps and segmental duplication associations.
+</p>
+<p>
+A scalable workflow was developed to impute over 750,000 SVs into existing
+short-read whole-genome sequencing datasets. SV-trait associations were
+tested in 848 AoU participants with matched electronic health records,
+identifying 291 significant associations across 226 conditions.
+</p>
+
+<h2>Data Access</h2>
+<p>
+This track was built from supplementary data (media-2) of the AoU long-read
+sequencing preprint. Access to the full AoU dataset requires registration
+through the <a href="https://www.researchallofus.org/" target="_blank">All of
+Us Research Hub</a>.
+</p>
+
+<h2>Credits</h2>
+<p>
+Thanks to Garimella et al. and the All of Us Research Program for making their
+structural variant annotations publicly available.
+</p>
+
+<h2>References</h2>
+
+
+
+<p>
+Garimella KV, Li Q, Wertz J, Lee SK, Cunial F, Huang Y, Mostovoy Y, Lorig-Roach R, English A, Su H
+<em>et al</em>.
+<a href="https://doi.org/10.1101/2025.10.02.25336942" target="_blank">
+Population-scale Long-read Sequencing in the All of Us Research Program</a>.
+<em>medRxiv</em>. 2025 Oct 5.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/41256123" target="_blank">41256123</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12622093/" target="_blank">PMC12622093</a>
+</p>
+