7594507ca126d5242346787e42e13c52ea7709b1
max
Fri Apr 17 08:40:31 2026 -0700
Add lrSv supertrack: long-read structural variants from 9 studies (hg38).
#Preview2 week - bugs introduced now will need a build patch to fix
Sub-tracks (all bigBed 9+):
han945Sv - 945 Han Chinese, ONT (Gong 2025, PMID 39929826)
lrSv1kgOnt - 1019 1000 Genomes, ONT, SVAN-annotated (Schloissnig 2025,
PMID 40702182; lifted from hs1)
tommoJpSv - 333 Japanese (111 trios), ONT (Otsuki 2022, PMID 36127505)
aou1kSv - 1027 All of Us, PacBio HiFi (Garimella 2025, PMID 41256123)
ga4kSv - 502 GA4K pediatric rare disease, PacBio HiFi
(Cohen 2022, PMID 35305867)
decodeSv - 3622 Icelanders, ONT (Beyter 2021, PMID 33972781)
hgsvc3Sv - 65 HGSVC3 diverse haplotype-resolved assemblies, HiFi+ONT
(Logsdon 2025, PMID 40702183; merges insdel+inv tables)
kwanhoSv - 100 post-mortem brains (PD/ILBD/HC), PacBio HiFi
(Kim 2026, PMID 41929179)
chirmade101Sv - 101 long-read WGS GWAS SVatalog cohort
(Chirmade 2026, PMID 41203876)
Includes per-track conversion scripts and autoSql under
scripts/lrSv/, the supertrack summary table in lrSv.html, and a
consolidated makeDoc at doc/hg38/lrSv.txt.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context)
+Description
+
+This track shows structural variants (SVs) identified by PacBio HiFi long-read
+sequencing of probands and their families enrolled in the Genomic Answers for
+Kids (GA4K) program at Children's Mercy Research Institute. GA4K is a
+longitudinal pediatric genomics initiative that aims to enroll 30,000 children
+with suspected rare genetic disorders, together with their parents, to build
+a large-scale resource of clinical and genomic data.
+
+The callset contains 115,554 SVs (52,564 deletions, 58,219 insertions, 4,408
+duplications, 363 inversions) from 502 sequenced samples. Variants are
+site-level (no per-sample genotypes) and each SV has been replicated, meaning
+that it was either observed in two or more unrelated GA4K individuals, or
+matched an SV from an external long-read reference set (deCODE or the Human
+Pangenome Reference Consortium).
+
+Display Conventions and Configuration
+
+Items are colored by SV type.
+
+Insertions are placed at the insertion site with a width of 1 bp; deletions,
+duplications and inversions span the affected interval. Filters are available
+for SV type, SV length, carrier-sample count and allele frequency. The detail
+page also shows the total number of samples genotyped at each site.
+
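The placement rule above can be sketched in a few lines of Python. This is only an illustration, not the conversion script under scripts/lrSv/: the function name `sv_to_bed_interval`, the type codes (`INS`, `DEL`, `DUP`, `INV`) and the assumption that inputs are already 0-based, half-open coordinates are all hypothetical.

```python
def sv_to_bed_interval(sv_type, start, end):
    """Return the (chromStart, chromEnd) to display for one SV.

    Assumes 0-based, half-open input coordinates (an assumption of this
    sketch, not something stated in the track documentation).
    """
    if sv_type == "INS":
        # An insertion has no reference span, so show a 1 bp item at the site.
        return start, start + 1
    # Deletions, duplications and inversions span the affected interval.
    return start, end


# Hypothetical example records: (type, start, end)
for sv in [("INS", 1000, 1000), ("DEL", 2000, 2350)]:
    print(sv[0], sv_to_bed_interval(*sv))
```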
+Methods
+
+Samples were sequenced on PacBio Revio and Sequel II instruments with HiFi
+chemistry. Single-sample SV callsets were produced with pbsv and then merged
+across the cohort with Jasmine v1.1.4 (jasmine --output-genotypes),
+which clusters equivalent SVs across samples and writes a site-level multi-sample
+VCF.
+
+To reduce false positives, the merged VCF was filtered to retain only SVs
+supported by at least two independent observations: either (1) a matching
+second SV from another unrelated Children's Mercy (CMH) individual within the
+same Jasmine cluster, or (2) a matching SV from the deCODE Icelandic or Human
+Pangenome Reference Consortium (HPRC) callsets, found using
+svpack match with default settings.
+
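As a rough illustration of the replication rule, the sketch below keeps a site if either condition holds. It is not the filtering code used for the track: the `MergedSv` record, its field names and the example values are hypothetical, and the real pipeline works on the Jasmine-merged VCF and svpack match output.

```python
from dataclasses import dataclass


@dataclass
class MergedSv:
    """One merged SV site (hypothetical record layout for this sketch)."""
    name: str
    unrelated_carriers: int   # unrelated CMH individuals in the same cluster
    external_match: bool      # matched a deCODE or HPRC SV


def is_replicated(sv):
    # Condition (1): seen in two or more unrelated individuals.
    # Condition (2): matched an SV in an external long-read callset.
    return sv.unrelated_carriers >= 2 or sv.external_match


sites = [
    MergedSv("sv1", 1, False),  # singleton, no external support: dropped
    MergedSv("sv2", 3, False),  # replicated within the cohort: kept
    MergedSv("sv3", 1, True),   # singleton with external support: kept
]
kept = [sv.name for sv in sites if is_replicated(sv)]
print(kept)
```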
+Carrier counts (SVC), total sample counts (SVN) and allele frequencies
+(SVF = SVC/SVN) were recomputed on the replicated callset.
+
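The SVF definition is a simple ratio, shown here as a minimal Python sketch. The function name `sv_frequency` and the example counts are hypothetical; only the formula SVF = SVC/SVN comes from the text above.

```python
def sv_frequency(svc, svn):
    """Compute SVF = SVC / SVN for one site.

    svc: number of carrier samples, svn: total samples counted at the site.
    """
    if svn <= 0:
        raise ValueError("SVN must be positive")
    if not 0 <= svc <= svn:
        raise ValueError("SVC must be between 0 and SVN")
    return svc / svn


# Hypothetical site: 5 carriers among 500 samples.
print(sv_frequency(5, 500))
```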
+Data Access
+
+The data can be explored interactively in table format with the
+Table Browser or the
+Data Integrator and exported from there
+as spreadsheets or tab-separated tables. From scripts, the data can be accessed
+through our API, track=ga4kSv.
+
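For script access, a request URL for the API can be assembled as sketched below. The endpoint `https://api.genome.ucsc.edu/getData/track` and its `genome`, `track`, `chrom`, `start` and `end` parameters are taken from the public UCSC REST API as I understand it; the helper names and the example region are hypothetical, and the shape of the returned JSON is not assumed here.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(chrom, start, end, genome="hg38", track="ga4kSv"):
    """Build a getData/track request URL for one region."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return API_BASE + "?" + urllib.parse.urlencode(params)


def fetch_track(chrom, start, end):
    """Fetch the region and return the decoded JSON (needs network access)."""
    with urllib.request.urlopen(build_track_url(chrom, start, end)) as resp:
        return json.load(resp)


url = build_track_url("chr21", 0, 100000)
print(url)
```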
+For automated download and analysis, the annotation is stored in a bigBed file
+that can be downloaded from
+our
+download server. The file for this track is called ga4kSv.bb.
+Individual regions or the whole annotation can be obtained using the
+bigBedToBed utility, available as a precompiled binary or from source
+as described on our
+utilities
+page.
+Example:
+bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/ga4kSv.bb -chrom=chr21 -start=0 -end=100000000 stdout
+
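The bigBedToBed example can also be driven from a script. The sketch below wraps the same command line with Python's `subprocess` module; it assumes `bigBedToBed` is installed and on the `PATH`, and the helper names are hypothetical. Rows are returned as raw tab-split fields, since the column layout is defined by the track's autoSql file rather than by this sketch.

```python
import subprocess

BB_URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/ga4kSv.bb"


def bigbed_region_command(chrom, start, end, url=BB_URL):
    """Build the bigBedToBed argument list for one region, writing to stdout."""
    return [
        "bigBedToBed",
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",
    ]


def fetch_region(chrom, start, end):
    """Run bigBedToBed and return each BED row as a list of string fields."""
    result = subprocess.run(
        bigbed_region_command(chrom, start, end),
        check=True,            # raise if the tool exits with an error
        capture_output=True,
        text=True,
    )
    return [line.split("\t") for line in result.stdout.splitlines()]


cmd = bigbed_region_command("chr21", 0, 100000000)
print(" ".join(cmd))
```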
+The original VCF is available from the Children's Mercy Research Institute
+GA4K data release at
+github.com/ChildrensMercyResearchInstitute/GA4K.
+
+Credits
+
+Thanks to the Children's Mercy Research Institute and the Genomic Answers
+for Kids participants and their families for making this dataset publicly
+available.
+
+References
+
+Cohen ASA, Farrow EG, Abdelmoity AT, Alaimo JT, Amudhavalli SM, Anderson JT, Bansal L, Bartik L,
+Baybayan P, Belden B et al.
+Genomic answers for children: Dynamic analyses of >1000 pediatric rare disease genomes.
+Genet Med. 2022 Jun;24(6):1336-1348.
+PMID: 35305867
+