ac18a42f0dafb4febaaeaebcd53fe75df9b83234 max Mon May 11 08:29:14 2026 -0700
lrSv: add Coverage column, drop redundant Sequencing column, rename SVs to SV count

Coverage values pulled from the per-subtrack methods sections and the underlying papers (Han 17x, deCODE 17x, GA4K 27x, HPRC 60x HiFi + 30x ONT, etc.). Sequencing technology is now folded into the Coverage cells. Also cross-links to the new HGSVC3 Mobile Insertions tracks.

refs #36642

diff --git src/hg/makeDb/trackDb/human/lrSv.html src/hg/makeDb/trackDb/human/lrSv.html
index ae8d30a51e9..ebf9c5bdf1f 100644
--- src/hg/makeDb/trackDb/human/lrSv.html
+++ src/hg/makeDb/trackDb/human/lrSv.html
@@ -7,213 +7,221 @@
that are difficult to detect with short-read methods.

Available Datasets

SV length statistics (min / median / max) are computed from the svLen field of each track, in base pairs. Some tracks include sites with svLen=0 (complex events where the reference and alternate alleles differ in sequence but not in length).
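The min / median / max columns can be reproduced roughly as follows. This is a minimal sketch, assuming the svLen values of one track have already been extracted into a list of integers; the function name `sv_length_stats` and the sample values are hypothetical, and how the actual track build handles signs or missing values is not stated here.

```python
from statistics import median


def sv_length_stats(sv_lens):
    """Return (min, median, max) of a list of svLen values, in base pairs.

    Sites with svLen=0 are kept, matching the note that some tracks
    include complex events with no net length change.
    """
    if not sv_lens:
        raise ValueError("no svLen values supplied")
    return min(sv_lens), median(sv_lens), max(sv_lens)


# Hypothetical svLen values, for illustration only.
example_lens = [0, 50, 127, 300, 861080]
print(sv_length_stats(example_lens))
```

A track whose minimum is 0 in the table below would therefore contain at least one such complex site.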

For short-read structural-variant comparators (CCDG 17,795, 1KG 3202, ToMMo 48K CNV) see the companion Short-read SVs supertrack.

+

+Polymorphic Mobile Element Insertions (Alu, L1, SVA, HERVK,
+snRNA) called from HGSVC3 long-read assemblies are released as a
+separate track collection; see the
+Mobile Insertions tracks. Those MEIs are
+the insertions identified in the 65 HGSVC3 samples relative to the
+reference, available on both GRCh38/hg38 and T2T-CHM13/hs1.
+

Dataset N samples Cohort / disease Disease cases
-Sequencing
-SVs
+Coverage
+SV count
Min Median Max

All merged All long-read SV datasets merged on identical position+type+length, with per-database AC mixed mixed (PacBio HiFi, ONT) 2,694,871 50 200 190,088,223

CoLoRSdb 1,427 Consortium of Long-Read Sequencing, joint callset No
-PacBio HiFi
+mixed (HiFi)
426,239 20 33 101,381

Han 945 945 Han Chinese, general population No
-ONT (PromethION)
+~17x ONT
111,288 0 254 99,743

1KG ONT 100 100
-1000 Genomes, 5 superpopulations / 19 subpop., high 37x seq. coverage
+1000 Genomes, 5 superpopulations / 19 subpopulations
No
-ONT (R9.4.1)
+~37x ONT (R9.4.1)
113,696 0 164 98,289

1KG ONT Vienna 1,019
-1000 Genomes, diverse, normal 17x seq. coverage
+1000 Genomes, diverse
No
-ONT
+~17x ONT
148,375 2 177 49,171

ToMMo Japanese 333 (111 trios) Japanese, general population No
-ONT
+~22x ONT
74,201 51 162 99,980

AoU 1K 1,027
-All of Us, self-identified Black/African American, 8x cov.; biobank includes a variety of conditions (diabetes, hearing loss, etc.)
+All of Us, self-identified Black/African American; biobank includes a variety of conditions (diabetes, hearing loss, etc.)
Yes (mixed)
-PacBio HiFi
+~8x HiFi
541,049 50 152 9,998

GA4K 502 Children's Mercy, pediatric rare disease probands + families Yes (probands)
-PacBio HiFi
+~27x HiFi
115,554 50 186 809,711

deCODE 3,622 3,622 Icelandic general population No
-ONT
+~17x ONT
133,886 0 127 861,080

HPRC v2 233 HPRC release-2 pangenome (CHM13 + diverse 1KG assemblies) No
-PacBio HiFi (pangenome graph)
+~60x HiFi + ~30x ONT (pangenome graph)
1,483,114 50 280 97,718

HGSVC2 32 HGSVC2 haplotype-resolved assemblies (5 superpopulations) No
-PacBio CLR + HiFi + Strand-seq
+>40x PacBio CLR + >20x HiFi (+ Strand-seq)
111,746 50 168 57,207,414

HGSVC3 65 HGSVC3 diverse reference assemblies No
-PacBio HiFi + ONT
+~47x HiFi + ~56x ONT
176,531 50 154 30,176,500

Arab UPR 53 UAE-resident Arabs from 8 countries (UAE Pangenome Reference) No
-PacBio HiFi + ONT + Hi-C (pangenome graph)
+~35x HiFi + ~54x ONT (+ Hi-C, pangenome graph)
72,656 1 21 99,885

CPC 58 Chinese Pangenome Consortium, 36 minority ethnic groups (HPRC-specific SVs removed) No
-PacBio HiFi (pangenome graph)
+~30x HiFi (pangenome graph)
36,030 1 53 8,998,096

Kim PD Brain 100 Parkinson's disease, ILBD, controls (post-mortem brain) Yes (PD + ILBD)
-PacBio HiFi
+~17x HiFi
74,552 50 160 190,088,222

SVatalog 101 101 Cystic fibrosis (CF) patients from the CF Canada-Sick Kids Program in Individual CF Therapy (CFIT). Long-read WGS used for GWAS LD fine-mapping Yes (all CF)
-long-read
+~50x PacBio CLR (34, Sequel I) + ~76x HiFi (67, Sequel II)
87,183 4 160 1,321,484

Note: there is likely some overlap in sample composition across these collections. For example, 1000 Genomes samples are also included in HPRC and CoLoRSdb.
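The "All merged" row describes merging on identical position+type+length with a per-database AC. One way to picture that is the sketch below. It assumes each record is a `(chrom, start, sv_type, sv_len, ac)` tuple, that "position" means chromosome plus start coordinate, and that AC stands for allele count; the function name `merge_svs`, the record layout, and the sample data are all hypothetical and are not taken from the track's build scripts.

```python
from collections import defaultdict


def merge_svs(datasets):
    """Merge SV records that share an identical (chrom, start, type, length) key.

    datasets: dict mapping a dataset name to an iterable of
              (chrom, start, sv_type, sv_len, ac) tuples (assumed layout).
    Returns a dict mapping each key to {dataset name: allele count}.
    """
    merged = defaultdict(dict)
    for name, records in datasets.items():
        for chrom, start, sv_type, sv_len, ac in records:
            key = (chrom, start, sv_type, sv_len)
            # Sum counts if a dataset lists the same site more than once.
            merged[key][name] = merged[key].get(name, 0) + ac
    return dict(merged)


# Hypothetical records, for illustration only.
example = {
    "Han": [("chr1", 1000, "DEL", 254, 12)],
    "deCODE": [("chr1", 1000, "DEL", 254, 40), ("chr2", 500, "INS", 127, 3)],
}
for key, counts in merge_svs(example).items():
    print(key, counts)
```

Keeping the counts per database, rather than summing them into one total, avoids counting the same individuals twice when collections share samples, as the note above cautions.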

CoLoRSdb SVs

Structural variants from the Consortium of Long-Read Sequencing database