151410cc48b9b1f8b1cb9bee89b7004eca871c61
max
Wed Apr 22 09:03:35 2026 -0700

lrSv: harmonize long-read shortLabels, add aprSv/cpc1Sv/abelSv to overview

Normalize the shortLabel text of every long-read subtrack to the
pattern " SVs" (no commas in N): CoLoRSdb 1427, AoU 1027, ToMMo 333,
GA4K 502, deCODE 3622, HPRC v2 233, Kim PD 100 prelim. Short-read
comparators (abelSv, onekg3202Sr, tommoJpCnv) are left alone per user
instruction.

Also add three rows that were missing from lrSv.html's overview table:
aprSv (Arab APR 53), cpc1Sv (CPC 58, HPRC-specific SVs removed) and
abelSv (CCDG 17,795 Illumina short-read comparator). Updates the
comparator footnote to mention both short-read rows.

refs #36258

diff --git src/hg/makeDb/trackDb/human/lrSv.html src/hg/makeDb/trackDb/human/lrSv.html
index ebaced9d96e..11b1e7ea0b0 100644
--- src/hg/makeDb/trackDb/human/lrSv.html
+++ src/hg/makeDb/trackDb/human/lrSv.html
@@ -3,32 +3,33 @@
 This track collection contains structural variant (SV) calls derived from long-read sequencing studies. Structural variants are genomic rearrangements larger than ~50 bp, including deletions, insertions, duplications, inversions, and translocations. Long-read sequencing technologies can span repetitive regions and resolve complex rearrangements that are difficult to detect with short-read methods.

Available Datasets

SV length statistics (min / median / max) are computed from the svLen field of each track, in base pairs. Some tracks include sites with svLen=0 (complex events where the reference and alternate alleles differ in sequence but not in length).
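
As an illustration of the summary described above, here is a minimal sketch of how min / median / max could be derived from a list of svLen values. It assumes the values have already been extracted from a track as plain integers and that they are summarized as stored (zero or negative entries are not filtered). The function name `sv_len_stats` and the sample values are hypothetical and are not taken from the build scripts behind this table.

```python
from statistics import median


def sv_len_stats(sv_lens):
    """Return (min, median, max) of svLen values, in base pairs.

    Assumption: values are used exactly as stored in the svLen field,
    so svLen=0 sites are included in the summary.
    """
    values = list(sv_lens)
    if not values:
        raise ValueError("no svLen values supplied")
    return min(values), median(values), max(values)


if __name__ == "__main__":
    # Hypothetical svLen values, including one svLen=0 complex event.
    example = [50, 0, 154, 320, 1200]
    print(sv_len_stats(example))
```

Whether a given track's published statistics were computed with or without such zero-length sites is not stated here, so the filtering choice above should be treated as an assumption.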

-All subtracks below are long-read callsets, except the last row (1KG 3202,
-Illumina short-read), which is included as a short-read comparator.
+All subtracks below are long-read callsets, except the last two rows
+(CCDG 17,795 and 1KG 3202, both Illumina short-read), which are
+included as short-read comparators.

@@ -134,50 +135,80 @@
Dataset | N samples | Cohort / disease | Sequencing | SVs | Min | Median | Max
CoLoRSdb | 1,427 | | | 111,746 | 50 | 168 | 57,207,414
HGSVC3 | 65 | HGSVC3 diverse reference assemblies | PacBio HiFi + ONT | 176,531 | 50 | 154 | 30,176,500
Arab APR | 53 | UAE-resident Arabs from 8 countries (Arab Pangenome Reference) | PacBio HiFi + ONT + Hi-C (pangenome graph) | 72,656 | 12199,885
CPC | 58 | Chinese Pangenome Consortium, 36 minority ethnic groups (HPRC-specific SVs removed) | PacBio HiFi (pangenome graph) | 36,030 | 1538,998,096
Kim PD Brain | 100 | Parkinson's disease, ILBD, controls (post-mortem brain) | PacBio HiFi | 74,552 | 50 | 160 | 190,088,222
SVatalog 101 | 101 | Long-read WGS cohort for GWAS LD fine-mapping (SickKids) | long-read | 87,183 | 4 | 160 | 1,321,484
CCDG 17,795 (short-read) | 17,795 | NHGRI CCDG + PAGE + SGDP (short-read comparator) | Illumina short-read | 737,998 | -1 | -1 | 217,985,413
1KG 3202 (short-read) | 3,202 | 1000 Genomes expanded cohort (short-read comparator) | Illumina short-read | 173,366 | 1 | 314 | 154,807,729

CoLoRSdb SVs (colorsDbSv)

Structural variants from the Consortium of Long-Read Sequencing database