f058c8fe4601b223ff47468eb3525c05ccd03850 max Wed Apr 22 09:17:17 2026 -0700

srSv: new short-read SV supertrack, split out of lrSv

Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains only
long-read callsets. Filter fields (svType, svLen, insLen, AC) are
mirrored at the srSv supertrack level to keep the UX parallel to lrSv.

- trackDb: new human/srSv.ra with the three subtrack stanzas and updated
  /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas removed from
  human/lrSv.ra. human/trackDb.ra now includes srSv.ra. Also a new
  human/srSv.html overview page; the SR rows and SR-specific paragraphs
  removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and
  lrSv/{lrSv1kg3202Sr*,lrSvTommoJpCnvVcfToBedGraph.py} moved to
  scripts/srSv/ with git mv (history preserved) and renamed to drop the
  "lrSv" prefix. Internal path references in abelSvBuild.sh and
  abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and extended
  with the onekg3202Sr and tommoJpCnv sections moved from lrSv.txt.
  lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,lrSv/tommoJpCnv}
  moved to /hive/data/genomes/hg38/bed/srSv/*.
  /gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
  /gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.

refs #36258

diff --git src/hg/makeDb/trackDb/human/srSv.html src/hg/makeDb/trackDb/human/srSv.html
new file mode 100644
index 00000000000..252ace3e465
--- /dev/null
+++ src/hg/makeDb/trackDb/human/srSv.html
@@ -0,0 +1,105 @@
+
+This track collection contains structural variant (SV) and copy-number variant
+(CNV) callsets derived from Illumina short-read sequencing. Most SV
+tracks in the browser now come from long-read platforms (see the companion
+Long-read SVs supertrack); the short-read
+callsets here are included as comparators so users can evaluate the extra
+sensitivity of long-read calls and cross-check a variant across technologies.
+
+
+
+SV length statistics (min / median / max) are computed from the svLen
+field of each track, in base pairs. For the Abel CCDG callset, a large
+fraction of records are breakend (BND) translocations where svLen=-1
+is used as a sentinel, which shows up in both min and median.
+
| Dataset | N samples | Cohort / disease | Sequencing | SVs | Min | Median | Max |
|---|---|---|---|---|---|---|---|
| CCDG 17,795 | 17,795 | NHGRI CCDG + PAGE + SGDP (B38 native + B37 lifted) | Illumina short-read (LUMPY + CNVnator + svtyper) | 737,998 | -1 | -1 | 217,985,413 |
| 1KG 3202 | 3,202 | 1000 Genomes expanded cohort | Illumina short-read (GATK-SV) | 173,366 | 1 | 314 | 154,807,729 |
| ToMMo 48K CNV | 48,874 | Japanese, general population | Illumina short-read (GATK CNV, 1 kb bins, shown as two bigWigs) | ~2M bins with CNV carriers; not comparable to per-SV counts above | | | |
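The effect of the svLen=-1 sentinel on the Min and Median columns can be illustrated with a minimal Python sketch. This is not the script used to build the table; the function name `sv_len_summary`, its `sentinel` and `drop_sentinel` parameters, and the toy values are all hypothetical and chosen only for illustration.

```python
from statistics import median


def sv_len_summary(sv_lens, sentinel=-1, drop_sentinel=False):
    """Return (min, median, max) of a list of svLen values in base pairs.

    `sentinel` and `drop_sentinel` are hypothetical parameters: when
    drop_sentinel is True, records whose svLen equals the sentinel
    (e.g. BND records) are excluded before summarizing.
    """
    values = [v for v in sv_lens if not (drop_sentinel and v == sentinel)]
    if not values:
        raise ValueError("no svLen values to summarize")
    return min(values), median(values), max(values)


# Made-up toy input, NOT taken from the real callset:
# three BND-style records (svLen = -1) and two sized SVs.
toy = [-1, -1, -1, 350, 12000]

with_bnd = sv_len_summary(toy)                         # sentinel included
without_bnd = sv_len_summary(toy, drop_sentinel=True)  # sentinel excluded

print(with_bnd)
print(without_bnd)
```

When more than half of the records carry the sentinel, as in the toy input, both the minimum and the median collapse to -1, which matches the pattern described for the CCDG row.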
+
+
+Site-frequency callset from 17,795 deeply sequenced genomes (Abel et al. 2020,
+Nature; PMID 32460305). Two non-overlapping public releases are combined in
+this track: the B38 callset (14,623 samples called natively on GRCh38) and the
+B37 callset (8,417 samples, lifted). Variants are colored by SV type
+(DEL / DUP / INV / MEI / BND) and carry per-population allele counts for eight
+ancestry groups plus a HIGH/LOW confidence filter.
+
+
+
+1000 Genomes 3202-sample Illumina short-read GATK-SV callset (Byrska-Bishop
+et al. 2022). 173,366 SVs across 7 classes (DEL, INS, DUP, INV, CPX, CNV,
+CTX) with AC/AN/AF and per-superpopulation AFs (AFR/AMR/ASN/EUR/SAN).
+
+
+
+Per-1 kb-bin copy-number carrier counts from short-read whole-genome
+sequencing of 48,874 Japanese individuals (jMorp 48KJPN-CNV Frequency Panel,
+release 20230828), called with GATK CNV germline workflows. Shown as a
+multiWig overlay: red = samples with copy-number loss (CN<2) per bin,
+green = samples with gain (CN>2) per bin. This is a useful short-read
+point of comparison to the ToMMo 333-sample long-read SV track under the
+Long-read SVs supertrack.
+
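The loss/gain definition above (CN<2 and CN>2 per bin) can be sketched in Python as follows. This is an illustration only and not the conversion script from this commit: the function name `carrier_counts` and the per-sample input layout are assumptions, and the sketch assumes a baseline copy number of 2 for every bin, as the CN<2 / CN>2 thresholds imply.

```python
def carrier_counts(cn_by_sample):
    """Count loss and gain carriers per bin.

    cn_by_sample is a hypothetical input layout: one list per sample,
    each holding an integer copy number for every bin (same bin order).
    Returns (loss, gain), two lists with one carrier count per bin.
    """
    n_bins = len(cn_by_sample[0]) if cn_by_sample else 0
    loss = [0] * n_bins
    gain = [0] * n_bins
    for sample in cn_by_sample:
        if len(sample) != n_bins:
            raise ValueError("all samples must cover the same bins")
        for i, cn in enumerate(sample):
            if cn < 2:      # copy-number loss relative to the assumed baseline of 2
                loss[i] += 1
            elif cn > 2:    # copy-number gain relative to the assumed baseline of 2
                gain[i] += 1
    return loss, gain


# Made-up toy input: 3 samples x 4 bins, not real panel data.
toy = [
    [2, 1, 3, 2],
    [2, 0, 2, 4],
    [2, 2, 3, 2],
]
loss, gain = carrier_counts(toy)
print(loss)
print(gain)
```

Each of the two resulting per-bin series corresponds conceptually to one of the two bigWigs in the overlay: one value per bin for loss carriers and one for gain carriers.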
+
+
+See the Data Access section of each subtrack's page for download links.
+Build documentation lives alongside the scripts at
+doc/hg38/srSv.txt; conversion scripts and autoSql schemas are at
+makeDb/scripts/srSv.
+
+
+
+Each subtrack credits its respective upstream project; see the individual
+description pages.
+
+
+
+See the individual subtrack description pages for the specific references.
+