9a11061ca6b40fe16bdfd09b1af53192f6c7c85b
max
Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
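
For illustration only, the "explode multi-allelic rows, filter to SV-sized or INV" step noted for lrSvHprc2VcfToBed.py could look roughly like the sketch below. This is not the script's actual code. The 50 bp cutoff, the INFO key name INV, and the function name are assumptions; the commit message states none of them.

```python
# Hypothetical sketch; not the code in lrSvHprc2VcfToBed.py.
MIN_SV_LEN = 50  # assumed SV-size cutoff; the commit message does not give one


def explode_and_filter(vcf_lines, min_len=MIN_SV_LEN):
    """Yield (chrom, pos, ref, alt, svtype) for each SV-sized or INV allele."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts, info = (
            fields[0], int(fields[1]), fields[3], fields[4], fields[7])
        # Assumption: inversions are marked by an INFO key named INV
        # (either a bare flag or a key=value pair).
        is_inv = any(item.split("=")[0] == "INV" for item in info.split(";"))
        for alt in alts.split(","):  # one output row per ALT allele
            size = abs(len(alt) - len(ref))
            if is_inv:
                yield chrom, pos, ref, alt, "INV"
            elif size >= min_len:
                svtype = "INS" if len(alt) > len(ref) else "DEL"
                yield chrom, pos, ref, alt, svtype
```

Because the function is a generator over an iterable of lines, it can be fed an open file handle and never holds the whole VCF in memory, which is one way to read "streams" in the description above.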
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context)
This track shows structural variants (SVs) from the third phase of the
Human Genome Structural Variation Consortium (HGSVC3). The callset comes
from 65 diverse individuals across five continental groups, each sequenced
with PacBio HiFi (~47x), Oxford Nanopore ultra-long reads (~56x) and
complemented with Strand-seq, optical mapping, Hi-C and Iso-Seq for
haplotype-resolved assembly. SVs were discovered from the de novo assemblies
with PAV v2.4.0.1 and cross-validated by ten additional orthogonal callers.
The track merges the two final SV annotation tables from the HGSVC3 v1.0
release on GRCh38: 176,232 insertions/deletions and 300 inversions, for a
total of 176,532 SVs. Each row is a site-level variant with the list of
carrier haplotypes and additional structural annotations.
+The same track is also available natively on the T2T-CHM13 (hs1)
+assembly: HGSVC3 independently aligned all haplotype-resolved assemblies
+to both GRCh38 and T2T-CHM13 and released a separate set of annotation
+tables per reference. The hs1 track is built directly from the
+
+HGSVC3 T2T-CHM13 annotation tables (188,224 DEL+INS and 276 INV;
+188,500 SVs total) — no liftOver is involved.
+
Items are colored by SV type:
Display Conventions and Configuration
Insertions are placed at the insertion site with a width of 1 bp; deletions and inversions span the affected reference interval. Filters are available for SV type, SV length, carrier-haplotype count, distinct sample count, whether the site falls in a Tandem Repeat Finder region and the fraction
@@ -60,44 +69,58 @@
with Hi-C and optical mapping. Structural variants were called by aligning each haplotype back to the reference with PAV v2.4.0.1; calls were then cross-referenced with ten independent callers. The final annotation tables (this track's input) include merge statistics (MERGE_RO, MERGE_OFFSET, MERGE_SZRO, MERGE_OFFSZ, MERGE_MATCH) that describe how well each per-sample call matched the merged consensus site.
Two tables were merged for display here: variants_GRCh38_sv_insdel_HGSVC2024v1.0.tsv.gz (DEL + INS, 176,232 records) and variants_GRCh38_sv_inv_HGSVC2024v1.0.tsv.gz (INV, 300 records). Type-specific columns (HOM_REF/HOM_TIG/TE for insdel; RGN_REF_INNER for inversions) are shown as empty on the detail page when they do not apply.
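
As a rough sketch of the merge described here, the two tables can be read into one list of BED-like rows, with insertions drawn 1 bp wide and type-specific columns left empty where they do not apply. The column names #CHROM, POS, END, ID and SVTYPE, the assumption that POS is a 0-based start, and the function names are all illustrative guesses. Only HOM_REF, HOM_TIG, TE and RGN_REF_INNER come from the text, and this is not the actual conversion script.

```python
import csv

# Type-specific columns named in the text; all other column names are assumed.
INSDEL_ONLY = ("HOM_REF", "HOM_TIG", "TE")
INV_ONLY = ("RGN_REF_INNER",)


def to_bed_row(rec):
    """Convert one annotation record (a dict) to a BED-like list of strings."""
    start = int(rec["POS"])  # assumed to be a 0-based start coordinate
    # Insertions are drawn 1 bp wide; DEL and INV span the reference interval.
    end = start + 1 if rec["SVTYPE"] == "INS" else int(rec["END"])
    # Columns absent from a table come out as empty strings.
    extra = [rec.get(col, "") for col in INSDEL_ONLY + INV_ONLY]
    return [rec["#CHROM"], str(start), str(end), rec["ID"], rec["SVTYPE"]] + extra


def merge_tables(insdel_tsv, inv_tsv):
    """Read both TSV streams and return BED rows sorted by chrom and start."""
    rows = []
    for handle in (insdel_tsv, inv_tsv):
        for rec in csv.DictReader(handle, delimiter="\t"):
            rows.append(to_bed_row(rec))
    rows.sort(key=lambda r: (r[0], int(r[1])))
    return rows
```

Both arguments are open text handles, so the .tsv.gz files could be passed via gzip.open(path, "rt").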
+
+The hs1 (T2T-CHM13) version of this track uses the same merge pipeline on
+the HGSVC3 T2T-CHM13 tables
+(variants_T2T-CHM13_sv_insdel_HGSVC2024v1.0.tsv.gz and
+variants_T2T-CHM13_sv_inv_HGSVC2024v1.0.tsv.gz) downloaded from
+
+the HGSVC3 T2T-CHM13 release directory.
+
The data can be explored interactively in table format with the Table Browser or the Data Integrator, and accessed programmatically through our API, track=hgsvc3Sv.
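
A minimal sketch of programmatic access, assuming the public UCSC REST endpoint at api.genome.ucsc.edu/getData/track and its genome, track, chrom, start and end parameters. The helper names and the example region are illustrative, and the shape of the returned JSON is not shown because it is not described here.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the UCSC Genome Browser REST API.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def track_url(genome, track, chrom, start, end):
    """Build a getData/track request URL for one genomic region."""
    query = urllib.parse.urlencode({
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    })
    return f"{API_BASE}?{query}"


def fetch_track(genome, track, chrom, start, end):
    """Fetch the region and return the decoded JSON response (needs network)."""
    with urllib.request.urlopen(track_url(genome, track, chrom, start, end)) as resp:
        return json.load(resp)
```

For example, fetch_track("hg38", "hgsvc3Sv", "chr21", 0, 100000000) would request the same region used in the bigBedToBed example on this page.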
-The bigBed is available from
-our
-download server as hgsvc3.bb. Example:
-bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/hgsvc3.bb -chrom=chr21 -start=0 -end=100000000 stdout.
+The bigBed is available from our download server for both assemblies:
+
The original annotation tables are available from the HGSVC3 release on the IGSR FTP site.
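
The bigBedToBed invocation quoted in the Data Access text can also be assembled from Python, for instance to loop over several regions. The sketch below only builds the argument list. It assumes the bigBedToBed binary is installed and on the PATH, and it uses only the hg38 URL given on this page. The function name is illustrative.

```python
import shlex

# hg38 bigBed URL as given in the Data Access text.
HG38_BIGBED = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/hgsvc3.bb"


def bigbed_to_bed_args(url, chrom, start, end, out="stdout"):
    """Return the bigBedToBed argument list for one region."""
    return [
        "bigBedToBed",
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        out,
    ]


args = bigbed_to_bed_args(HG38_BIGBED, "chr21", 0, 100000000)
print(shlex.join(args))  # shell-quoted command line, ready to copy and paste
```

Passing the list to subprocess.run(args, check=True) would execute it without going through a shell.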
Thanks to the Human Genome Structural Variation Consortium (HGSVC) and all participating sequencing and analysis centers for making the HGSVC3 annotation tables publicly available.