bac95a147f49cd331052e597006e04b3deee40fc max Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups

Add human-readable labels to the supertrack-level svType filter on both
the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what hgTracks
shows next to each feature.

Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py helpers,
consistent insLen / svLen / AC column naming, tightened field-description
text) that had been piling up as an unstaged working tree.

refs #36258

diff --git src/hg/makeDb/trackDb/human/hgsvc3Sv.html src/hg/makeDb/trackDb/human/hgsvc3Sv.html
index df83bb19c5a..f262fc2bbfe 100644
--- src/hg/makeDb/trackDb/human/hgsvc3Sv.html
+++ src/hg/makeDb/trackDb/human/hgsvc3Sv.html
@@ -1,138 +1,155 @@

Description

This track shows structural variants (SVs) from the third phase of the Human Genome Structural Variation Consortium (HGSVC3). The callset comes from 65 diverse individuals across five continental groups, each sequenced with PacBio HiFi (~47x), Oxford Nanopore ultra-long reads (~56x) and complemented with Strand-seq, optical mapping, Hi-C and Iso-Seq for haplotype-resolved assembly. SVs were discovered from the de novo assemblies with PAV v2.4.0.1 and cross-validated by ten additional orthogonal callers.

The track merges the two final SV annotation tables from the HGSVC3 v1.0 release on GRCh38: 176,232 insertions/deletions and 300 inversions, for a total of 176,532 SVs. Each row is a site-level variant with the list of carrier haplotypes and additional structural annotations.

The same track is also available natively on the T2T-CHM13 (hs1) assembly: HGSVC3 independently aligned all haplotype-resolved assemblies to both GRCh38 and T2T-CHM13 and released a separate set of annotation tables per reference. The hs1 track is built directly from the HGSVC3 T2T-CHM13 annotation tables (188,224 DEL+INS and 276 INV; 188,500 SVs total) — no liftOver is involved.

Display Conventions and Configuration

Items are colored by SV type:

Insertions are placed at the insertion site with a width of 1 bp; deletions and inversions span the affected reference interval. Filters are available for SV type, SV length, carrier-haplotype count, distinct sample count, whether the site falls in a Tandem Repeat Finder region and the fraction of the variant overlapping segmental duplications.
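As a rough illustration of the placement rule above, the following Python sketch maps an SV record to its displayed interval. It assumes 0-based, half-open BED coordinates; the function name display_interval is hypothetical and the logic is not taken from the converter scripts under scripts/lrSv/.

```python
def display_interval(sv_type, start, end):
    """Return the (chromStart, chromEnd) pair used for display.

    Hypothetical helper, not the actual converter code. Assumes `start`
    and `end` are 0-based, half-open reference coordinates.
    """
    if sv_type == "INS":
        # Insertions have no reference span; show them as 1 bp at the site.
        return start, start + 1
    if sv_type in ("DEL", "INV"):
        # Deletions and inversions cover the affected reference interval.
        return start, end
    raise ValueError(f"unexpected SV type: {sv_type!r}")
```

With this convention an insertion at position 1000 is drawn as the interval 1000 to 1001, while a deletion keeps its full reference span.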

The detail page shows, where available:

Methods

-HGSVC3 produced haplotype-resolved de novo assemblies for 65 samples
-spanning five continental groups. Assemblies were built from PacBio HiFi
-and Oxford Nanopore reads, phased with Strand-seq and further validated
-with Hi-C and optical mapping. Structural variants were called by aligning
-each haplotype back to the reference with PAV v2.4.0.1; calls were then
-cross-referenced with ten independent callers. The final annotation tables
-(this track's input) include merge statistics (MERGE_RO, MERGE_OFFSET,
-MERGE_SZRO, MERGE_OFFSZ, MERGE_MATCH) that describe how well each
-per-sample call matched the merged consensus site.
+Logsdon et al. 2025 produced fully phased hybrid de novo assemblies for 65
+diverse individuals (63 from 1kGP, NA21487 from HapMap, and HG002 from
+GIAB), using PacBio HiFi (Sequel II/Revio, 30-h movies), Oxford Nanopore
+ultra-long sequencing (R9.4.1 PromethION, 96-h runs), Bionano optical
+mapping (DLE-1 on Saphyr 2nd-gen), Strand-seq, Hi-C (Proximo) and Iso-Seq.
+Assemblies were generated with Verkko v1.4.1 (primary) and hifiasm-UL
+v0.19.6 (complementary, especially for centromeres and Yq12), phased with
+the Graphasing pipeline v0.3.1-alpha, and produced 130 haplotype
+assemblies with median N50 of 130 Mbp that close 92% of previous assembly
+gaps (39% of chromosomes at telomere-to-telomere status). SVs were called
+against GRCh38 and T2T-CHM13 with PAV v2.4.1 (plus DipCall and SVIM-asm
+from the same alignments) and cross-validated with an additional ten
+callers (PBSV, Sniffles, Delly, cuteSV, DeBreak, SVIM, DeepVariant,
+Clair3, PEPPER-Margin-DeepVariant for ONT and MELT-LRA/PALMER2 for MEIs).
+Calls were merged with SV-Pop and centromere-satellite / telomere hits
+were filtered. The final GRCh38 release contains 176,232 DEL+INS plus 300
+INV (176,532 SVs total); the T2T-CHM13 release contains 188,224 DEL+INS
+plus 276 INV (188,500 SVs total).

-Two tables were merged for display here:
-variants_GRCh38_sv_insdel_HGSVC2024v1.0.tsv.gz (DEL + INS, 176,232
-records) and variants_GRCh38_sv_inv_HGSVC2024v1.0.tsv.gz (INV, 300
-records). Type-specific columns (HOM_REF/HOM_TIG/TE for insdel;
-RGN_REF_INNER for inversions) are shown as empty on the detail page when
-they do not apply.
+For display, the two final HGSVC3 v1.0 annotation tables
+variants_GRCh38_sv_insdel_HGSVC2024v1.0.tsv.gz and
+variants_GRCh38_sv_inv_HGSVC2024v1.0.tsv.gz were downloaded from
+the
+IGSR HGSVC3 GRCh38 release directory and merged into a single bigBed.
+The hs1 version uses the parallel
+variants_T2T-CHM13_sv_insdel_HGSVC2024v1.0.tsv.gz and
+variants_T2T-CHM13_sv_inv_HGSVC2024v1.0.tsv.gz tables from the
+
+HGSVC3 T2T-CHM13 release directory; no liftOver is involved on hs1.
+Type-specific columns (HOM_REF/HOM_TIG/TE for insdel; RGN_REF_INNER for
+inversions) are empty on the detail page when they do not apply.

-The hs1 (T2T-CHM13) version of this track uses the same merge pipeline on
-the HGSVC3 T2T-CHM13 tables
-(variants_T2T-CHM13_sv_insdel_HGSVC2024v1.0.tsv.gz and
-variants_T2T-CHM13_sv_inv_HGSVC2024v1.0.tsv.gz) downloaded from
-
-the HGSVC3 T2T-CHM13 release directory.
+The step-by-step build commands (download, format conversion, bigBed build)
+are recorded in the UCSC makeDoc for this track container:
+
+doc/hg38/lrSv.txt and
+
+doc/hs1/lrSv.txt. The conversion scripts and autoSql schemas live in
+
+makeDb/scripts/lrSv.

Data Access

The data can be explored interactively in table format with the Table Browser or the Data Integrator, and accessed programmatically through our API, track=hgsvc3Sv.
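For programmatic access, a request to the UCSC REST API might look like the Python sketch below. It uses the public getData/track endpoint with track=hgsvc3Sv; the helper names and any region you pass in are illustrative assumptions, and the shape of the returned JSON is not described on this page, so inspect it before relying on specific keys.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def track_url(genome, track, chrom, start, end):
    """Build a getData/track request URL for one region."""
    params = [
        ("genome", genome),
        ("track", track),
        ("chrom", chrom),
        ("start", start),
        ("end", end),
    ]
    # Semicolon-separated parameters, as in the UCSC API help examples.
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params)
    return f"{API_BASE}?{query}"


def fetch_track(genome, track, chrom, start, end):
    """Fetch and decode the JSON response (requires network access)."""
    with urlopen(track_url(genome, track, chrom, start, end)) as response:
        return json.load(response)
```

Calling fetch_track("hg38", "hgsvc3Sv", "chr21", 0, 100000) would request the items overlapping that window on hg38; substituting "hs1" as the genome should address the T2T-CHM13 version, assuming the track name is the same there.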

The bigBed is available from our download server for both assemblies:

Example: bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/hgsvc3.bb -chrom=chr21 -start=0 -end=100000000 stdout.
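The same command can be driven from a script. The sketch below is a minimal Python wrapper around the bigBedToBed invocation shown above; it assumes the bigBedToBed binary is on the PATH, and the helper names are hypothetical.

```python
import subprocess


def bigbed_to_bed_cmd(bigbed, chrom, start, end, out="stdout"):
    """Return the argument list for a region-restricted bigBedToBed call."""
    return [
        "bigBedToBed",
        bigbed,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        out,
    ]


def read_region(bigbed, chrom, start, end):
    """Run bigBedToBed and return rows as lists of tab-separated fields.

    Requires the bigBedToBed binary and, for remote files, network access.
    """
    cmd = bigbed_to_bed_cmd(bigbed, chrom, start, end)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return [line.split("\t") for line in result.stdout.splitlines()]
```

For instance, read_region("http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/hgsvc3.bb", "chr21", 0, 100000000) reproduces the example above and returns one list per SV record.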

The original annotation tables are available from the HGSVC3 release on the IGSR FTP site.

Credits

Thanks to the Human Genome Structural Variation Consortium (HGSVC) and all participating sequencing and analysis centers for making the HGSVC3 annotation tables publicly available.

References

Logsdon GA, Ebert P, Audano PA, Loftus M, Porubsky D, Ebler J, Yilmaz F, Hallast P, Prodanov T, Yoo D et al. Complex genetic variation in nearly complete human genomes. Nature. 2025 Aug;644(8076):430-441. PMID: 40702183; PMC: PMC12350169