bac95a147f49cd331052e597006e04b3deee40fc max Wed Apr 22 10:43:20 2026 -0700

lrSv/srSv: human-readable SV type filter labels, script cleanups

Add human-readable labels to the supertrack-level svType filter on both
the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what hgTracks
shows next to each feature.

Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py helpers,
consistent insLen / svLen / AC column naming, tightened field-description
text) that had been piling up as an unstaged working tree.

refs #36258

diff --git src/hg/makeDb/trackDb/human/hgsvc3Sv.html src/hg/makeDb/trackDb/human/hgsvc3Sv.html
index df83bb19c5a..f262fc2bbfe 100644
--- src/hg/makeDb/trackDb/human/hgsvc3Sv.html
+++ src/hg/makeDb/trackDb/human/hgsvc3Sv.html
@@ -51,55 +51,72 @@ deletions only).
  • Inner Inversion Region: for inversions, the coordinate range of the inner inverted sequence, distinct from the outer breakpoint interval.
  • Transposable Element: when the inserted or deleted sequence was classified as a known TE family.
  • Segmental Duplication Overlap: fraction of the variant interval overlapping UCSC segmental duplications in the reference.
  • Carrier Haplotypes: full list of haplotype IDs (e.g. HG00096-h1, HG00096-h2, HG00514-un) carrying the variant.
    Methods

-HGSVC3 produced haplotype-resolved de novo assemblies for 65 samples
-spanning five continental groups. Assemblies were built from PacBio HiFi
-and Oxford Nanopore reads, phased with Strand-seq and further validated
-with Hi-C and optical mapping. Structural variants were called by aligning
-each haplotype back to the reference with PAV v2.4.0.1; calls were then
-cross-referenced with ten independent callers. The final annotation tables
-(this track's input) include merge statistics (MERGE_RO, MERGE_OFFSET,
-MERGE_SZRO, MERGE_OFFSZ, MERGE_MATCH) that describe how well each
-per-sample call matched the merged consensus site.
+Logsdon et al. 2025 produced fully phased hybrid de novo assemblies for 65
+diverse individuals (63 from 1kGP, NA21487 from HapMap, and HG002 from
+GIAB), using PacBio HiFi (Sequel II/Revio, 30-h movies), Oxford Nanopore
+ultra-long sequencing (R9.4.1 PromethION, 96-h runs), Bionano optical
+mapping (DLE-1 on Saphyr 2nd-gen), Strand-seq, Hi-C (Proximo) and Iso-Seq.
+Assemblies were generated with Verkko v1.4.1 (primary) and hifiasm-UL
+v0.19.6 (complementary, especially for centromeres and Yq12), phased with
+the Graphasing pipeline v0.3.1-alpha, and produced 130 haplotype
+assemblies with median N50 of 130 Mbp that close 92% of previous assembly
+gaps (39% of chromosomes at telomere-to-telomere status). SVs were called
+against GRCh38 and T2T-CHM13 with PAV v2.4.1 (plus DipCall and SVIM-asm
+from the same alignments) and cross-validated with an additional ten
+callers (PBSV, Sniffles, Delly, cuteSV, DeBreak, SVIM, DeepVariant,
+Clair3, PEPPER-Margin-DeepVariant for ONT and MELT-LRA/PALMER2 for MEIs).
+Calls were merged with SV-Pop and centromere-satellite / telomere hits
+were filtered. The final GRCh38 release contains 176,232 DEL+INS plus 300
+INV (176,532 SVs total); the T2T-CHM13 release contains 188,224 DEL+INS
+plus 276 INV (188,500 SVs total).

-Two tables were merged for display here:
-variants_GRCh38_sv_insdel_HGSVC2024v1.0.tsv.gz (DEL + INS, 176,232
-records) and variants_GRCh38_sv_inv_HGSVC2024v1.0.tsv.gz (INV, 300
-records). Type-specific columns (HOM_REF/HOM_TIG/TE for insdel;
-RGN_REF_INNER for inversions) are shown as empty on the detail page when
-they do not apply.
+For display, the two final HGSVC3 v1.0 annotation tables
+variants_GRCh38_sv_insdel_HGSVC2024v1.0.tsv.gz and
+variants_GRCh38_sv_inv_HGSVC2024v1.0.tsv.gz were downloaded from
+the
+IGSR HGSVC3 GRCh38 release directory and merged into a single bigBed.
+The hs1 version uses the parallel
+variants_T2T-CHM13_sv_insdel_HGSVC2024v1.0.tsv.gz and
+variants_T2T-CHM13_sv_inv_HGSVC2024v1.0.tsv.gz tables from the
+
+HGSVC3 T2T-CHM13 release directory; no liftOver is involved on hs1.
+Type-specific columns (HOM_REF/HOM_TIG/TE for insdel; RGN_REF_INNER for
+inversions) are empty on the detail page when they do not apply.

-The hs1 (T2T-CHM13) version of this track uses the same merge pipeline on
-the HGSVC3 T2T-CHM13 tables
-(variants_T2T-CHM13_sv_insdel_HGSVC2024v1.0.tsv.gz and
-variants_T2T-CHM13_sv_inv_HGSVC2024v1.0.tsv.gz) downloaded from
-
-the HGSVC3 T2T-CHM13 release directory.
+The step-by-step build commands (download, format conversion, bigBed build)
+are recorded in the UCSC makeDoc for this track container:
+
+doc/hg38/lrSv.txt and
+
+doc/hs1/lrSv.txt. The conversion scripts and autoSql schemas live in
+
+makeDb/scripts/lrSv.
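
The table merge described in Methods (an insdel table and an inv table combined, with type-specific columns left empty where they do not apply) can be illustrated with a small sketch. This is not the converter from makeDb/scripts/lrSv: the helper names `read_tsv` and `merge_rows`, the `ID` and `SVTYPE` columns, and all row values are hypothetical toy data. Only HOM_REF, HOM_TIG, TE and RGN_REF_INNER are column names taken from the text.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def merge_rows(*tables):
    """Concatenate tables over the union of their columns.

    Column order follows first appearance; a column that a row's source
    table lacks is filled with the empty string.
    """
    columns = []
    for rows in tables:
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)
    merged = [
        {name: row.get(name, "") for name in columns}
        for rows in tables
        for row in rows
    ]
    return columns, merged


# Toy stand-ins for the insdel and inv tables (values are invented).
INSDEL = "ID\tSVTYPE\tHOM_REF\tHOM_TIG\tTE\nsv1\tDEL\t0,5\t0,5\tAluY\n"
INV = "ID\tSVTYPE\tRGN_REF_INNER\nsv2\tINV\tchr1:100-200\n"

columns, merged = merge_rows(read_tsv(INSDEL), read_tsv(INV))
```

For the real `.tsv.gz` files, the text could presumably be read with `gzip.open(path, "rt")` before parsing; the actual schema and bigBed conversion are defined by the scripts and autoSql files referenced above.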

    Data Access

    The data can be explored interactively in table format with the Table Browser or the Data Integrator, and accessed programmatically through our API, track=hgsvc3Sv.
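
As a rough illustration of the programmatic route, the sketch below builds a request URL for the UCSC REST API `getData/track` endpoint with `track=hgsvc3Sv`. The helper name `track_data_url` and the example region are hypothetical, and the parameter set (`genome`, `track`, `chrom`, `start`, `end`) is assumed from the public API help page rather than from this track's documentation. The code only constructs the URL; it does not fetch anything.

```python
from urllib.parse import urlencode

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome, track="hgsvc3Sv", chrom=None, start=None, end=None):
    """Build a getData/track URL; start and end are only used with a chrom."""
    params = {"genome": genome, "track": track}
    if chrom is not None:
        params["chrom"] = chrom
        if start is not None and end is not None:
            params["start"] = start
            params["end"] = end
    elif start is not None or end is not None:
        raise ValueError("start/end require chrom")
    return f"{API_BASE}/getData/track?{urlencode(params)}"


# Example region chosen arbitrarily for illustration.
example_url = track_data_url("hg38", chrom="chr1", start=1000000, end=2000000)
```

The resulting URL can be opened with any HTTP client (for example `urllib.request.urlopen`), and the response is JSON. Passing `"hs1"` as the genome would target the T2T-CHM13 version, assuming the track is exposed there under the same name.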

    The bigBed is available from our download server for both assemblies: