bac95a147f49cd331052e597006e04b3deee40fc max Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups

Add human-readable labels to the supertrack-level svType filter on both
the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.

Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py helpers,
consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged working
tree.

refs #36258

diff --git src/hg/makeDb/trackDb/human/lrSv1kgOnt.html src/hg/makeDb/trackDb/human/lrSv1kgOnt.html
index b678593dd8c..9a379e674a6 100644
--- src/hg/makeDb/trackDb/human/lrSv1kgOnt.html
+++ src/hg/makeDb/trackDb/human/lrSv1kgOnt.html
@@ -2,84 +2,104 @@
This track shows structural variants (SVs) identified by Oxford Nanopore long-read sequencing of 1,019 individuals from the 1000 Genomes Project, representing 26 populations across 5 continental regions: Africa (275 samples), East Asia (192), South Asia (199), Europe (189), and Americas (164). Median sequencing coverage was 16.9x per sample with a median N50 read length of 20.3 kb.
SVs were discovered using the SAGA framework (SV Analysis by Graph Augmentation) and annotated with SVAN, which classifies insertions and deletions by their mechanism of origin. The dataset contains 161,332 annotated SVs, including 75,324 insertions, 66,192 deletions, and 19,816 complex rearrangements. The original coordinates are on the T2T-CHM13 assembly (hs1); for GRCh38 (hg38), coordinates were converted using liftOver (148,375 records mapped successfully).
+
+The 1,019 samples sequenced here are distinct from those in the
+1KG ONT 100 track (Gustafson et al. 2024);
+the two releases were produced by separate consortia (Vienna and the 1000 Genomes
+ONT Sequencing Consortium, respectively) and there is no sample overlap between
+the two.
+
Items are colored by SV class:
-Filters are available for SV class, insertion/deletion type, transposon family,
+Filters are available for SV type, insertion/deletion type, transposon family,
and SV length. For insertions, the item is placed at the insertion site with a
width of 1 bp; for deletions, the item spans the deleted region.
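The placement rule above (1 bp at the insertion site, full span for deletions) can be sketched in Python as follows. This is only an illustration, not the converter from makeDb/scripts/lrSv: the function name `sv_to_bed_interval` is hypothetical, and the sketch assumes VCF-style input where POS is the 1-based anchor base preceding the event and END is the 1-based last affected base. Which base the real converter uses for the 1-bp insertion item is an assumption here.

```python
def sv_to_bed_interval(sv_type, pos, end=None):
    """Return a 0-based, half-open (start, end) BED interval for one SV.

    Hypothetical helper. Assumes VCF conventions: pos is the 1-based
    anchor base before the event, end is the 1-based last affected base.
    """
    if sv_type == "INS":
        # Assumption: the 1-bp item covers the anchor base at the insertion site.
        return pos - 1, pos
    if sv_type == "DEL":
        if end is None or end <= pos:
            raise ValueError("a deletion needs end > pos")
        # Deleted bases are pos+1..end (1-based), i.e. [pos, end) in BED terms.
        return pos, end
    raise ValueError(f"unsupported SV type in this sketch: {sv_type}")


if __name__ == "__main__":
    print(sv_to_bed_interval("INS", 1000))
    print(sv_to_bed_interval("DEL", 1000, 1500))
```

Under these assumptions an insertion always yields an item of width 1, and a deletion item's width equals the number of deleted bases.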
The detail page for each item shows SVAN annotation fields including:
-Oxford Nanopore sequencing was performed on 1,019 samples from the 1000 Genomes
-Project. Base-calling was done with Guppy 6.2.1. SVs were discovered using
-the SAGA framework, which combines:
-
-Variants were annotated with SVAN (SV Annotator v1.3), which leverages allelic
-representations and genomic annotations to classify SVs by mechanism. SVAN
-annotated 96.0% of insertions, 32.2% of deletions, and 57.1% of complex sites.
+The SVAN-annotated unphased VCF (final-vcf.unphased.SVAN_1.3.vcf.gz)
+was downloaded from
+
+the IGSR 1KG_ONT_VIENNA v1.1 SVAN-annotation directory; allele counts
+were added from the companion shapeit5-phased-callset
+(shapeit5-phased-callset_final-vcf.phased.vcf.gz) in the same
+release tree.
-The original SV coordinates are on the T2T-CHM13 assembly (hs1). For the GRCh38
-(hg38) version of this track, coordinates were converted using liftOver; 148,375
-of 161,332 records mapped successfully (~92%). The hs1 version contains all
-161,332 records at their native coordinates.
+The step-by-step build commands (download, liftOver, format conversion,
+bigBed build) are recorded in the UCSC makeDoc for this track container:
+
+doc/hg38/lrSv.txt. The conversion scripts and autoSql schemas live in
+
+makeDb/scripts/lrSv.
Source data is available from the 1000 Genomes ONT Vienna data collection at IGSR.
Thanks to the 1000 Genomes ONT Vienna consortium for making their structural variant calls and SVAN annotations publicly available.
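For reference, the "CODE|CODE (Long name)" filterValues labels described in the commit message could be generated roughly as below. This is a sketch under stated assumptions, not the code used in this commit: the helper `filter_values` and the `SV_TYPE_NAMES` table are hypothetical, only the two codes spelled out in the commit message are included, and joining entries with commas on a `filterValues.svType` line is assumed from the general trackDb convention.

```python
# Hypothetical code-to-name table; only DEL and INS are named in the commit message.
SV_TYPE_NAMES = {"DEL": "Deletion", "INS": "Insertion"}


def filter_values(names):
    """Build 'CODE|CODE (Long name)' entries, joined by commas (assumed separator)."""
    return ",".join(f"{code}|{code} ({name})" for code, name in names.items())


if __name__ == "__main__":
    # The field name svType is taken from the commit message.
    print("filterValues.svType " + filter_values(SV_TYPE_NAMES))
```

Keeping the short code at the start of each label matches the stated goal that users can match the filter entry to what hgTracks shows next to each feature.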