bac95a147f49cd331052e597006e04b3deee40fc max Wed Apr 22 10:43:20 2026 -0700

lrSv/srSv: human-readable SV type filter labels, script cleanups

Add human-readable labels to the supertrack-level svType filter on both the lrSv and srSv supertracks using the "CODE|CODE (Long name)" filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)", etc. Labels keep the short code up front so users can match what hgTracks shows next to each feature.

Also sweep in the in-progress converter/as-file cleanups under scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py helpers, consistent insLen / svLen / AC column naming, tightened field-description text) that had been piling up as an unstaged working tree.

refs #36258

diff --git src/hg/makeDb/trackDb/human/decodeSv.html src/hg/makeDb/trackDb/human/decodeSv.html
index c5ba7bf869d..e5940527f8c 100644
--- src/hg/makeDb/trackDb/human/decodeSv.html
+++ src/hg/makeDb/trackDb/human/decodeSv.html
@@ -1,94 +1,112 @@
This track shows high-confidence structural variants (SVs) identified by Oxford Nanopore long-read sequencing of 3,622 Icelanders recruited through the deCODE genetics population cohort. The release contains 133,886 SVs (55,649 deletions, 75,050 insertions and 3,187 combined insertion/deletion events). Variants are site-level (no per-sample genotypes) and have been filtered to a high-confidence subset validated in the accompanying population-scale analysis.
Note that this release does not include allele counts or allele frequencies: each row represents a site that was called with high confidence in the cohort, but the number of carrier samples is not provided, so the track cannot be filtered by AF/AC.
Items are colored by SV type:
Insertions are placed at the insertion site with a width of 1 bp; deletions span the deleted interval; INSDEL events span the affected reference region and have SVLEN=0 because the reference and alternate alleles differ in both sequence and length. Filters are available for SV type and SV length.
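The placement rules above can be sketched as a small coordinate mapping. This is only an illustration, not the code from the track's conversion scripts: the helper name `sv_to_bed_interval` is hypothetical, the type codes `INS`, `DEL` and `INSDEL` are assumed to match the source VCF, and the interval is assumed to come directly from the VCF POS and END values (1-based, inclusive) with no anchor-base adjustment.

```python
def sv_to_bed_interval(pos, end, sv_type):
    """Map a VCF POS/END pair (1-based, inclusive) to BED coordinates
    (0-based, half-open) following the display rules described above.

    Hypothetical helper for illustration only; the real converter may
    treat the anchor base differently.
    """
    if pos < 1 or end < pos:
        raise ValueError("expected 1 <= pos <= end")
    start = pos - 1  # 1-based inclusive -> 0-based start
    if sv_type == "INS":
        # Insertions: drawn as a 1 bp feature at the insertion site.
        return start, start + 1
    if sv_type in ("DEL", "INSDEL"):
        # Deletions and INSDELs: span the affected reference interval.
        return start, end
    raise ValueError(f"unexpected SV type: {sv_type!r}")
```

Under these assumptions an insertion always yields a 1 bp item regardless of the inserted length, which is why the SV length filter is more informative than the item width for insertions.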
Where a variant falls inside an annotated tandem-repeat region, the detail page also shows the coordinates of that region (TRRBEGIN / TRREND from the source VCF), which can be useful context for repeat-mediated insertions and deletions.
-Oxford Nanopore whole-genome sequencing was performed on 3,622 Icelandic
-participants enrolled through deCODE genetics. Reads were aligned to
-GRCh38 and structural variants were called and merged across the cohort
-following the pipeline described in Beyter et al. (2021), which combined
-multiple callers and a joint reassessment of candidate variants against
-the long reads. The high-confidence set released here corresponds to the
-filtered callset with strong read support and consistent representation
-across samples.
+Beyter et al. 2021 performed Oxford Nanopore long-read sequencing of 3,622
+Icelanders recruited through deCODE genetics and detected a median of
+22,636 SVs per individual (13,353 insertions and 9,474 deletions). Across
+the cohort they derived a set of 133,886 reliably genotyped SV alleles,
+imputed those alleles into 166,281 chip-typed Icelanders, and tested them
+for association with disease and quantitative traits (notably including a
+rare PCSK9 deletion associated with lower LDL-cholesterol and a
+multi-allelic 57-bp VNTR in ACAN associated with adult height). The
+track shown here displays the 133,886 high-confidence SV sites: 55,649
+deletions, 75,050 insertions and 3,187 combined insertion/deletion events.
+The release is site-only (no per-sample genotypes or allele frequencies),
+so the track cannot be filtered by AF/AC.
+
+
+The VCF ont_sv_high_confidence_SVs.sorted.vcf.gz was downloaded
+from the deCODE genetics
+
+LRS_SV_sets GitHub repository.
+
+
+The step-by-step build commands (download, format conversion, bigBed build)
+are recorded in the UCSC makeDoc for this track container:
+
+doc/hg38/lrSv.txt. The conversion scripts and autoSql schemas live in
+
+makeDb/scripts/lrSv.
The data can be explored interactively in table format with the Table Browser or the Data Integrator and exported from there to spreadsheets or tab-separated tables. From scripts, the data can be accessed through our API with track=decodeSv.
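As a rough sketch of scripted access, the snippet below builds a request URL for the UCSC REST API `getData/track` endpoint and, optionally, fetches the JSON. The helper names `track_data_url` and `fetch_track_json` are hypothetical. The layout of the returned JSON is not described here, so inspect the response before relying on any particular key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome, track, chrom=None, start=None, end=None):
    """Build a getData/track URL; start/end are only sent together with chrom."""
    params = {"genome": genome, "track": track}
    if chrom is not None:
        params["chrom"] = chrom
        if start is not None and end is not None:
            params["start"] = start
            params["end"] = end
    return f"{API_BASE}/getData/track?{urlencode(params)}"


def fetch_track_json(url, timeout=60):
    """Download and decode the JSON response (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    url = track_data_url("hg38", "decodeSv", chrom="chr21", start=0, end=1_000_000)
    print(url)
    # data = fetch_track_json(url)  # uncomment to query the live API
```

Restricting the request to a chromosome and range keeps the response small; for bulk access the bigBed file is likely the better route.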
The annotation is stored as a bigBed file that can be downloaded from our download server as decodeSv.bb. Individual regions or the whole annotation can be obtained with the bigBedToBed utility, available from our utilities page. Example: bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/decodeSv.bb -chrom=chr21 -start=0 -end=100000000 stdout.
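The same bigBedToBed call can be driven from a script. The sketch below assembles the command shown in the example above and, if the utility is installed, runs it and splits the output into BED fields. The helper names `bigbed_to_bed_command` and `run_bigbed_to_bed` are hypothetical, and the code assumes `bigBedToBed` is on the `PATH`.

```python
import shutil
import subprocess

DECODE_SV_URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/decodeSv.bb"


def bigbed_to_bed_command(url, chrom=None, start=None, end=None, out="stdout"):
    """Return the bigBedToBed argument list, mirroring the documented example."""
    cmd = ["bigBedToBed", url]
    if chrom is not None:
        cmd.append(f"-chrom={chrom}")
    if start is not None:
        cmd.append(f"-start={start}")
    if end is not None:
        cmd.append(f"-end={end}")
    cmd.append(out)  # "stdout" streams the BED rows instead of writing a file
    return cmd


def run_bigbed_to_bed(url, **region):
    """Run bigBedToBed and return rows as lists of tab-separated fields."""
    if shutil.which("bigBedToBed") is None:
        raise FileNotFoundError("bigBedToBed not found on PATH")
    cmd = bigbed_to_bed_command(url, **region)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return [line.split("\t") for line in result.stdout.splitlines()]


if __name__ == "__main__":
    print(" ".join(bigbed_to_bed_command(DECODE_SV_URL, chrom="chr21", start=0, end=100000000)))
```

Passing the arguments as a list avoids shell quoting issues, and omitting `chrom`, `start` and `end` requests the whole annotation.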
The original VCF is available from the deCODE genetics LRS_SV_sets GitHub repository.
Thanks to the deCODE genetics team and the Icelandic study participants for making this dataset publicly available.
Beyter D, Ingimundardottir H, Oddsson A, Eggertsson HP, Bjornsson E, Jonsson H, Atlason BA, Kristmundsdottir S, Mehringer S, Hardarson MT et al. Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits. Nat Genet. 2021 Jun;53(6):779-786. PMID: 33972781