bac95a147f49cd331052e597006e04b3deee40fc max Wed Apr 22 10:43:20 2026 -0700

lrSv/srSv: human-readable SV type filter labels, script cleanups

Add human-readable labels to the supertrack-level svType filter on both the lrSv and srSv supertracks using the "CODE|CODE (Long name)" filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)", etc. Labels keep the short code up front so users can match what hgTracks shows next to each feature.

Also sweep in the in-progress converter/as-file cleanups under scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py helpers, consistent insLen / svLen / AC column naming, tightened field-description text) that had been piling up as an unstaged working tree.

refs #36258

diff --git src/hg/makeDb/trackDb/human/chirmade101Sv.html src/hg/makeDb/trackDb/human/chirmade101Sv.html
index 3f78a6c24dc..1ffe94e15a3 100644
--- src/hg/makeDb/trackDb/human/chirmade101Sv.html
+++ src/hg/makeDb/trackDb/human/chirmade101Sv.html
@@ -1,97 +1,119 @@

Description

This track shows structural variants (SVs) identified by long-read whole-genome sequencing of 101 individuals, released together with the GWAS SVatalog web tool described in Chirmade et al. 2026. GWAS SVatalog computes and visualizes linkage disequilibrium between these SVs and GWAS-associated SNPs so that investigators can assess whether a SNP association signal may be tagging an underlying SV.

The table contains 87,183 SVs (42,435 deletions, 41,734 insertions, 1,394 duplications, 912 inversions, 708 complex events). Each SV is annotated with gene overlaps, GC content, repeat context, ClinGen haploinsufficiency / triplosensitivity scores, gnomAD per-gene constraint metrics (pLI, LOEUF, missense O/E), OMIM phenotype associations, ClinVar variant IDs, and overlaps with DGV, Decipher and ClinGen regional annotations.

Display Conventions and Configuration

Items are colored by SV type:

Filters are available for SV type, SV length and the number of overlapping genes. The detail page shows the full annotation row: gene-level constraint scores (per overlapping gene), ClinGen / Decipher / ClinVar region matches, OMIM phenotype annotations and gnomAD SV frequencies at >=90% reciprocal overlap. Because most genomic regions carry no clinical annotation, many columns will be blank for an arbitrary SV.
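As an illustration of the ">=90% reciprocal overlap" criterion, here is a minimal Python sketch of the usual definition: the shared span must cover at least the threshold fraction of both intervals. The source does not say how the annotation pipeline computes the match, so the function names and the half-open interval convention are assumptions made for this example only.

```python
def reciprocal_overlap(start_a, end_a, start_b, end_b):
    """Return the smaller of the two overlap fractions.

    Intervals are assumed to be 0-based half-open (BED style).
    """
    overlap = min(end_a, end_b) - max(start_a, start_b)
    if overlap <= 0:
        # Disjoint, merely touching, or zero-length intervals share no bases.
        return 0.0
    len_a = end_a - start_a
    len_b = end_b - start_b
    return min(overlap / len_a, overlap / len_b)


def passes_reciprocal_overlap(interval_a, interval_b, threshold=0.9):
    """True when both intervals are covered by at least `threshold`."""
    return reciprocal_overlap(*interval_a, *interval_b) >= threshold
```

Under this definition a small SV nested inside a much larger one does not match, because the larger interval is only partly covered.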

Methods

-SVs were called from 101 long-read whole-genome sequencing samples and
-annotated as described in Chirmade et al. 2026. The annotation table used
-here (sv_annotations.tsv) is the companion data release for GWAS
-SVatalog, available from the Zenodo record linked below. Coordinates in
-the source TSV are 1-based closed and were converted to 0-based half-open
-BED for this track.
+Chirmade et al. 2026 called SVs from 101 whole-genome sequenced individuals
+enrolled in the CF Canada-SickKids Program in Individualized Therapy
+(CFIT), a predominantly-European cohort of people with cystic fibrosis.
+Each sample was sequenced with two long-read / linked-read technologies:
+PacBio continuous long reads on Sequel I (34 samples, 50x) or Sequel II
+(67 samples, 76x), and 10X Genomics linked reads on Illumina HiSeq X at
+~30x. SVs were called per sample with pbsv v2.2.2 (pbmm2 alignments) and
+Sniffles v1.0.11 (NGMLR alignments) on the PacBio CLR data, and with Long
+Ranger, CNVnator v0.4, ERDS v1.1 and Manta v1.6.0 on the 10XG data.
+Per-platform and cross-platform calls were merged in three steps using a
+50% reciprocal overlap rule (pbsv anchored, tagged by Sniffles on PacBio;
+Manta anchored, augmented by CNVnator, ERDS and Long Ranger deletions on
+10XG; then a cross-platform merge with PacBio coordinates preferred), and
+SV records present in fewer than three participants were dropped. The
+released catalog contains 87,183 SVs (42,435 deletions, 41,734 insertions,
+1,394 duplications, 912 inversions and 708 complex events); the
+pre-computed GWAS SVatalog LD analyses use a common-SV subset of 35,732
+sites against 116,870 GWAS-Catalog SNPs.

-Note that the SVatalog tool's pre-computed LD analyses use a common-SV
-subset (35,732 sites); the underlying long-read callset released in this
-TSV (87,183 SVs) is larger and includes rarer variants not used for LD
-visualisation.
+The annotation TSV sv_annotations.tsv was downloaded from the
+Zenodo companion record,
+
+zenodo.org/records/13367574. Coordinates in the TSV are 1-based closed
+and were converted to 0-based half-open BED for this track.
+

+

+The step-by-step build commands (download, coordinate shift, format
+conversion, bigBed build) are recorded in the UCSC makeDoc for this track
+container:
+
+doc/hg38/lrSv.txt. The conversion scripts and autoSql schemas live in
+
+makeDb/scripts/lrSv.
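The coordinate shift mentioned in Methods (1-based closed to 0-based half-open) can be sketched in a few lines of Python. This is not the converter from makeDb/scripts/lrSv; the function name and the validation are hypothetical and only illustrate the arithmetic.

```python
def one_based_closed_to_bed(chrom, start, end):
    """Convert a 1-based closed interval to 0-based half-open BED coordinates.

    Only the start moves: the 1-based inclusive end already equals the
    0-based exclusive end, so the interval length is preserved.
    """
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based closed interval: {start}-{end}")
    return chrom, start - 1, end
```

For example, a feature spanning bases 1 through 100 in the source TSV becomes chromStart 0, chromEnd 100 in BED, and its length (chromEnd minus chromStart) is still 100.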

Data Access

The data can be explored interactively in table format with the Table Browser or the Data Integrator, and accessed programmatically through our API, track=chirmade101Sv.
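For programmatic access, a request URL for this track can be assembled as in the sketch below. It assumes the public UCSC REST API endpoint at api.genome.ucsc.edu/getData/track with semicolon-separated parameters, and the hg38 assembly implied by the gbdb path; the helper name and its defaults are hypothetical. The function only builds the URL and performs no network request.

```python
from urllib.parse import quote

# Assumed base URL of the UCSC REST API track endpoint.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def track_query_url(chrom, start, end, genome="hg38", track="chirmade101Sv"):
    """Build a getData/track URL for one genomic region."""
    params = [
        ("genome", genome),
        ("track", track),
        ("chrom", chrom),
        ("start", start),
        ("end", end),
    ]
    # Percent-encode each value and join the pairs with semicolons.
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params)
    return f"{API_BASE}?{query}"
```

The resulting URL can be fetched with any HTTP client, and the response parsed as JSON.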

The bigBed is available from our download server as chirmade101.bb. Example: bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/chirmade101.bb -chrom=chr21 -start=0 -end=100000000 stdout.

The original annotation table is available on Zenodo: zenodo.org/records/13367574. The GWAS SVatalog web tool itself is at svatalog.research.sickkids.ca.

Credits

Thanks to Chirmade, Strug and colleagues at The Hospital for Sick Children and the University of Toronto for releasing this annotated long-read SV callset alongside the GWAS SVatalog tool.

References

Chirmade S, Wang Z, Mastromatteo S, Sanders E, Thiruvahindrapuram B, Nalpathamkalam T, Pellecchia G, Lin F, Keenan K, Patel RV et al. GWAS SVatalog: a visualization tool to aid fine-mapping of GWAS loci with structural variations. Heredity (Edinb). 2026 Mar;135(3):199-210. PMID: 41203876; PMC: PMC13031531