7594507ca126d5242346787e42e13c52ea7709b1 max Fri Apr 17 08:40:31 2026 -0700
Add lrSv supertrack: long-read structural variants from 9 studies (hg38). #Preview2 week - bugs introduced now will need a build patch to fix

Sub-tracks (all bigBed 9+):
han945Sv - 945 Han Chinese, ONT (Gong 2025, PMID 39929826)
lrSv1kgOnt - 1019 1000 Genomes, ONT, SVAN-annotated (Schloissnig 2025, PMID 40702182; lifted from hs1)
tommoJpSv - 333 Japanese (111 trios), ONT (Otsuki 2022, PMID 36127505)
aou1kSv - 1027 All of Us, PacBio HiFi (Garimella 2025, PMID 41256123)
ga4kSv - 502 GA4K pediatric rare disease, PacBio HiFi (Cohen 2022, PMID 35305867)
decodeSv - 3622 Icelanders, ONT (Beyter 2021, PMID 33972781)
hgsvc3Sv - 65 HGSVC3 diverse haplotype-resolved assemblies, HiFi+ONT (Logsdon 2025, PMID 40702183; merges insdel+inv tables)
kwanhoSv - 100 post-mortem brains (PD/ILBD/HC), PacBio HiFi (Kim 2026, PMID 41929179)
chirmade101Sv - 101 long-read WGS GWAS SVatalog cohort (Chirmade 2026, PMID 41203876)

Includes per-track conversion scripts and autoSql under scripts/lrSv/, the supertrack summary table in lrSv.html, and a consolidated makeDoc at doc/hg38/lrSv.txt.

refs #36258

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

diff --git src/hg/makeDb/trackDb/human/decodeSv.html src/hg/makeDb/trackDb/human/decodeSv.html
new file mode 100644
index 00000000000..c5ba7bf869d
--- /dev/null
+++ src/hg/makeDb/trackDb/human/decodeSv.html
@@ -0,0 +1,94 @@
+<h2>Description</h2>
+<p>
+This track shows high-confidence structural variants (SVs) identified by
+Oxford Nanopore long-read sequencing of 3,622 Icelanders recruited through
+the deCODE genetics population cohort. The release contains 133,886 SVs
+(55,649 deletions, 75,050 insertions and 3,187 combined insertion/deletion
+events). Variants are site-level (no per-sample genotypes) and have been
+filtered to a high-confidence subset validated in the accompanying
+population-scale analysis.
+</p>
+<p>
+Note that this release does not include allele counts or allele frequencies:
+each row represents a site that was called with high confidence in the
+cohort, but the number of carrier samples is not provided, so the track
+cannot be filtered by AF/AC.
+</p>
+
+<h2>Display Conventions and Configuration</h2>
+<p>
+Items are colored by SV type:
+<ul>
+<li><span style="color: rgb(200,0,0);">Deletions (DEL)</span> - red</li>
+<li><span style="color: rgb(0,0,200);">Insertions (INS)</span> - blue</li>
+<li><span style="color: rgb(140,0,200);">Combined insertion/deletion (INSDEL)</span> - purple</li>
+</ul>
+</p>
+<p>
+Insertions are placed at the insertion site with a width of 1 bp; deletions
+span the deleted interval; INSDEL events span the affected reference region
+and have SVLEN=0 because the reference and alternate alleles differ in both
+sequence and length. Filters are available for SV type and SV length.
+</p>
+<p>
+Where a variant falls inside an annotated tandem-repeat region, the detail
+page also shows the coordinates of that region (TRRBEGIN / TRREND from the
+source VCF), which can be useful context for repeat-mediated insertions and
+deletions.
+</p>
+
+<h2>Methods</h2>
+<p>
+Oxford Nanopore whole-genome sequencing was performed on 3,622 Icelandic
+participants enrolled through deCODE genetics. Reads were aligned to
+GRCh38 and structural variants were called and merged across the cohort
+following the pipeline described in Beyter et al. (2021), which combined
+multiple callers and a joint reassessment of candidate variants against
+the long reads. The high-confidence set released here corresponds to the
+filtered callset with strong read support and consistent representation
+across samples.
+</p>
+
+<h2>Data Access</h2>
+<p>
+The data can be explored interactively in table format with the
+<a href="../cgi-bin/hgTables">Table Browser</a> or the
+<a href="../cgi-bin/hgIntegrator">Data Integrator</a> and exported from there
+to spreadsheet or tab-sep tables. From scripts, the data can be accessed
+through our <a href="https://api.genome.ucsc.edu">API</a>, track=<i>decodeSv</i>.
+</p>
+<p>
+The annotation is stored as a bigBed file that can be downloaded from
+<a href="http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/" target="_blank">our
+download server</a> as <tt>decodeSv.bb</tt>. Individual regions or the whole
+annotation can be obtained with the <tt>bigBedToBed</tt> utility, available
+from our
+<a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads">utilities
+page</a>. Example:
+<tt>bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/decodeSv.bb -chrom=chr21 -start=0 -end=100000000 stdout</tt>.
+</p>
+<p>
+The original VCF is available from the deCODE genetics
+<a href="https://github.com/DecodeGenetics/LRS_SV_sets" target="_blank">LRS_SV_sets</a>
+GitHub repository.
+</p>
+
+<h2>Credits</h2>
+<p>
+Thanks to the deCODE genetics team and the Icelandic study participants for
+making this dataset publicly available.
+</p>
+
+<h2>References</h2>
+
+
+<p>
+Beyter D, Ingimundardottir H, Oddsson A, Eggertsson HP, Bjornsson E, Jonsson H, Atlason BA,
+Kristmundsdottir S, Mehringer S, Hardarson MT <em>et al</em>.
+<a href="https://doi.org/10.1038/s41588-021-00865-4" target="_blank">
+Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in
+human diseases and other traits</a>.
+<em>Nat Genet</em>. 2021 Jun;53(6):779-786.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/33972781" target="_blank">33972781</a>
+</p>
+
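
Note on the decodeSv.html text above: the page says items are colored by SV type and that regions can be dumped with bigBedToBed. As a minimal sketch of how that output might be summarized, the Python below tallies rows per SV type from the itemRgb column. It is not part of this commit. The function name `tally_sv_types` is hypothetical, and it assumes that the bigBed rows carry exactly the three legend colors in BED column 9; the layout of the extra ("9+") fields is not shown in this diff, so the sketch does not touch them.

```python
from collections import Counter

# Colors copied from the legend in decodeSv.html. That the bigBed itemRgb
# column holds exactly these strings is an assumption, not confirmed here.
RGB_TO_SVTYPE = {
    "200,0,0": "DEL",
    "0,0,200": "INS",
    "140,0,200": "INSDEL",
}


def tally_sv_types(bed_lines):
    """Count rows per SV type using BED column 9 (itemRgb).

    bed_lines is any iterable of tab-separated BED9+ lines, for example
    the stdout of bigBedToBed. Rows with an unrecognized color are
    counted under "UNKNOWN" rather than dropped.
    """
    counts = Counter()
    for line in bed_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 9:
            raise ValueError("expected at least 9 BED columns: %r" % line)
        counts[RGB_TO_SVTYPE.get(fields[8], "UNKNOWN")] += 1
    return counts
```

One possible use is to pipe the bigBedToBed example from the Data Access section into a small script that calls `tally_sv_types(sys.stdin)` and prints the resulting counts. A non-zero "UNKNOWN" count would indicate that the color assumption does not hold for the file.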