7594507ca126d5242346787e42e13c52ea7709b1
max
  Fri Apr 17 08:40:31 2026 -0700
Add lrSv supertrack: long-read structural variants from 9 studies (hg38).

#Preview2 week - bugs introduced now will need a build patch to fix
Sub-tracks (all bigBed 9+):
han945Sv     - 945 Han Chinese, ONT (Gong 2025, PMID 39929826)
lrSv1kgOnt   - 1019 1000 Genomes, ONT, SVAN-annotated (Schloissnig 2025,
PMID 40702182; lifted from hs1)
tommoJpSv    - 333 Japanese (111 trios), ONT (Otsuki 2022, PMID 36127505)
aou1kSv      - 1027 All of Us, PacBio HiFi (Garimella 2025, PMID 41256123)
ga4kSv       - 502 GA4K pediatric rare disease, PacBio HiFi
(Cohen 2022, PMID 35305867)
decodeSv     - 3622 Icelanders, ONT (Beyter 2021, PMID 33972781)
hgsvc3Sv     - 65 HGSVC3 diverse haplotype-resolved assemblies, HiFi+ONT
(Logsdon 2025, PMID 40702183; merges insdel+inv tables)
kwanhoSv     - 100 post-mortem brains (PD/ILBD/HC), PacBio HiFi
(Kim 2026, PMID 41929179)
chirmade101Sv - 101 long-read WGS GWAS SVatalog cohort
(Chirmade 2026, PMID 41203876)

Includes per-track conversion scripts and autoSql under
scripts/lrSv/, the supertrack summary table in lrSv.html, and a
consolidated makeDoc at doc/hg38/lrSv.txt.

refs #36258

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

diff --git src/hg/makeDb/trackDb/human/hgsvc3Sv.html src/hg/makeDb/trackDb/human/hgsvc3Sv.html
new file mode 100644
index 00000000000..e13dc55629b
--- /dev/null
+++ src/hg/makeDb/trackDb/human/hgsvc3Sv.html
@@ -0,0 +1,115 @@
+<h2>Description</h2>
+<p>
+This track shows structural variants (SVs) from the third phase of the
+Human Genome Structural Variation Consortium (HGSVC3). The callset comes
+from 65 diverse individuals across five continental groups, each sequenced
+with PacBio HiFi (~47x) and Oxford Nanopore ultra-long reads (~56x),
+complemented with Strand-seq, optical mapping, Hi-C and Iso-Seq data for
+haplotype-resolved assembly. SVs were discovered from the de novo assemblies
+with PAV v2.4.0.1 and cross-validated by ten additional orthogonal callers.
+</p>
+<p>
+The track merges the two final SV annotation tables from the HGSVC3 v1.0
+release on GRCh38: 176,232 insertions/deletions and 300 inversions, for a
+total of 176,532 SVs. Each row is a site-level variant with the list of
+carrier haplotypes and additional structural annotations.
+</p>
+
+<h2>Display Conventions and Configuration</h2>
+<p>
+Items are colored by SV type:
+</p>
+<ul>
+<li><span style="color: rgb(200,0,0);">Deletions (DEL)</span> - red</li>
+<li><span style="color: rgb(0,0,200);">Insertions (INS)</span> - blue</li>
+<li><span style="color: rgb(230,140,0);">Inversions (INV)</span> - orange</li>
+</ul>
+<p>
+Insertions are placed at the insertion site with a width of 1 bp; deletions
+and inversions span the affected reference interval. Filters are available
+for SV type, SV length, carrier-haplotype count, distinct sample count,
+whether the site falls in a Tandem Repeats Finder region and the fraction
+of the variant overlapping segmental duplications.
+</p>
+<p>
+The detail page shows, where available:
+</p>
+<ul>
+<li><b>Allele / Sample Count</b>: number of carrier haplotypes (out of the
+2*65 = 130 phased haplotypes plus unphased "un" entries) and the number of
+distinct samples carrying the variant.</li>
+<li><b>Reference / Contig Homology</b>: microhomology length (5',3') at the
+breakpoints in the reference and in the assembly contig (insertions and
+deletions only).</li>
+<li><b>Inner Inversion Region</b>: for inversions, the coordinate range of
+the inner inverted sequence, distinct from the outer breakpoint interval.</li>
+<li><b>Transposable Element</b>: the TE family, when the inserted or deleted
+sequence was classified as a known transposable element.</li>
+<li><b>Segmental Duplication Overlap</b>: fraction of the variant interval
+overlapping UCSC segmental duplications in the reference.</li>
+<li><b>Carrier Haplotypes</b>: full list of haplotype IDs (e.g.
+<tt>HG00096-h1</tt>, <tt>HG00096-h2</tt>, <tt>HG00514-un</tt>) carrying the
+variant.</li>
+</ul>
+
+<h2>Methods</h2>
+<p>
+HGSVC3 produced haplotype-resolved de novo assemblies for 65 samples
+spanning five continental groups. Assemblies were built from PacBio HiFi
+and Oxford Nanopore reads, phased with Strand-seq and further validated
+with Hi-C and optical mapping. Structural variants were called by aligning
+each haplotype back to the reference with PAV v2.4.0.1; calls were then
+cross-referenced with ten independent callers. The final annotation tables
+(this track's input) include merge statistics (MERGE_RO, MERGE_OFFSET,
+MERGE_SZRO, MERGE_OFFSZ, MERGE_MATCH) that describe how well each
+per-sample call matched the merged consensus site.
+</p>
+<p>
+Two tables were merged for display here:
+<tt>variants_GRCh38_sv_insdel_HGSVC2024v1.0.tsv.gz</tt> (DEL + INS, 176,232
+records) and <tt>variants_GRCh38_sv_inv_HGSVC2024v1.0.tsv.gz</tt> (INV, 300
+records). Type-specific columns (HOM_REF/HOM_TIG/TE for insdel;
+RGN_REF_INNER for inversions) are shown as empty on the detail page when
+they do not apply.
+</p>
+
+<h2>Data Access</h2>
+<p>
+The data can be explored interactively in table format with the
+<a href="../cgi-bin/hgTables">Table Browser</a> or the
+<a href="../cgi-bin/hgIntegrator">Data Integrator</a>, and accessed
+programmatically through our <a href="https://api.genome.ucsc.edu">API</a>
+using track=<i>hgsvc3Sv</i>.
+</p>
+<p>
+The bigBed is available from
+<a href="http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/" target="_blank">our
+download server</a> as <tt>hgsvc3.bb</tt>. Example:
+<tt>bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/hgsvc3.bb -chrom=chr21 -start=0 -end=100000000 stdout</tt>.
+</p>
+<p>
+The original annotation tables are available from the
+<a href="https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/HGSVC3/release/Variant_Calls/1.0/GRCh38/annotation_table/" target="_blank">
+HGSVC3 release</a> on the IGSR FTP site.
+</p>
+
+<h2>Credits</h2>
+<p>
+Thanks to the Human Genome Structural Variation Consortium (HGSVC) and all
+participating sequencing and analysis centers for making the HGSVC3
+annotation tables publicly available.
+</p>
+
+<h2>References</h2>
+
+
+<p>
+Logsdon GA, Ebert P, Audano PA, Loftus M, Porubsky D, Ebler J, Yilmaz F, Hallast P, Prodanov T, Yoo
+D <em>et al</em>.
+<a href="https://doi.org/10.1038/s41586-025-09140-6" target="_blank">
+Complex genetic variation in nearly complete human genomes</a>.
+<em>Nature</em>. 2025 Aug;644(8076):430-441.
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/40702183" target="_blank">40702183</a>; PMC: <a
+href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12350169/" target="_blank">PMC12350169</a>
+</p>
+