src/hg/makeDb/trackDb/human/lrSvAll.ra 9fbdfa3416ffde377072fafd2de44059155c3b44

9fbdfa3416ffde377072fafd2de44059155c3b44
max
  Thu Apr 30 06:57:35 2026 -0700
lrSv: add lrSvAll merged track combining all long-read SV subtracks

Variants are merged only when (chrom, start, end, svType, svLen, insLen)
match exactly. Per-database allele count (AC) columns are stored as
strings; "unknown" is used where the source dataset has only placeholder
AC values (deCODE, SVatalog 101, 1KG ONT 100). Kim PD Brain is split
into affected (PD+ILBD) and healthy (HC) AC columns. Gustafson
contributes sampleCount instead of AC.
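
A minimal sketch of the exact-key merge described above, not the actual
code in lrSvMergeAll.py. The row layout (dicts with "source" and "ac"
keys) and the function name merge_exact are assumptions made for
illustration only.

```python
from collections import defaultdict


def merge_exact(rows):
    """Group rows by the exact 6-field key, keeping one AC string per source.

    Hypothetical row layout: each row is a dict with chrom, start, end,
    svType, svLen, insLen, source and an optional ac value.
    """
    merged = defaultdict(dict)
    for r in rows:
        key = (r["chrom"], r["start"], r["end"],
               r["svType"], r["svLen"], r["insLen"])
        ac = r.get("ac")
        # AC is stored as a string so "unknown" can sit next to real counts.
        merged[key][r["source"]] = "unknown" if ac is None else str(ac)
    return merged


rows = [
    {"chrom": "chr1", "start": 100, "end": 200, "svType": "DEL",
     "svLen": 100, "insLen": 0, "source": "HGSVC3", "ac": 4},
    # Same key, but this source has no usable AC value.
    {"chrom": "chr1", "start": 100, "end": 200, "svType": "DEL",
     "svLen": 100, "insLen": 0, "source": "deCODE"},
    # End differs by one base, so this row is NOT merged with the others.
    {"chrom": "chr1", "start": 100, "end": 201, "svType": "DEL",
     "svLen": 101, "insLen": 0, "source": "HPRCv2", "ac": 1},
]

merged = merge_exact(rows)
# Number of contributing databases per merged SV.
source_counts = {key: len(per_source) for key, per_source in merged.items()}
```

Because the key is exact, near-identical calls that differ by a single
base stay as separate items; no fuzzy or reciprocal-overlap matching is
implied here.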

Output: 2,694,871 unique SVs from 3,706,100 input rows across 15
subtracks (about 27% of input rows collapsed as duplicates). The merged
track sits as the first subtrack of the lrSv supertrack, with filters on
sources, svType, svLen, insLen, maxAF/minAF, AC, and sourceCount.

The trackDb stanza is generated by the build script directly into
human/lrSvAll.ra and pulled in via 'include lrSvAll.ra' from lrSv.ra,
so labels in databases.tsv stay the single source of truth.
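
As an illustration of how a generated stanza line can follow the labels,
here is a hedged sketch that builds a filterValues.sources line from
(key, label) pairs. The function name and the in-memory pair list are
assumptions; the real column layout of databases.tsv and the real
generator code are not shown in this commit.

```python
def filter_values_sources(databases):
    """Build a 'filterValues.sources' trackDb line from (key, label) pairs.

    Entries are written as key|label and joined with commas, so neither
    character may appear inside a key or a label.
    """
    parts = []
    for key, label in databases:
        if any(ch in key + label for ch in ",|"):
            raise ValueError(f"comma or pipe not allowed in: {key!r}, {label!r}")
        parts.append(f"{key}|{label}")
    return "filterValues.sources " + ",".join(parts)


# Hypothetical in-memory stand-in for rows read from databases.tsv.
databases = [
    ("CoLoRSdb", "CoLoRSdb 1427 (PacBio)"),
    ("HGSVC2", "HGSVC2 32"),
]

line = filter_values_sources(databases)
```

Generating the line this way means a label changes in one place only,
and the stanza is refreshed by re-running the script.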

lrSv.html: add a "Disease cases" column to the dataset summary,
strip parenthesized internal track names from the section headers,
and shorten exact SV counts to ~Nk / ~N.NM in the prose.

refs #36642

diff --git src/hg/makeDb/trackDb/human/lrSvAll.ra src/hg/makeDb/trackDb/human/lrSvAll.ra
new file mode 100644
index 00000000000..aa7d9f79a29
--- /dev/null
+++ src/hg/makeDb/trackDb/human/lrSvAll.ra
@@ -0,0 +1,41 @@
+# AUTO-GENERATED by ~/kent/src/hg/makeDb/scripts/lrSv/lrSvMergeAll.py
+# Do not edit by hand - re-run the merge script and re-commit.
+
+    track lrSvAll
+    parent lrSv
+    bigDataUrl /gbdb/$D/lrSv/lrSvAll.bb
+    shortLabel All LR SVs merged
+    longLabel All long-read SVs merged across the lrSv subtracks (exact-position match), with per-database AC
+    type bigBed 9 +
+    itemRgb on
+    visibility pack
+    mouseOver <b>$name</b> ($svType) svLen=$svLen insLen=$insLen sources=$sources AF=$minAF-$maxAF AC=$AC
+    searchIndex name
+    filterValues.sources CoLoRSdb|CoLoRSdb 1427 (PacBio),1000G-ONT-Vienna|1KG ONT Vienna 1019,1000G-ONT|1KG ONT 100 (Gustafson),AoU1K|All of Us 1027 (PacBio),Han945|Han Chinese 945,TommoJapan|ToMMo 333 (Japanese),GA4K|GA4K 502 (rare disease),deCODE|deCODE 3622 (Icelandic),HPRCv2|HPRC v2 233,HGSVC2|HGSVC2 32,HGSVC3|HGSVC3 65,KimPD|Kim PD Brain 100,ArabUAE53|Arab APR 53,China58|CPC 58 (Chinese),Svatalog101|SVatalog 101
+    filterType.sources multipleListOr
+    filterLabel.sources Source Database
+    filterValues.svType DEL,INS,DUP,INV,CPX,MIXED,INSDEL,CNV,BND,TRA,MEI
+    filterType.svType multipleListOr
+    filterLabel.svType SV Type
+    filter.svLen 0:30000000
+    filterByRange.svLen on
+    filterLabel.svLen SV Length (bp)
+    filter.insLen 0:600000
+    filterByRange.insLen on
+    filterLabel.insLen Insertion Length (bp)
+    filter.maxAF 0:1
+    filterByRange.maxAF on
+    filterLimits.maxAF 0:1
+    filterLabel.maxAF Max Allele Frequency (across DBs)
+    filter.minAF 0:1
+    filterByRange.minAF on
+    filterLimits.minAF 0:1
+    filterLabel.minAF Min Allele Frequency (across DBs)
+    filter.AC 0:30000
+    filterByRange.AC on
+    filterLabel.AC Total AC (across DBs)
+    filter.sourceCount 1:15
+    filterByRange.sourceCount on
+    filterLabel.sourceCount Number of Source Databases
+    skipEmptyFields on
+    priority 0