9eb4e0937782954c19d664e7d384d210bffb3b25 max Sat Jun 13 16:01:42 2026 -0700

lrSv: QA fixes from Lou's review - dedup, shared color palette, deCODE/AoU cleanup

- Drop kwanhoSv (KimPD) from the lrSvAll merge in databases.tsv; it stays on
  dev/alpha until published, which also removes its >5 Mb breakend artifacts
  from the merged track.
- Remove searchIndex from colorsDbSv, lrSv1kLin and lrSvAll (and the merge
  generator): the bigBeds were built without a name index, so by-name search
  never worked.
- Single shared per-SV-type color palette in lrSvCommon.py (svColor), used by
  every converter and the merge. CPX is purple everywhere (was orange in
  1kgOnt/apr/cpc1, colliding with INV's orange), colorsDb DEL is 200,0,0 like
  the rest, and TRA/INSDEL get their own colors.
- deCODE: drop byte-identical duplicate rows and blank the fake AC=50
  placeholder (AC is now a string field, omitted from the name and mouseOver).
- AoU: numeric-entity-encode non-ASCII gene/trait text and drop duplicate rows.
- gustafson, chirmade101, hprc2v21: drop byte-identical duplicate rows.
- lrSvMergeAll.py: skip byte-identical duplicate source rows instead of summing
  their allele counts, which had inflated the per-database and total AC.

refs #36258

diff --git src/hg/makeDb/trackDb/human/lrSvAll.ra src/hg/makeDb/trackDb/human/lrSvAll.ra
index c3311db9d3d..cd00dc39968 100644
--- src/hg/makeDb/trackDb/human/lrSvAll.ra
+++ src/hg/makeDb/trackDb/human/lrSvAll.ra
@@ -1,41 +1,40 @@
  # AUTO-GENERATED by ~/kent/src/hg/makeDb/scripts/lrSv/lrSvMergeAll.py
  # Do not edit by hand - re-run the merge script and re-commit.
  track lrSvAll
  parent lrSv
  bigDataUrl /gbdb/$D/lrSv/lrSvAll.bb
  shortLabel All LR SVs merged
  longLabel All long-read SVs merged across subtracks by exact position, with per-database AC
  type bigBed 9 +
  itemRgb on
  visibility pack
  mouseOver $name ($svType) svLen=$svLen insLen=$insLen sources=$sources AF=$minAF-$maxAF AC=$AC
- searchIndex name
- filterValues.sources CoLoRSdb|CoLoRSdb 1427 (PacBio),1000G-ONT-Vienna|1KG ONT Vienna 1019,1000G-ONT|1KG ONT 100 (Gustafson),AoU1K|All of Us 1027 (PacBio),Han945|Han Chinese 945,TommoJapan|ToMMo 333 (Japanese),GA4K|GA4K 502 (rare disease),deCODE|deCODE 3622 (Icelandic),HPRCv2.1|HPRC v2.1 233,HGSVC2|HGSVC2 32,HGSVC3|HGSVC3 65,KimPD|Kim PD Brain 100,ArabUAE53|Arab APR 53,China58|CPC 58 (Chinese),Svatalog101|SVatalog 101
+ filterValues.sources CoLoRSdb|CoLoRSdb 1427 (PacBio),1000G-ONT-Vienna|1KG ONT Vienna 1019,1000G-ONT|1KG ONT 100 (Gustafson),AoU1K|All of Us 1027 (PacBio),Han945|Han Chinese 945,TommoJapan|ToMMo 333 (Japanese),GA4K|GA4K 502 (rare disease),deCODE|deCODE 3622 (Icelandic),HPRCv2.1|HPRC v2.1 233,HGSVC2|HGSVC2 32,HGSVC3|HGSVC3 65,ArabUAE53|Arab APR 53,China58|CPC 58 (Chinese),Svatalog101|SVatalog 101
  filterType.sources multipleListOr
  filterLabel.sources Source Database
  filterValues.svType DEL,INS,DUP,INV,CPX,MIXED,INSDEL,CNV,BND,TRA,MEI
  filterType.svType multipleListOr
  filterLabel.svType SV Type
  filter.svLen 0:30000000
  filterByRange.svLen on
  filterLabel.svLen SV Length (bp)
  filter.insLen 0:600000
  filterByRange.insLen on
  filterLabel.insLen Insertion Length (bp)
  filter.maxAF 0:1
  filterByRange.maxAF on
  filterLimits.maxAF 0:1
  filterLabel.maxAF Max Allele Frequency (across DBs)
  filter.minAF 0:1
  filterByRange.minAF on
  filterLimits.minAF 0:1
  filterLabel.minAF Min Allele Frequency (across DBs)
  filter.AC 0:30000
  filterByRange.AC on
  filterLabel.AC Total AC (across DBs)
- filter.sourceCount 1:15
+ filter.sourceCount 1:14
  filterByRange.sourceCount on
  filterLabel.sourceCount Number of Source Databases
  skipEmptyFields on
  priority 0
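
Note on the dedup behavior described in the commit message: skipping byte-identical duplicate source rows, rather than summing their allele counts, can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in lrSvMergeAll.py. The function name `sum_ac_skipping_duplicates`, the five-column tab-separated layout (chrom, start, end, name, AC), and the `ac_col` parameter are all hypothetical.

```python
from collections import defaultdict


def sum_ac_skipping_duplicates(rows, ac_col=4):
    """Sum AC per (chrom, start, end, name), counting identical rows once.

    rows: iterable of tab-separated strings in a hypothetical layout
    (chrom, start, end, name, AC). This layout is an assumption for the
    example, not the real lrSv bigBed schema.
    """
    seen = set()               # full row text already counted
    totals = defaultdict(int)  # key -> summed allele count
    for row in rows:
        line = row.rstrip("\n")
        if line in seen:
            # Byte-identical duplicate: skip it instead of adding its AC again.
            continue
        seen.add(line)
        fields = line.split("\t")
        key = tuple(fields[:4])
        ac = fields[ac_col]
        # AC is treated as a string and may be blank; only sum numeric values.
        if ac.isdigit():
            totals[key] += int(ac)
    return dict(totals)
```

Rows that share a position and name but differ in any field are still summed; only exact repeats of the whole row are dropped, which is what keeps per-database and total AC from being inflated.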