bac95a147f49cd331052e597006e04b3deee40fc
max
Wed Apr 22 10:43:20 2026 -0700

lrSv/srSv: human-readable SV type filter labels, script cleanups

Add human-readable labels to the supertrack-level svType filter on both
the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.

Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py helpers,
consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged working
tree.

refs #36258

diff --git src/hg/makeDb/trackDb/human/kwanhoSv.html src/hg/makeDb/trackDb/human/kwanhoSv.html
index 4c641c20175..76c2050aabd 100644
--- src/hg/makeDb/trackDb/human/kwanhoSv.html
+++ src/hg/makeDb/trackDb/human/kwanhoSv.html
@@ -47,41 +47,66 @@
which cohorts include at least one carrier.
  • Carrier rates: fraction of cases (PD+ILBD) and controls (HC) carrying the variant, and the case-minus-control differential.
  • Per-cohort AF / AC / AN: alternate allele frequency, alternate allele count, and total called alleles in PD, HC and ILBD samples.
  • Carrier lists: sample IDs carrying the variant in each cohort.
  • Nearby SNP context: number of SNPs nearby and the number in linkage disequilibrium with the SV (from the paper's LD analyses).
  • Read support: average mapping quality and average supporting reads per sample at the variant site.
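The carrier-rate and AF / AC / AN fields listed above follow simple arithmetic, which the sketch below illustrates. It is a minimal illustration, not the track's converter code: the function names are hypothetical, and it assumes the denominators are the full cohort sizes quoted in the Methods text (35 PD, 31 ILBD, 34 HC). The source table may instead use only samples with called genotypes.

```python
from typing import Optional

# Cohort sizes quoted in the Methods text: 35 PD, 31 ILBD, 34 HC.
# Assumption: carrier rates use these full cohort sizes as denominators.
N_CASES = 35 + 31   # cases are PD + ILBD
N_CONTROLS = 34     # controls are HC


def allele_frequency(ac: int, an: int) -> Optional[float]:
    """Alternate allele frequency AF = AC / AN; None if no alleles were called."""
    if an == 0:
        return None
    return ac / an


def carrier_differential(case_carriers: int, control_carriers: int,
                         n_cases: int = N_CASES,
                         n_controls: int = N_CONTROLS) -> float:
    """Case carrier rate minus control carrier rate."""
    return case_carriers / n_cases - control_carriers / n_controls
```

A positive differential means the variant is carried by a larger fraction of cases than controls; a negative value means the reverse.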
    Methods

    -Long-read whole-genome sequencing was performed on 100 post-mortem brain
    -samples (35 PD, 31 ILBD, 34 HC) with PacBio HiFi chemistry. Per-sample SV
    -calls from multiple callers were merged into a joint callset; the
    -high-confidence filtered catalog released in Supplementary Table 13
    -(media-13.txt) of the Kim et al. 2026 preprint is used directly
    -here. Per-cohort allele frequencies, Hardy-Weinberg statistics and case /
    -control carrier rates are reported in the source table; the track exposes
    -the allele counts and the case-control differential as filterable fields.
    -The paper also integrates single-nucleus RNA-seq from two brain regions
    -of the same donors to test SV-expression associations in specific cell
    -types, but that layer is not shown in this track.
    +Kim et al. 2026 performed PacBio HiFi long-read whole-genome sequencing on
    +100 post-mortem cerebellum samples from the Arizona Study of Aging and
    +Neurodegenerative Disorders / Brain and Body Donation Program cohort
    +(35 Parkinson's disease, 31 incidental Lewy body disease, 34 healthy
    +controls). gDNA was isolated with either the Qiagen DNeasy or PacBio
    +Nanobind PanDNA kit, sheared on a Megaruptor 3 to 10-23.5 kb, built into
    +SMRTbell libraries (Prep Kit 3.0) and sequenced on PacBio Revio (25M
    +SMRT cells, 2-h pre-extension, 24-h movies) to ~17x per-sample coverage.
    +Reads were processed with the Broad long-read WDL pipelines (CCS v6.2.0,
    +pbmm2 v1.4.0 aligned to GRCh38, SAMtools v1.13 merge/sort) and an
    +ensemble of three callers was run per sample: Sniffles2 v2.0.6,
    +
    +PBSV v2.9.0 (with GRCh38 tandem-repeat context) and Cue2 v2.0.0
    +(deep-learning image-based long-read caller). Per-caller VCFs were
    +FILTER-PASS / ≥40 bp filtered, split by SV type with BCFtools, and
    +merged by type across the 100 individuals and across the three callers
    +with
    +SURVIVOR v1.0.7 (1 kb distance, strand-match, min 50 bp). Centromere,
    +reference-gap, segmental-duplication and sex-chromosome SVs were excluded.
    +The high-confidence catalog contains 74,552 SVs (34,056 insertions,
    +29,545 deletions, 9,707 duplications and 1,244 inversions) released in
    +Supplementary Table 13 (media-13.txt), with per-cohort AF / AC /
    +AN, Hardy-Weinberg statistics and case/control carrier differentials.
    +

    +

    +The supplementary table media-13.txt was downloaded from the Kim
    +et al. 2026 bioRxiv preprint (
    +doi:10.64898/2026.03.20.713192).
    +

    +

    +The step-by-step build commands (download, TSV parsing, bigBed build) are
    +recorded in the UCSC makeDoc for this track container:
    +
    +doc/hg38/lrSv.txt. The conversion scripts and autoSql schemas live in
    +
    +makeDb/scripts/lrSv.

    Data Access

    The data can be explored interactively in table format with the Table Browser or the Data Integrator, and accessed programmatically through our API, track=kwanhoSv.

    The bigBed is available from our download server as kwanho.bb. Example: bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/kwanho.bb -chrom=chr21 -start=0 -end=100000000 stdout.
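
The two access routes above can be scripted. The sketch below only builds the request URL and the command line; it does not contact any server. The track name, bigBed URL and bigBedToBed options are taken from the text above. The endpoint https://api.genome.ucsc.edu/getData/track is the UCSC REST API's track data endpoint and is an assumption here, since the page text does not spell it out. The helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: UCSC REST API track endpoint (not quoted in the page text).
API_BASE = "https://api.genome.ucsc.edu/getData/track"
# bigBed location as given in the Data Access text.
BIGBED_URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/kwanho.bb"


def track_query_url(chrom: str, start: int, end: int,
                    genome: str = "hg38", track: str = "kwanhoSv") -> str:
    """Build an API URL requesting track items in chrom:start-end."""
    params = {"genome": genome, "track": track,
              "chrom": chrom, "start": start, "end": end}
    return f"{API_BASE}?{urlencode(params)}"


def bigbed_to_bed_args(chrom: str, start: int, end: int,
                       out: str = "stdout") -> list:
    """Build the argument list for the bigBedToBed example shown above."""
    return ["bigBedToBed", BIGBED_URL,
            f"-chrom={chrom}", f"-start={start}", f"-end={end}", out]
```

The argument list can be passed to subprocess.run when the bigBedToBed binary is installed, and the URL can be fetched with any HTTP client, which should return JSON.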