bac95a147f49cd331052e597006e04b3deee40fc max Wed Apr 22 10:43:20 2026 -0700

lrSv/srSv: human-readable SV type filter labels, script cleanups

Add human-readable labels to the supertrack-level svType filter on both the
lrSv and srSv supertracks using the "CODE|CODE (Long name)" filterValues
syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)", etc. Labels keep
the short code up front so users can match what hgTracks shows next to each
feature.

Also sweep in the in-progress converter/as-file cleanups under scripts/lrSv/
and scripts/srSv/ (introduction of lrSvCommon.py helpers, consistent
insLen / svLen / AC column naming, tightened field-description text) that had
been piling up as an unstaged working tree.

refs #36258

diff --git src/hg/makeDb/trackDb/human/abelSv.html src/hg/makeDb/trackDb/human/abelSv.html
index 7d5913fffb5..858d071de16 100644
--- src/hg/makeDb/trackDb/human/abelSv.html
+++ src/hg/makeDb/trackDb/human/abelSv.html
@@ -69,45 +69,61 @@
  • Callset — B38 native, B37lift, or both.
  • Filter — PASS (high confidence) and/or LOW (low confidence, as flagged by the authors based on Mendelian-error rate).
  • Allele frequency (AF), Allele count (AC), SV length, and Mean sample quality (MSQ).
  • Per-population allele counts and numbers are shown on the details page for 8 ancestry groups: AFR (African), AMR (Latino/Admixed-American), NFE (non-Finnish European), FE (Finnish European), EAS (East-Asian), SAS (South-Asian), PI (Pacific Islander), and Other.

    Methods

-The authors used their open-source
-svtools pipeline to jointly call SVs across all samples. Per-sample
-calls were produced with LUMPY (v0.2.13), CNVnator (v0.3.3), and svtyper
-(v0.1.4); calls were merged across samples and refined with svtools. Low-
-and high-confidence variants were distinguished using a Mendelian-error
-cutoff on mean sample quality, calibrated against a set of 409 CEPH trios.
-Per-sample validation was performed against a PacBio long-read truth set
-derived from three HGSVC samples.

+Abel et al. 2020 jointly called SVs from Illumina short-read sequencing
+(mean coverage >20x) of 17,795 genomes from the NHGRI Centers for
+Common Disease Genomics program with per-sample calls from LUMPY v0.2.13,
+CNVnator v0.3.3 and svtyper v0.1.4, integrated across the cohort by the
+svtools
+pipeline. Low- and high-confidence variants were separated by a
+Mendelian-error cutoff on mean sample quality, calibrated against 409
+CEPH trios, and per-sample calls were validated against a PacBio
+long-read truth set from three HGSVC samples. Two non-overlapping
+callsets were released: 458,106 SVs from 14,623 samples called natively
+on GRCh38 (B38) and 279,892 SVs from 8,417 samples called on GRCh37
+(B37). The site-frequency callsets span DELs, DUPs, INVs, mobile-element
+variants and breakends/translocations.

-For this UCSC track, VCF INFO fields were parsed and converted to BED9+
-format. Variants originally called on GRCh37 (B37 callset) were lifted
-to GRCh38 using the UCSC hg19ToHg38.over.chain.gz chain. See the
+The B38 and B37 site-frequency VCFs (plus BEDPE companion files) were
+downloaded from the authors' supplementary-data GitHub repository,
+
+github.com/hall-lab/sv_paper_042020. For the hg38 track, INFO fields
+were parsed into BED9+ columns; B37 records were lifted to hg38 with the
+UCSC hg19ToHg38.over.chain.gz chain (626 B37 records failed to
+lift, leaving 737,998 SVs total in the track).

    + +

+The step-by-step build commands (download, liftOver, format conversion,
+bigBed build) are recorded in the UCSC makeDoc for this track:
-track build documentation for full details.

+doc/hg38/abelSv.txt. The conversion scripts and autoSql schemas live in
+
+makeDb/scripts/lrSv.
+
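
As a rough illustration of the "INFO fields were parsed into BED9+ columns" step described in the Methods text, the sketch below converts one VCF data line into BED-style columns. It is not the converter from the scripts directory: the function names, the two extra columns (SVTYPE, AF), the fixed score and colour, and the coordinate handling are all assumptions made for the example. Only the VCF conventions it relies on (semicolon-separated INFO, the END key, POS as the base preceding a symbolic SV allele) come from the VCF format itself.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def vcf_to_bed9_plus(vcf_line):
    """Convert one tab-separated VCF data line to a list of BED9+ columns.

    Assumption: for symbolic SV alleles POS is the base before the event,
    so POS is used directly as the 0-based BED start and INFO/END as the
    exclusive BED end. The real converter may handle this differently.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    chrom, pos, var_id = cols[0], int(cols[1]), cols[2]
    info = parse_info(cols[7])
    start = pos
    end = int(info.get("END", pos + 1))  # records without END get one base
    end = max(end, start + 1)            # never emit an empty interval
    return [
        chrom, start, end, var_id,
        0, ".",                 # score and strand: placeholders
        start, end, "0,0,0",    # thickStart, thickEnd, itemRgb: placeholders
        info.get("SVTYPE", ""),  # hypothetical extra column
        info.get("AF", ""),      # hypothetical extra column
    ]


example = "chr1\t10000\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=10500;AF=0.01;IMPRECISE"
row = vcf_to_bed9_plus(example)
```

Records converted this way from the B37 VCF would still need the liftOver step mentioned above before they could be merged with the B38 rows.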

    Data Access

    The data can be explored interactively in table format with the Table Browser or the Data Integrator and exported from there to spreadsheet or tab-sep tables. From scripts, the data can be accessed through our API, track=abelSv.
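
A minimal sketch of scripted access with track=abelSv: the helper below only builds a request URL for the UCSC REST API's /getData/track endpoint, using the semicolon-separated parameter form shown in the API documentation. The helper name and the example region are assumptions for illustration, and the hg38 assembly is taken from the Methods text; fetching and decoding the JSON response is indicated in a comment rather than executed.

```python
from urllib.parse import quote

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome, track, chrom=None, start=None, end=None):
    """Build a /getData/track URL; chrom and start/end are optional."""
    params = [("genome", genome), ("track", track)]
    if chrom is not None:
        params.append(("chrom", chrom))
    if start is not None and end is not None:
        params.append(("start", start))
        params.append(("end", end))
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params)
    return f"{API_BASE}/getData/track?{query}"


# Example region chosen arbitrarily for illustration.
url = track_data_url("hg38", "abelSv", "chr1", 1000000, 2000000)

# To fetch the data (requires network access):
#   import json, urllib.request
#   with urllib.request.urlopen(url) as response:
#       data = json.load(response)
```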

    For automated download and analysis, the annotation is stored in a bigBed file that can be downloaded from our download server. The file for this track is called abelSv.bb. Individual regions or the whole genome annotation can