f058c8fe4601b223ff47468eb3525c05ccd03850 max Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv

Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains only
long-read callsets. Filter fields (svType, svLen, insLen, AC) are
mirrored at the srSv supertrack level to keep the UX parallel to lrSv.

- trackDb: new human/srSv.ra with the three subtrack stanzas and updated
  /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas removed from
  human/lrSv.ra. human/trackDb.ra now includes srSv.ra. Also a new
  human/srSv.html overview page; the SR rows and SR-specific paragraphs
  removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
  {lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to scripts/srSv/
  with git mv (history preserved) and renamed to drop the "lrSv" prefix.
  Internal path references in abelSvBuild.sh and abelSvVcfToBed.py
  updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and extended
  with the onekg3202Sr and tommoJpCnv sections moved from lrSv.txt.
  lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
  lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
  /gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
  /gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.

refs #36258

diff --git src/hg/makeDb/scripts/abelSv/build.sh src/hg/makeDb/scripts/srSv/abelSvBuild.sh
similarity index 76%
rename from src/hg/makeDb/scripts/abelSv/build.sh
rename to src/hg/makeDb/scripts/srSv/abelSvBuild.sh
index 0b7c7c7a1bd..f2ba59d357b 100755
--- src/hg/makeDb/scripts/abelSv/build.sh
+++ src/hg/makeDb/scripts/srSv/abelSvBuild.sh
@@ -1,51 +1,51 @@
 #!/usr/bin/env bash
 # Build the abelSv bigBed for hg38 from the public Abel et al. 2020 CCDG
 # structural-variant VCFs (B38 native + B37 lifted to hg38).
 #
-# Expects to be run from /hive/data/genomes/hg38/bed/abelSv/ (cwd) with
-# Build38.public.v2.vcf.gz and Build37.public.v2.vcf.gz already downloaded
-# (see makeDoc for download URLs).
+# Expects to be run from /hive/data/genomes/hg38/bed/srSv/abelSv/ (cwd)
+# with Build38.public.v2.vcf.gz and Build37.public.v2.vcf.gz already
+# downloaded (see makeDoc for download URLs).
 
 set -euo pipefail
 
-SCRIPTS=/cluster/home/max/kent/src/hg/makeDb/scripts/abelSv
+SCRIPTS=/cluster/home/max/kent/src/hg/makeDb/scripts/srSv
 CHAIN=/gbdb/hg19/liftOver/hg19ToHg38.over.chain.gz
 CHROMSIZES=/hive/data/genomes/hg38/chrom.sizes
 NTHREADS=8
 
 # --- B38: native GRCh38 ---
 echo "[$(date +%T)] processing B38 VCF..."
 zcat Build38.public.v2.vcf.gz \
-    | "$SCRIPTS/vcfToBed.py" B38 \
+    | "$SCRIPTS/abelSvVcfToBed.py" B38 \
     > B38.bed
 
 # --- B37: lift to hg38 ---
 echo "[$(date +%T)] processing B37 VCF..."
 zcat Build37.public.v2.vcf.gz \
-    | "$SCRIPTS/vcfToBed.py" B37 \
+    | "$SCRIPTS/abelSvVcfToBed.py" B37 \
     > B37.prelift.bed
 
 echo "[$(date +%T)] lifting B37 to hg38..."
 # bed has 35 columns; we use -bedPlus=9 -tab so liftOver passes extra
 # columns through unchanged.
 liftOver -tab -bedPlus=9 B37.prelift.bed "$CHAIN" \
     B37lift.bed B37.unmapped.bed
 
 echo " B37 mapped: $(wc -l < B37lift.bed)"
 echo " B37 unmapped: $(grep -c -v '^#' B37.unmapped.bed || true)"
 
 # --- merge + sort ---
 echo "[$(date +%T)] merging and sorting..."
 cat B38.bed B37lift.bed \
     | sort -k1,1 -k2,2n --parallel="$NTHREADS" \
     > abelSv.bed
 
 echo " total variants: $(wc -l < abelSv.bed)"
 
 # --- bigBed ---
 echo "[$(date +%T)] building bigBed..."
-bedToBigBed -type=bed9+26 -tab -as="$SCRIPTS/abelSv.as" \
+bedToBigBed -type=bed9+29 -tab -as="$SCRIPTS/abelSv.as" \
     abelSv.bed "$CHROMSIZES" abelSv.bb
 
 ls -lh abelSv.bb
 echo "[$(date +%T)] done."