b85c12cf9af0ee1a954b8cced961bcbd909b7979
lrnassar
  Wed Apr 29 12:04:36 2026 -0700
Expand dbVar tracks to expose all six nstd186 source studies and add new Somatic and Other composites. refs #37406

Restructure the dbVar supertrack:
- Renamed from "dbVar Common Struct Var" to "dbVar Struct Var".
- dbVar Common SV: added subtracks for Lee, Abel, and Byrska-Bishop (the
three nstd186 source studies that were missing from our Curated Common
track), and for the American/East Asian/South Asian/Other populations.
- dbVar Conflict SV: description page refreshed; subtrack longLabel
clarified.
- dbVar Somatic SV (new): single subtrack pulling somatic_sv.bb from the
dbVar hub. Default hidden.
- dbVar Other SV (new): residual bucket for dbVar SVs not classified as
common, somatic, or clinical, split into Healthy and Phenotype subtracks.
Default hidden. NCBI sometimes calls this "presumed normal"; the
description page notes the equivalence. mergeSpannedItems is on for the
dense subtracks (normal_healthy ~5.6M items, normal_phenotype ~410K,
somatic_sv ~67K).
- ClinVar SVs are not duplicated; description pages cross-link to the
existing ClinVar track instead.

Description pages: rewrite dbVarCommon.html and dbVarCurated.html, refresh
dbVarConflict.html, add dbVarSomatic.html and dbVarOther.html. Retire the
unused dbVar_common.html. Methods links now point at NCBI's dbVar Overview
rather than the FTP directory listing. searchTable termRegex widened to
^[den]ssv[0-9]+ so dssv* accessions in normal_healthy resolve.
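
To illustrate the widened termRegex, here is a minimal bash sketch. Only the
pattern itself comes from this commit; the function name and the sample
accessions are hypothetical and exist purely for illustration.

```bash
#!/bin/bash
# Sketch only: check a search term against the widened termRegex.
# matchesDbVarAccession is a hypothetical helper, not part of the tree.
matchesDbVarAccession () {
    local re='^[den]ssv[0-9]+'   # pattern quoted from the commit message
    [[ $1 =~ $re ]]
}

# Made-up accessions: the first three match, the last does not.
for term in dssv123 essv456 nssv789 xssv1; do
    if matchesDbVarAccession "$term"; then
        echo "$term: match"
    else
        echo "$term: no match"
    fi
done
```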

Otto: stage downloads to release/${db}.new/, validate per file (size
floor and 10% itemCount delta vs the current live copy), then atomically
swap via directory rename with a one-cycle .prev rollback. On validation
failure, leave .new/ in place for human inspection and exit non-zero so
the wrapper emails. On no-op runs the wrapper now stays silent.
checkNstd175.sh's "update done" message moved inside the update branch so
silence is honoured. New-file detection (via a knownFiles.txt manifest)
emails when NCBI adds a file we don't yet expose. knownFiles.txt itself
lives only at the deployment path under /hive/data/outside/otto/dbVar/,
not in the tree.
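
The stage/validate/swap flow described above can be sketched roughly as
follows. This is an illustration under stated assumptions, not the otto
script itself: validateFile, swapRelease, MIN_BYTES and MAX_DELTA_PCT are
hypothetical names, the 1024-byte floor is an assumed value, and item counts
are passed in as plain arguments because how the real script obtains them is
not shown here.

```bash
#!/bin/bash
# Sketch only. All function and variable names below are hypothetical.
set -eu -o pipefail

MIN_BYTES=${MIN_BYTES:-1024}        # assumed size floor, in bytes
MAX_DELTA_PCT=${MAX_DELTA_PCT:-10}  # allowed itemCount change vs. the live copy

# validateFile FILE NEWCOUNT OLDCOUNT
# Returns non-zero if FILE is under the size floor or its item count
# differs from the live count by more than MAX_DELTA_PCT percent.
validateFile () {
    local f=$1 newCount=$2 oldCount=$3
    local size delta
    size=$(wc -c < "$f" | tr -d ' ')
    if [ "$size" -lt "$MIN_BYTES" ]; then
        echo "validation failed: $f is $size bytes (floor $MIN_BYTES)" >&2
        return 1
    fi
    delta=$(( newCount - oldCount ))
    if [ "$delta" -lt 0 ]; then
        delta=$(( -delta ))
    fi
    # Integer form of: |new - old| / old > MAX_DELTA_PCT / 100
    if [ $(( delta * 100 )) -gt $(( oldCount * MAX_DELTA_PCT )) ]; then
        echo "validation failed: $f itemCount $oldCount -> $newCount" >&2
        return 1
    fi
    return 0
}

# swapRelease RELEASEROOT DB
# Promote DB.new to DB, keeping the previous live copy as DB.prev.
# Each mv is a single rename when all three paths share a filesystem.
swapRelease () {
    local root=$1 db=$2
    rm -rf "${root}/${db}.prev"      # keep only one cycle of rollback
    if [ -d "${root}/${db}" ]; then
        mv "${root}/${db}" "${root}/${db}.prev"
    fi
    mv "${root}/${db}.new" "${root}/${db}"
}
```

A caller would run validateFile on every staged file and call swapRelease
only if all of them pass. On any failure it would exit non-zero without
touching the .new/ directory, which is what leaves it available for
inspection.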

diff --git src/hg/utils/otto/dbVar/checkNstd175.sh src/hg/utils/otto/dbVar/checkNstd175.sh
index 67846c9d3f6..af9ce5d9b0e 100755
--- src/hg/utils/otto/dbVar/checkNstd175.sh
+++ src/hg/utils/otto/dbVar/checkNstd175.sh
@@ -1,21 +1,23 @@
 #!/bin/bash
 
 #	Do not modify this script, modify the source tree copy:
-#	src/hg/utils/dbVar/checkNstd175.sh
+#	src/hg/utils/otto/dbVar/checkNstd175.sh
+#	The source tree copy is installed to $WORKDIR via the makefile
+#	in the same directory (make install).
 
-set -beEu -o pipefail
+set -eEu -o pipefail
 WORKDIR=$1
 today=`date +%F`
 
 cleanUpOnError () {
     echo "dbVar nstd175 build failed"
 } 
 
 trap cleanUpOnError ERR
 trap "cleanUpOnError; exit 1" SIGINT SIGTERM
 umask 002
 
 mkdir -p ${WORKDIR}/${today}/giab
 cd ${WORKDIR}/${today}/giab
 rm -f ftp.giab.rsp
 echo "user anonymous otto@soe.ucsc.edu
@@ -64,17 +66,17 @@
         if [ ${grc} == "GRCh37" ]; then
             db="hg19"
         else
             db="hg38"
         fi
         mkdir -p ${db}
         pushd ${db} > /dev/null
         echo "processing nstd175 for ${db}"
         hgsql -Ne 'select 0, ca.alias, size, ca.chrom, size from chromInfo ci join chromAlias ca on ci.chrom = ca.chrom where source = "refseq"' ${db} > ${db}.lift
         wget -N -q "ftp://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/Homo_sapiens/by_study/gvf/nstd175.${grc}.variant*.gvf.gz"
         zcat nstd175.${grc}.* | ../../../processNstd175.py stdin ${db}.lift | sort -k1,1 -k2,2n | bedClip -truncate stdin /hive/data/genomes/${db}/chrom.sizes stdout > giabSv.bed
         bedToBigBed -type=bed9+11 -as=../../../giabSv.as -tab giabSv.bed /hive/data/genomes/${db}/chrom.sizes giabSv.bb
         cp giabSv.bb ${WORKDIR}/release/${db}/giabSv.bb
         popd > /dev/null
     done
-fi
     echo "dbVar nstd175 update done"
+fi