349adcd7c26d73b53ec1865a4fd14c60b74f9386
braney
  Wed May 13 09:18:36 2026 -0700
trackDbCache: CACHE_TRACK_DB_DIR env var override; add trackDbCacheBench, refs #37551

trackDbCacheOn() in src/hg/lib/trackDbCache.c now reads
CACHE_TRACK_DB_DIR from the environment ahead of the cacheTrackDbDir
hg.conf setting. When the env var is set its value wins, including
the empty string (which disables the cache). This lets a benchmark
harness switch caching on and off per hgTracks invocation without
editing hg.conf.

trackDbCacheBench (src/utils/qa/trackDbCacheBench/) drives hgTracks
through cached and uncached runs, with warmups and per-iteration
median/min/max timings, and an --evict-cache option that uses
posix_fadvise(DONTNEED) to drop cache files from the OS page cache
between iterations so disk-backed cache directories can be compared
to /dev/shm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

diff --git src/utils/qa/trackDbCacheBench/README.md src/utils/qa/trackDbCacheBench/README.md
new file mode 100644
index 00000000000..b10b70f3149
--- /dev/null
+++ src/utils/qa/trackDbCacheBench/README.md
@@ -0,0 +1,60 @@
+# trackDbCacheBench
+
+Benchmark `hgTracks` rendering with and without the trackDb cache, and
+compare different cache directories (e.g. tmpfs vs disk-backed).
+
+## What it measures
+
+For each cache mode (`cached`, `no-cache`) the script does `--warmup`
+discarded `hgTracks` runs followed by `--iterations` timed runs. Per run
+it captures:
+
+- **wall time** measured around the `hgTracks` subprocess
+- **internal time** parsed from the `<span class='timing'>Overall total time:
+  NNN millis</span>` footer when `measureTiming=1` is in the CGI args (the
+  default scenario sets this)
+
+Results print as min / median / mean / max per mode, plus a
+`cached/no-cache` ratio.
+
+## How the cache is toggled
+
+The script sets `CACHE_TRACK_DB_DIR` in the child environment. The matching
+piece of C is `trackDbCacheOn()` in `src/hg/lib/trackDbCache.c`: if that
+environment variable is set (even to the empty string), its value
+overrides the `cacheTrackDbDir` setting in `hg.conf`. Empty value disables
+caching for that run; a non-empty value names the cache directory.
+
+## hg.conf
+
+By default the script picks up the `hg.conf` next to the `hgTracks` binary
+(typically `/usr/local/apache/cgi-bin-$USER/hg.conf`), which is what apache
+actually serves. Override with `--hg-conf`. Letting it default to your
+personal `~/.hg.conf` is usually a bad idea -- if that points
+`db.trackDb` at a personal trackDb table the cold trackDb load can be
+orders of magnitude slower than what production sees.
+
+## Usage
+
+    trackDbCacheBench.py CACHE_DIR [options]
+
+Common runs:
+
+    # default scenario: hg38, chr1:1-1000000, hgt.trackImgOnly=1
+    trackDbCacheBench.py /dev/shm/myCache
+
+    # cached mode only, with the cache forcibly emptied first (cold rebuild)
+    trackDbCacheBench.py /dev/shm/myCache --mode cached --clear-cache-before
+
+    # custom CGI scenario
+    trackDbCacheBench.py /dev/shm/myCache \
+        --cgi db=hg38 --cgi position=chr12:5000000-7000000 \
+        --cgi hgt.trackImgOnly=1 --cgi measureTiming=1
+
+    # cold-from-disk: fsync + posix_fadvise(DONTNEED) on cache files before
+    # each iteration; isolates real I/O cost of the cache directory
+    trackDbCacheBench.py /data/myCache --evict-cache --mode cached
+
+`--evict-cache` only meaningfully affects disk-backed cache directories.
+On tmpfs (`/dev/shm`) it is essentially a no-op because tmpfs is the page
+cache -- there is no underlying storage to fault back in from.