cfd2b848b9ee185619c7e677269329964d147327
braney
  Mon Apr 27 16:10:16 2026 -0700
quickLiftBench: switch to saved-session comparisons, refs #37445

Replace the per-track URL-builder schema (db/track/hubUrl + positions
list) with saved-session refs of the form `user/sessionName`. Each
variant loads a saved session and renders at the session's own saved
region; native and quickLifted sessions live on different assemblies,
so identical chr:start-end ranges would not be biologically equivalent.

Drop hgt.reset=1 from the URL: it re-applied default-track visibility,
defeating hideTracks=1. Cart isolation comes from a fresh
requests.Session() per case (new hgsid -> fresh cart).

Headline metric is total_ms, parsed from the "Overall total time"
footer span; load_ms_sum and draw_ms_sum are summed across per-track
rows. Add bench1_hgwdev and bench1_rr cases pointing at the first real
native/lifted session pair (Brianraney/benchQuickNative1 vs
benchQuickList1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

diff --git src/utils/qa/quickLiftBench/README.md src/utils/qa/quickLiftBench/README.md
index 81863b418b6..5c6e8fadb92 100644
--- src/utils/qa/quickLiftBench/README.md
+++ src/utils/qa/quickLiftBench/README.md
@@ -1,147 +1,153 @@
 # quickLiftBench
 
-Benchmark suite that compares hgTracks render times for quickLifted tracks
-against their non-lifted counterparts. Output TSVs are intended as the raw
-numbers behind tables and figures in a quickLift performance paper.
+Benchmark suite that compares hgTracks render times for two saved sessions on
+the same server. The intended pairing is a **native** session (tracks rendered
+on their source assembly) against a **lifted** session (the same tracks
+rendered on a different assembly via quickLift). Output TSVs are intended as
+the raw numbers behind tables and figures in a quickLift performance paper.
 
 ## What it measures
 
 For each benchmark case, two (or more) named variants — typically `native`
-and `lifted` — are timed at multiple genomic positions across multiple
-iterations. Each request goes to hgTracks with `?measureTiming=1`; the
-response is parsed for:
-
-- per-track `loadTime` and `drawTime` from `printTrackTiming()` in
-  `hg/hgTracks/hgTracks.c`
-- phase timings from `<span class='timing'>` markers (chromAliasSetup, etc.)
-- HTTP wall time around the request
-
-Each (variant, position) cell does `warmup` discarded requests followed by
-`iterations` recorded requests. Min / median / p90 are reported.
-
-## Three comparison modes (one schema)
-
-- **Mode A — same hub, source vs dest db.** One bigBed referenced by two
-  trackDb stanzas: native on the source assembly, quickLift'd on the
-  destination. Same data, two render paths.
-- **Mode B — track pair on each assembly.** Two distinct trackDb stanzas
-  holding equivalent data, one native on assembly A, one quickLift'd on
-  assembly B.
-- **Mode C — lift on/off, same trackDb.** Side-by-side reference hub with
-  two stanzas pointing at the same bigBed file; one with
-  `quickLiftUrl`/`quickLiftDb`, one without. (quickLift activation is gated
-  on the trackDb setting, with no cart override, so the hub-side toggle is
-  the cleanest way to compare.)
+and `lifted` — are timed across multiple iterations. Each request loads a
+saved session into a fresh cart and asks hgTracks for the per-request timing
+breakdown. The session renders at the position it was saved with; the runner
+does **not** override `position`, since native and quickLifted variants live
+on different assemblies and the same chr:start-end is not biologically
+equivalent across them. To benchmark multiple regions, save additional
+session pairs and add them as separate cases.
+
+Each response is parsed for:
+
+- **Overall total time** — the headline number, taken from the
+  `<span class='timing'>Overall total time: NNN millis</span>` footer span.
+- **Per-track load and draw times** — summed across all visible tracks from
+  the `printTrackTiming()` table emitted into a `<span class='trackTiming'>`
+  block.
+- **HTTP wall time** — measured around the request itself.
+
+Each variant cell does `warmup` discarded requests followed by `iterations`
+recorded requests. Min / median / p90 are reported.
 
 ## Usage
 
 ```
 ./quickLiftBench.py [--config FILE] [--cases ID,ID]
                     [--server-override NAME]
                     [--iterations N] [--warmup N]
                     [--out DIR] [--verbose]
 ```
 
 Defaults: read `cases.yaml` next to the script, no server override, all
 cases, iterations and warmup from `defaults`, output to
 `./results/<timestamp>/`.
 
 Examples:
 
 ```
-# Run everything against hgwdev with the defaults from cases.yaml:
+# Run every case against the server named in its stanza:
 ./quickLiftBench.py
 
 # One case, against the sandbox, 10 iterations:
-./quickLiftBench.py --cases example_modeA_bigBed \
+./quickLiftBench.py --cases bench1_hgwdev \
                     --server-override sandbox --iterations 10
 
-# Quick smoke test:
-./quickLiftBench.py --cases example_modeA_bigBed --iterations 1 --warmup 0 -v
+# Quick smoke test against a single existing saved session:
+./quickLiftBench.py --cases smoke_session --iterations 1 --warmup 0 -v
 ```
 
 ## Config schema
 
 ```yaml
 defaults:
   iterations: 5
   warmup: 1
   timeout: 60
   servers:
     hgwdev: https://hgwdev.gi.ucsc.edu
     sandbox: https://hgwdev-braney.gi.ucsc.edu
     beta:   https://hgwbeta.soe.ucsc.edu
     rr:     https://genome.ucsc.edu
 
 cases:
   - id: case_id
     description: "..."
-    positions:
-      - {label: sparse, value: chr1:1000000-2000000}
-      - {label: dense,  value: chr19:50000000-51000000}
+    server: hgwdev          # one server for all variants in this case
     variants:
-      native: {server: hgwdev, db: hg19, hubUrl: ..., track: trackName}
-      lifted: {server: hgwdev, db: hg38, hubUrl: ..., track: trackName_qL}
+      native: User/sessionName_native     # user/sessionName
+      lifted: User/sessionName_lifted
     compare:
       - [native, lifted]
 ```
 
-Each variant URL is built as:
+Each variant value is a saved-session reference of the form
+`user/sessionName` (the same form as the `/s/<user>/<name>` short-link URL).
+Both the bare `User/Name` and the prefixed `/s/User/Name` forms are accepted.
+
+The URL the runner sends per iteration is:
 
 ```
-{server}/cgi-bin/hgTracks?db=DB&position=POS
-   &hideTracks=1&TRACK=full
-   &hubUrl=...
-   &hgt.trackImgOnly=1&hgt.reset=1&measureTiming=1
+{server}/cgi-bin/hgTracks?
+   hgS_doOtherUser=submit
+   &hgS_otherUserName=USER
+   &hgS_otherUserSessionName=NAME
+   &hgt.trackImgOnly=1
+   &measureTiming=1
 ```
 
-`hideTracks=1` plus the named track at `=full` isolates the single track.
-`hgt.reset=1` resets cart state per request, so cases do not contaminate each
-other. A fresh `requests.Session()` is also used per case to mint a new
-hgsid.
+Notes on URL choices:
+
+- `hgS_doOtherUser=submit` plus the user/session name causes hgTracks to
+  load the saved session into the cart (`cart.c:1715`). The session's saved
+  position is used.
+- `hgt.trackImgOnly=1` is the JS-redraw fast path: hgTracks emits the image
+  and map, then returns without rendering the rest of the page. With
+  `measureTiming=1` it also emits the per-track timing block.
+- A fresh `requests.Session()` per case mints a new hgsid (and thus a fresh
+  cart) so cases do not contaminate each other.
 
 ## Adding a case
 
-1. Pick a track that exists both as a native annotation (or on its source
-   assembly) and as a quickLift'd target. For Mode C, build (or point to) a
-   side-by-side hub.
-2. Pick at least two positions: one sparse (low item count after lift) and
-   one dense. Position labels show up in `summary.tsv`.
-3. Add a stanza to `cases.yaml` following the schema above. List variant
-   pairs to compare under `compare`.
-4. Smoke-test with `--cases <new_id> --iterations 1 --warmup 0 -v` to verify
-   the URL renders and the per-track timing parses out.
+1. Save two sessions on the target server that differ only in the dimension
+   you want to measure (typically: native vs. quickLifted versions of the
+   same set of tracks). Each session should be saved at the position you
+   want it benchmarked at.
+2. Add a stanza to `cases.yaml` following the schema above.
+3. Smoke-test with `--cases <new_id> --iterations 1 --warmup 0 -v` to verify
+   sessions load and timings parse out.
 
 ## Output
 
 Two TSVs are written to `results/<YYYYMMDD-HHMMSS>/`:
 
-- `results.tsv` — one row per (case, variant, position, iteration) with
-  http_ms, load_ms, draw_ms, total_ms, status_code, error.
+- `results.tsv` — one row per (case, variant, iteration) with
+  http_ms, load_ms_sum, draw_ms_sum, n_tracks, total_ms, status_code, error.
 - `summary.tsv` — two sections:
-  1. per (case, position, variant): n, n_ok, http/load/draw/total median
-     and p90.
-  2. per (case, position, compare-pair): left vs right medians and the
+  1. per (case, variant): n, n_ok, http/load_sum/draw_sum/total median and p90.
+  2. per (case, compare-pair): left vs right total medians and the
      `right/left` ratio for each metric.
 
 A short pairwise table is also printed to stderr at the end of a run.
 
 ## Dependencies
 
 ```
 pip install requests pyyaml
 ```
 
 ## Notes
 
 - The script does not parallelize requests against a single server.
   quickLift renders are single-threaded per request; parallel requests would
   measure contention rather than work.
 - If hgTracks returns the bot-block page or an `errAbort`, the row is
   written with `error` set and `*_ms` empty rather than aborting the run.
-- Timing is wall time inside hgTracks for `load_ms` / `draw_ms`. HTTP wall
-  time also includes network and CGI startup; treat it as a sanity check,
-  not as the headline number.
+- `total_ms` is the wall time inside hgTracks for the full request (cart
+  load + track load + track draw + page assembly). `http_ms` adds network
+  and CGI startup; treat it as a sanity check, not as the headline number.
+- Each request reloads the saved session into a fresh cart, so the
+  per-request work includes session unmarshaling. That cost is similar across
+  variants; being additive, it cancels in differences, not in the ratio.
 - For paper-quality numbers, run repeatedly across hours of the day or
   pin to a quiet host; render times on a shared dev server have noticeable
   load-dependent jitter.
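
Reviewer sketch, not part of the patch: the session-ref handling and total-time parsing described in the new README could look roughly like the following. The helper names (`parse_session_ref`, `build_url`, `parse_total_ms`) are hypothetical and are not taken from `quickLiftBench.py`. The regex assumes the footer span is emitted exactly as the README quotes it.

```python
import re
from urllib.parse import urlencode

# Hypothetical helpers for illustration; the real runner may be organized differently.


def parse_session_ref(ref):
    """Split 'User/Name' or '/s/User/Name' into (user, session_name)."""
    trimmed = ref.strip()
    if trimmed.startswith("/s/"):
        trimmed = trimmed[len("/s/"):]
    user, sep, name = trimmed.partition("/")
    if not sep or not user or not name or "/" in name:
        raise ValueError(f"expected user/sessionName, got {ref!r}")
    return user, name


def build_url(server, ref):
    """Build the per-iteration hgTracks URL from the README's parameter list."""
    user, name = parse_session_ref(ref)
    params = [
        ("hgS_doOtherUser", "submit"),
        ("hgS_otherUserName", user),
        ("hgS_otherUserSessionName", name),
        ("hgt.trackImgOnly", "1"),
        ("measureTiming", "1"),
    ]
    return f"{server.rstrip('/')}/cgi-bin/hgTracks?{urlencode(params)}"


# Assumed footer format: <span class='timing'>Overall total time: NNN millis</span>
TOTAL_RE = re.compile(
    r"<span class='timing'>Overall total time: (\d+) millis</span>"
)


def parse_total_ms(html):
    """Return total_ms as an int, or None if the footer span is absent."""
    match = TOTAL_RE.search(html)
    return int(match.group(1)) if match else None
```

In a runner, `build_url` would be called once per variant and the resulting URL fetched through a fresh `requests.Session()` per case, as the README notes. Returning `None` from `parse_total_ms` leaves it to the caller to record an error row instead of aborting the run.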