b85c12cf9af0ee1a954b8cced961bcbd909b7979
lrnassar
  Wed Apr 29 12:04:36 2026 -0700
Expand dbVar tracks to expose all six nstd186 source studies and add new Somatic and Other composites. refs #37406

Restructure the dbVar supertrack:
- Renamed from "dbVar Common Struct Var" to "dbVar Struct Var".
- dbVar Common SV: added subtracks for Lee, Abel, and Byrska-Bishop (the
three nstd186 source studies that were missing from our Curated Common
track), and for the American/East Asian/South Asian/Other populations.
- dbVar Conflict SV: description page refreshed; subtrack longLabel
clarified.
- dbVar Somatic SV (new): single subtrack pulling somatic_sv.bb from the
dbVar hub. Default hidden.
- dbVar Other SV (new): residual bucket for dbVar SVs not classified as
common, somatic, or clinical, split into Healthy and Phenotype subtracks.
Default hidden. NCBI sometimes calls this "presumed normal"; the
description page notes the equivalence. mergeSpannedItems is enabled for
the dense subtracks (normal_healthy ~5.6M items, normal_phenotype ~410K,
somatic_sv ~67K).
- ClinVar SVs are not duplicated; description pages cross-link to the
existing ClinVar track instead.

Description pages: rewrite dbVarCommon.html and dbVarCurated.html, refresh
dbVarConflict.html, add dbVarSomatic.html and dbVarOther.html. Retire the
unused dbVar_common.html. Methods links now point at NCBI's dbVar Overview
rather than the FTP directory listing. searchTable termRegex widened to
^[den]ssv[0-9]+ so dssv* accessions in normal_healthy resolve.
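
A quick sanity check of the widened pattern, as a sketch only: it uses Python's
re module rather than the browser's own search code, and the accession strings
are made up for illustration. The previous termRegex is not quoted in this
commit, so it is not reproduced here.

```python
import re

# Widened searchTable termRegex, copied from this commit message.
TERM_REGEX = re.compile(r"^[den]ssv[0-9]+")


def matches_term(term: str) -> bool:
    """Return True if the search term starts like a dssv/essv/nssv accession."""
    return TERM_REGEX.match(term) is not None


if __name__ == "__main__":
    # Hypothetical accession strings, for illustration only.
    for term in ("dssv123", "essv456", "nssv789", "nsv789", "ssv1"):
        print(term, matches_term(term))
```

The pattern is anchored only at the start, so a term with trailing characters
after the digits would still match.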

Otto: stage downloads to release/${db}.new/, validate per file (size
floor and 10% itemCount delta vs the current live copy), then atomically
swap via directory rename with a one-cycle .prev rollback. On validation
failure, leave .new/ in place for human inspection and exit non-zero so
the wrapper emails. On no-op runs the wrapper now stays silent.
checkNstd175.sh's "update done" message moved inside the update branch so
silence is honoured. New-file detection (via a knownFiles.txt manifest)
emails when NCBI adds a file we don't yet expose. knownFiles.txt itself
lives only at the deployment path under /hive/data/outside/otto/dbVar/,
not in the tree.
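
The validate-then-swap flow can be sketched roughly as follows. This is a
minimal illustration in Python, not the otto script itself: the function
names, the MIN_SIZE_BYTES value, and the handling of a missing live copy are
assumptions, and item counts are passed in rather than computed because the
commit does not say how they are obtained.

```python
import os
import shutil
from pathlib import Path

MIN_SIZE_BYTES = 1024   # assumed size floor; the real threshold is not stated
MAX_ITEM_DELTA = 0.10   # 10% itemCount delta vs the live copy, per the commit


def validate_file(new_size: int, new_items: int, live_items: int) -> bool:
    """Check one staged file against the size floor and the itemCount delta."""
    if new_size < MIN_SIZE_BYTES:
        return False
    if live_items == 0:
        # Assumption: with no live copy to compare against, accept the file.
        return True
    delta = abs(new_items - live_items) / live_items
    return delta <= MAX_ITEM_DELTA


def swap_release(release: Path) -> None:
    """Promote <release>.new to <release>, keeping one <release>.prev.

    Each rename is atomic on a single filesystem; the pair of renames is
    not, so there is a brief window with no live directory.
    """
    new = release.with_name(release.name + ".new")
    prev = release.with_name(release.name + ".prev")
    if prev.exists():
        shutil.rmtree(prev)          # keep only one cycle of rollback
    if release.exists():
        os.rename(release, prev)     # current live copy becomes .prev
    os.rename(new, release)          # staged copy becomes live
```

A caller would run validate_file on every staged file first and call
swap_release only if all pass; on any failure it would leave the .new
directory untouched and exit non-zero.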

diff --git src/hg/makeDb/trackDb/human/dbVarCurated.html src/hg/makeDb/trackDb/human/dbVarCurated.html
index c34af448450..a90e666c4fb 100644
--- src/hg/makeDb/trackDb/human/dbVarCurated.html
+++ src/hg/makeDb/trackDb/human/dbVarCurated.html
@@ -1,99 +1,130 @@
 <h2>Description</h2>
 <p>
-The tracks listed here contain data from the
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/content/var_summary/#nstd186">
-nstd186 (NCBI Curated Common Structural Variants)</a> study. This is a collection of structural
-variants (SV) originally submitted to dbVar which are part of a study with at least 100 samples and
-have an allele frequency of &gt;=0.01 in at least one population. The complete dataset is imported
-from these common-population studies:
+This super-track groups structural variant (SV) tracks from
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/">dbVar</a>, NCBI's archive of human
+genomic structural variation. The data are mirrored from the
+<a target=_blank href="https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/">NCBI dbVar track
+hub</a>.
 </p>
 
 <p>
-<b>gnomAD Structural Variants</b>
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd166/">(nstd166)</a>:
- Catalog of SVs detected from the sequencing of the complete genome of 10,847 unrelated
-individuals from the GnomAD v2.1 release.</p>
-<p>
-<b>1000 Genomes Consortium Phase 3 Integrated SV</b>
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/estd219/">(estd219)</a>:
- Structural variants of the 1000 Genomes project Phase 3 as reported in a separate article
-specifically dedicated to the analysis of SVs. Many of these data are identical to those reported
-in the <a target=_blank 
-href="https://www.ncbi.nlm.nih.gov/dbvar/studies/estd214/">estd214</a> study.</p>
+There are four track collections in this super-track:
+</p>
+<ul>
+<li><a href="/cgi-bin/hgTrackUi?g=dbVar_common"><b>NCBI dbVar Curated Common Structural Variants
+(dbVar Common SV)</b></a>: Copy-number and other variants from the
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186</a> study
+(NCBI Curated Common Structural Variants), split into subtracks by source study and by population.</li>
+<li><a href="/cgi-bin/hgTrackUi?g=dbVar_conflict"><b>NCBI dbVar Curated Conflict Variants
+(dbVar Conflict SV)</b></a>: Variants from nstd186 that overlap clinical variants in
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd102/">nstd102</a>
+(Clinical Structural Variants).</li>
+<li><a href="/cgi-bin/hgTrackUi?g=dbVar_somatic"><b>NCBI dbVar Somatic Structural Variants
+(dbVar Somatic SV)</b></a>: SVs with somatic origin, aggregated across six dbVar studies
+including COSMIC.</li>
+<li><a href="/cgi-bin/hgTrackUi?g=dbVar_other"><b>NCBI dbVar Other Structural Variants
+(dbVar Other SV)</b></a>: SVs in dbVar with no reported phenotype, or with a reported phenotype but
+not classified as clinical or somatic, excluding variants already present in the Common and Somatic tracks or in ClinVar.
+NCBI sometimes refers to this category as <em>presumed normal</em> SVs.</li>
+</ul>
+
 <p>
-<b>DECIPHER Common CNVs</b>
-<a href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd183/">(nstd183)</a>:
-Consensus set of common population CNVs selected from high-resolution controls sets where frequency
-information is available.
+Clinical structural variants from dbVar study nstd102 are not duplicated here; they are available
+in our dedicated <a href="/cgi-bin/hgTrackUi?g=clinvar">ClinVar</a> track (subtrack
+<em>ClinVar CNVs</em>), which pulls from the same underlying ClinVar XML release.
 </p>
 
+<h2>Source Studies in nstd186 (Common SV)</h2>
 <p>
-There are two tracks in this collection:
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186</a> is a
+curated collection of SVs from studies with at least 100 samples and allele frequency &gt;= 0.01
+in at least one population. It aggregates data from six source studies:
+</p>
 <ul>
-<li><a href="/cgi-bin/hgTrackUi?g=dbVar_common">
-<b>NCBI dbVar Curated Common Structural Variants (dbVar Common SV)</b></a>: Shows copy number
-variants calls (variants &gt;=50 nucleotides) from the <a target=_blank 
-href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186</a> study.</li>
-<li><a href="/cgi-bin/hgTrackUi?g=dbVar_conflict">
-<b>NCBI dbVar Curated Conflict Variants (dbVar Conflict SV)</b></a>: Shows copy number
-variants from <a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186</a>
-(NCBI Curated Common Structural Variants) that overlap with
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd102/">nstd102</a> (Clinical
-Structural Variants).</li>
+<li><b>1000 Genomes Consortium Phase 3 Integrated SV</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/estd219/">estd219</a>),
+added 2016</li>
+<li><b>gnomAD Structural Variants</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd166/">nstd166</a>),
+added 2019 &#8212; SVs from the sequencing of 10,847 unrelated individuals (gnomAD v2.1)</li>
+<li><b>DECIPHER Consensus CNVs</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd183/">nstd183</a>),
+added 2020</li>
+<li><b>Lee et al. 2020</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd194/">nstd194</a>),
+added 2021</li>
+<li><b>Abel et al. 2020</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd200/">nstd200</a>),
+added 2021</li>
+<li><b>Byrska-Bishop et al. 2022</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd206/">nstd206</a>),
+added 2022 &#8212; high-coverage WGS of the expanded 1000 Genomes sample set</li>
 </ul>
+
+<p>
+Variants must be of a qualifying structural variant type (deletions, duplications, insertions,
+copy number variants, and mobile element variants). For the latest statistics and version
+history, see the
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/">nstd186 summary
+page</a> at NCBI.
 </p>
 
 <h2>Display Conventions</h2>
 <p>
-These tracks are multi-view composite tracks that contain multiple data types (views). Each view
-within a track has separate display controls, as described
-<a href="../../goldenPath/help/multiView.html">here</a>. Some dbVar tracks
-contain multiple subtracks, corresponding to subsets of data. If a track contains many subtracks,
-only some subtracks will be displayed by default. The user can select which subtracks are displayed
-via the display controls on the track details page.
+These tracks are composite tracks that contain multiple subtracks. Each subtrack has its own
+display controls, as described <a href="../../goldenPath/help/multiView.html">here</a>. Items are
+colored by variant type using the dbVar color scheme
+(<a target="_blank" href="https://www.ncbi.nlm.nih.gov/dbvar/content/overview/">dbVar Overview</a>):
+</p>
+<table>
+<thead><tr>
+<th style="border-bottom: 2px solid #6678B1;">Color</th>
+<th style="border-bottom: 2px solid #6678B1;">Variant Type(s)</th>
+</tr></thead>
+<tbody>
+<tr><th bgcolor="#ff0000"></th><td>deletion, copy number loss</td></tr>
+<tr><th bgcolor="#0000ff"></th><td>duplication, copy number gain, insertion</td></tr>
+<tr><th bgcolor="#4f1c73"></th><td>copy number variation</td></tr>
+</tbody>
+</table>
+<p>
+Some composites display additional colors for less common variant types. Refer to each composite
+track's description page for the full legend.
 </p>
 
 <h2>Data Access</h2>
 <p>
 The raw data can be explored interactively with the
 <a href="hgTables">Table Browser</a>, or the
 <a href="hgIntegrator">Data Integrator</a>. For automated analysis,
 the data may be queried from our
 <a href="../../goldenPath/help/api.html">REST API</a>.
-<p>The data can also be found directly from the <a target=_blank 
-href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/#data_access">dbVar 
-nstd186 data access</a>, as well as in the
-<a href="hgHubConnect?hubUrl=
-https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/hub.txt&hgHub_do_redirect=on">
-dbVar Track Hub</a>, where additional subtracks are included. For questions about
-dbVar track data, please contact <A HREF="mailto:&#100;&#98;&#118;&#97;r&#64;&#110;&#99;&#98;&#105;.
-n&#108;&#109;.
-&#110;&#105;&#104;.
-&#103;&#111;v">
-&#100;&#98;&#118;&#97;r&#64;&#110;&#99;&#98;&#105;.
-n&#108;&#109;.
-&#110;&#105;&#104;.
-&#103;&#111;v</A>.
-<!-- above address is dbvar at ncbi.nlm.nih.gov -->
 </p>
-
+<p>
+The data can also be found directly at the
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/#data_access">dbVar
+nstd186 data access</a> page, or in the
+<a href="hgHubConnect?hubUrl=https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/hub.txt&hgHub_do_redirect=on">dbVar
+Track Hub</a>, where additional subtracks (e.g., population-exclusive variants, ClinVar SVs) are
+available. For questions about dbVar track data, please contact
+<A HREF="mailto:&#100;&#98;&#118;&#97;r&#64;&#110;&#99;&#98;&#105;.n&#108;&#109;.&#110;&#105;&#104;.&#103;&#111;v">&#100;&#98;&#118;&#97;r&#64;&#110;&#99;&#98;&#105;.n&#108;&#109;.&#110;&#105;&#104;.&#103;&#111;v</A>.
+<!-- above address is dbvar at ncbi.nlm.nih.gov -->
 </p>
 
 <h2>Credits</h2>
 <p>
-Thanks to the dbVAR team at NCBI, especially John Lopez and Timothy Hefferon for technical 
+Thanks to the dbVar team at NCBI, especially John Lopez and Timothy Hefferon for technical
 coordination and consultation, and to Christopher Lee, Anna Benet-Pages, and Daniel Schmelter of
-the Genome Browser team for engineering the track display.</p>
+the Genome Browser team for engineering the track display.
+</p>
 
 <h2>References</h2>
-
 <p>
 Lappalainen I, Lopez J, Skipper L, Hefferon T, Spalding JD, Garner J, Chen C, Maguire M, Corbett M,
 Zhou G <em>et al</em>.
 <a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gks1213" target="_blank">
 DbVar and DGVa: public archives for genomic structural variation</a>.
 <em>Nucleic Acids Res</em>. 2013 Jan;41(Database issue):D936-41.
-PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23193291" target="_blank">23193291</a>; PMC: <a
-href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531204/" target="_blank">PMC3531204</a>
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23193291" target="_blank">23193291</a>;
+PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531204/" target="_blank">PMC3531204</a>
 </p>
-