b85c12cf9af0ee1a954b8cced961bcbd909b7979
lrnassar
  Wed Apr 29 12:04:36 2026 -0700
Expand dbVar tracks to expose all six nstd186 source studies and add new Somatic and Other composites. refs #37406

Restructure the dbVar supertrack:
- Renamed from "dbVar Common Struct Var" to "dbVar Struct Var".
- dbVar Common SV: added subtracks for Lee, Abel, and Byrska-Bishop (the
three nstd186 source studies that were missing from our Curated Common
track), and for the American/East Asian/South Asian/Other populations.
- dbVar Conflict SV: description page refreshed; subtrack longLabel
clarified.
- dbVar Somatic SV (new): single subtrack pulling somatic_sv.bb from the
dbVar hub. Default hidden.
- dbVar Other SV (new): residual bucket for dbVar SVs not classified as
common, somatic, or clinical, split into Healthy and Phenotype subtracks.
Default hidden. NCBI sometimes calls this "presumed normal"; the
description page notes the equivalence. mergeSpannedItems on for the
dense subtracks (normal_healthy ~5.6M items, normal_phenotype ~410K,
somatic_sv ~67K).
- ClinVar SVs are not duplicated; description pages cross-link to the
existing ClinVar track instead.

Description pages: rewrite dbVarCommon.html and dbVarCurated.html, refresh
dbVarConflict.html, add dbVarSomatic.html and dbVarOther.html. Retire the
unused dbVar_common.html. Methods links now point at NCBI's dbVar Overview
rather than the FTP directory listing. searchTable termRegex widened to
^[den]ssv[0-9]+ so dssv* accessions in normal_healthy resolve.
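
As a rough illustration of the widened termRegex (a sketch only: Python's re
module stands in for the browser's own regex engine, and the accessions below
are made up for the example, not taken from normal_healthy):

```python
import re

# Pattern quoted in the commit message. Python's `re` is used purely for
# illustration; the browser evaluates termRegex with its own regex engine.
TERM_REGEX = re.compile(r"^[den]ssv[0-9]+")

# Hypothetical accessions, one per prefix, plus two that should not match.
samples = ["nssv1234567", "essv7654321", "dssv42", "rs12345", "ssv99"]

# Keep only the terms that the search spec would route to this track.
resolved = [s for s in samples if TERM_REGEX.match(s)]
print(resolved)
```

Only the three terms beginning with d, e, or n followed by "ssv" and digits
are kept, which is the behaviour the dssv* fix is after.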

Otto: stage downloads to release/${db}.new/, validate per file (size
floor and 10% itemCount delta vs the current live copy), then atomically
swap via directory rename with a one-cycle .prev rollback. On validation
failure, leave .new/ in place for human inspection and exit non-zero so
the wrapper emails. On no-op runs the wrapper now stays silent.
checkNstd175.sh's "update done" message moved inside the update branch so
silence is honoured. New-file detection (via a knownFiles.txt manifest)
emails when NCBI adds a file we don't yet expose. knownFiles.txt itself
lives only at the deployment path under /hive/data/outside/otto/dbVar/,
not in the tree.
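
The validate-then-swap flow described above can be sketched roughly as follows.
This is not the Otto script itself (that is shell); the function names, the
size_floor parameter, and the handling of a missing baseline are assumptions
made for the example. Each rename is atomic on its own, but the two-step swap
as a whole is not a single atomic operation.

```python
import os
import shutil


def validate_file(size_bytes, item_count, live_item_count, size_floor, max_delta=0.10):
    """Return True if a staged file passes the size floor and itemCount delta checks."""
    if size_bytes < size_floor:
        return False
    # Assumption: with no live baseline (count of 0) the delta check is skipped.
    if live_item_count > 0:
        delta = abs(item_count - live_item_count) / live_item_count
        if delta > max_delta:
            return False
    return True


def swap_release(release_root, db):
    """Promote <db>.new to <db>, keeping the old copy as <db>.prev for one cycle."""
    live = os.path.join(release_root, db)
    new = live + ".new"
    prev = live + ".prev"
    if os.path.isdir(prev):
        shutil.rmtree(prev)       # drop the rollback copy from the previous cycle
    if os.path.isdir(live):
        os.rename(live, prev)     # current live copy becomes the rollback copy
    os.rename(new, live)          # staged copy becomes live
```

A caller would run validate_file on every staged file and call swap_release
only if all of them pass; on any failure it would leave the .new directory
untouched and exit non-zero.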

diff --git src/hg/makeDb/trackDb/human/dbVarCommon.html src/hg/makeDb/trackDb/human/dbVarCommon.html
index 015acbff1e5..cce6a323ed3 100644
--- src/hg/makeDb/trackDb/human/dbVarCommon.html
+++ src/hg/makeDb/trackDb/human/dbVarCommon.html
@@ -1,110 +1,158 @@
 <h2>Description</h2>
 
 <p>
-This track displays common copy number genomic variations from <a target=_blank
-href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186 (NCBI Curated Common
-Structural Variants)</a>, divided into subtracks according to population and source of original
-submission.
+This track displays common structural variants (SVs) from
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186
+(NCBI Curated Common Structural Variants)</a>, divided into subtracks by source study and by
+population.
 </p>
 
 <p>
-This curated dataset of all structural variants in dbVar includes variants from <b>gnomAD</b>, <b>1000
-Genomes Phase 3</b>, and <b>DECIPHER</b> (dbVar studies
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd166/">nstd166</a>,
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/estd219/">estd219</a>, and
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd183/">nstd183</a>, respectively).
+nstd186 is a curated collection of structural variants in
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/">dbVar</a>. Each variant comes from a
+study with at least 100 samples that reports allele frequency data, and has an allele frequency of
+&gt;=0.01 in at least one population. It includes copy number gains and losses, copy number
+variations, duplications, deletions, insertions, and mobile element variants (ALU, LINE1, SVA, HERV).
 </p>
 
 <p>
-It only includes copy number gain, copy number loss, copy number variation, duplications, and
-deletions (including mobile element deletions), that are part of a study with at least 100 samples,
-include allele frequency data, and have an allele frequency of &gt;=0.01 in at least one population.
+The dataset aggregates variants from six source studies:
 </p>
+<ul>
+<li><b>gnomAD Structural Variants</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd166/">nstd166</a>):
+SVs from the sequencing of 10,847 unrelated individuals in the gnomAD v2.1 release.</li>
+<li><b>1000 Genomes Consortium Phase 3 Integrated SV</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/estd219/">estd219</a>):
+SVs from the 1000 Genomes Project Phase 3.</li>
+<li><b>DECIPHER Consensus CNVs</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd183/">nstd183</a>):
+Consensus common population CNVs from high-resolution control sets.</li>
+<li><b>Lee et al. 2020</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd194/">nstd194</a>).</li>
+<li><b>Abel et al. 2020</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd200/">nstd200</a>).</li>
+<li><b>Byrska-Bishop et al. 2022</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd206/">nstd206</a>):
+High-coverage whole-genome sequencing of the expanded 1000 Genomes sample set.</li>
+</ul>
 
 <p>
-For more information on the number of variant calls and latest statistics for nstd186 see
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/">Summary of nstd186</a>
-(NCBI Curated Common Structural Variants).
+For the latest nstd186 variant call counts and version history, see the
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/">nstd186
+summary page</a> at NCBI.
 </p>
 
+<h2>Subtracks</h2>
+
 <p>
-There are six subtracks in this track set:
+<b>Per-source-study subtracks</b> (variants from nstd186 attributed to one of the six component
+studies):
 </p>
+<ul>
+<li><b>dbVar Curated gnomAD SVs</b></li>
+<li><b>dbVar Curated 1000 Genomes SVs</b></li>
+<li><b>dbVar Curated DECIPHER SVs</b></li>
+<li><b>dbVar Curated Lee SVs</b></li>
+<li><b>dbVar Curated Abel SVs</b></li>
+<li><b>dbVar Curated Byrska-Bishop SVs</b></li>
+</ul>
 
 <p>
+<b>Per-population subtracks</b> (variants with AF &gt;= 0.01 aggregated across nstd186 source
+studies for each super-population):
+</p>
 <ul>
-<li><b>NCBI Curated Common SVs: African -</b> 
-<a href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186">Variants</a> with AF &gt;= 0.01 for 
-African Population.</li>
-<li><b>NCBI Curated Common SVs: European -</b>
-<a href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186">Variants</a> with AF &gt;= 0.01 for 
-European Population.</li>
-<li><b>NCBI Curated Common SVs: all populations -</b>
-<a href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186">Variants</a> with AF &gt;= 0.01 for 
-Global Population.</li>
-<li><b>NCBI Curated Common SVs: all populations from gnomAD - </b>
-<a href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186">Variants</a> with AF &gt;= 0.01 from 
-gnomAD Structural Variants.</li>
-<li><b>NCBI Curated Common SVs: all populations from 1000 Genomes - </b>
-<a href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186">Variants</a> with AF &gt;= 0.01 from 
-1000 Genomes Consortium Phase 3 Integrated SV.</li>
-<li><b>NCBI Curated Common SVs: all populations from DECIPHER -</b>
-<a href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186">Variants</a> with AF &gt;= 0.01 from 
-DECIPHER Consensus CNVs.</li>
+<li><b>dbVar Curated All Populations</b> (Global)</li>
+<li><b>dbVar Curated African SVs</b></li>
+<li><b>dbVar Curated American SVs</b></li>
+<li><b>dbVar Curated East Asian SVs</b></li>
+<li><b>dbVar Curated European SVs</b></li>
+<li><b>dbVar Curated South Asian SVs</b></li>
+<li><b>dbVar Curated Other Pop SVs</b> &#8212; samples of mixed, admixed, or
+uncategorized ancestry that do not map cleanly onto the five super-populations above.</li>
 </ul>
+
+<p>
+The NCBI <a href="hgHubConnect?hubUrl=https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/hub.txt&hgHub_do_redirect=on">dbVar
+Track Hub</a> additionally provides <em>population-only</em> variants (variants common in one
+population but not in any other): African only, American only, East Asian only, European only,
+and South Asian only. These are not loaded as native Genome Browser tracks; connect to the hub to
+view them.
 </p>
 
 <h2>Display Conventions and Configuration</h2>
-Items in all subtracks follow the same conventions: items are colored by variant type, and are 
-based on the dbVar colors described in the 
-<a target="_blank" href="https://www.ncbi.nlm.nih.gov/dbvar/content/overview/">dbVar Overview page</a>. 
-<b><font color="red">Red</font></b> for copy number loss or deletion,
-<b><font color="blue">blue</font></b> for copy number gain or duplication, and
-<b><font color="#662180">violet</font></b> for copy number variation. 
+
+<p>
+Items in all subtracks follow the same conventions. Variants are colored by type, using the dbVar
+color scheme described in the
+<a target="_blank" href="https://www.ncbi.nlm.nih.gov/dbvar/content/overview/">dbVar Overview
+page</a>:
 </p>
+<table>
+<thead><tr>
+<th style="border-bottom: 2px solid #6678B1;">Color</th>
+<th style="border-bottom: 2px solid #6678B1;">Variant Type(s)</th>
+</tr></thead>
+<tbody>
+<tr><th bgcolor="#ff0000"></th><td>copy number loss, deletion (including mobile element deletions)</td></tr>
+<tr><th bgcolor="#0000ff"></th><td>copy number gain, duplication, insertion (including mobile element insertions)</td></tr>
+<tr><th bgcolor="#4f1c73"></th><td>copy number variation</td></tr>
+</tbody>
+</table>
 
 <p>
-<b>Mouseover</b> on items indicates genes affected, size, variant type, and allele frequencies (AF). 
-All tracks can be filtered according to the <b>Variant Size</b> and <b>Variant Type</b>.
+<b>Mouseover</b> on items shows genes affected, size, variant type, allele count (AC), allele
+number (AN), allele frequency (AF), and population (in per-population subtracks).
+</p>
+
+<p>
+Subtracks can be filtered by:
+</p>
+<ul>
+<li><b>Variant Type</b></li>
+<li><b>Variant Size</b> (Under 10KB, 10KB to 100KB, 100KB to 1MB, Over 1MB)</li>
+<li><b>Frequency Range</b> (Under 0.02, 0.02 to 0.05, 0.05 to 0.1, 0.1 to 0.2, 0.2 to 0.5,
+Over 0.5)</li>
+</ul>
+
+<p>
+The <b>Hide empty subtracks</b> option on the track configuration page hides subtracks that have
+no data in the current viewing window. This is enabled by default and can be toggled off.
 </p>
 
 <h2>Data Access</h2>
+<p>
 The raw data can be explored interactively with the
 <a href="../../hgTables">Table Browser</a>, or the
 <a href="../../hgIntegrator">Data Integrator</a>. For automated analysis,
 the data may be queried from our
 <a href="../../goldenPath/help/api.html">REST API</a>.
 </p>
-<p>The data can also be found directly from the <a target=_blank 
-href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/#data_access">dbVar 
-nstd186 data access</a>, as well as in the
-<a href="hgHubConnect?hubUrl=
-https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/hub.txt&hgHub_do_redirect=on">
-dbVar Track Hub</a>, where additional subtracks are included. For questions about
-dbVar track data, please contact <A HREF="mailto:&#100;&#98;&#118;&#97;r&#64;&#110;&#99;&#98;&#105;.
-n&#108;&#109;.&#110;&#105;&#104;.&#103;&#111;v">
-&#100;&#98;&#118;&#97;r&#64;&#110;&#99;&#98;&#105;.n&#108;&#109;.&#110;&#105;&#104;.&#103;&#111;v
-</A>.
+<p>
+The data can also be found directly at the
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/#data_access">dbVar
+nstd186 data access</a> page, or in the
+<a href="hgHubConnect?hubUrl=https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/hub.txt&hgHub_do_redirect=on">dbVar
+Track Hub</a>. For questions about dbVar track data, please contact
+<A HREF="mailto:&#100;&#98;&#118;&#97;r&#64;&#110;&#99;&#98;&#105;.n&#108;&#109;.&#110;&#105;&#104;.&#103;&#111;v">&#100;&#98;&#118;&#97;r&#64;&#110;&#99;&#98;&#105;.n&#108;&#109;.&#110;&#105;&#104;.&#103;&#111;v</A>.
 <!-- above address is dbvar at ncbi.nlm.nih.gov -->
 </p>
 
-
 <h2>Credits</h2>
 <p>
-Thanks to the dbVAR team at NCBI, especially John Lopez and Timothy Hefferon for technical 
+Thanks to the dbVar team at NCBI, especially John Lopez and Timothy Hefferon for technical
 coordination and consultation, and to Christopher Lee, Anna Benet-Pages, and Daniel Schmelter, of
 the Genome Browser team for engineering the track display.
 </p>
 
 <h2>References</h2>
 <p>
 Lappalainen I, Lopez J, Skipper L, Hefferon T, Spalding JD, Garner J, Chen C, Maguire M, Corbett M,
 Zhou G <em>et al</em>.
 <a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gks1213" target="_blank">
 DbVar and DGVa: public archives for genomic structural variation</a>.
 <em>Nucleic Acids Res</em>. 2013 Jan;41(Database issue):D936-41.
-PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23193291" target="_blank">23193291</a>; PMC: <a
-href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531204/" target="_blank">PMC3531204</a>
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23193291" target="_blank">23193291</a>;
+PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531204/" target="_blank">PMC3531204</a>
 </p>
-
-