b85c12cf9af0ee1a954b8cced961bcbd909b7979 lrnassar Wed Apr 29 12:04:36 2026 -0700
Expand dbVar tracks to expose all six nstd186 source studies and add new Somatic and Other composites. refs #37406

Restructure the dbVar supertrack:
- Renamed from "dbVar Common Struct Var" to "dbVar Struct Var".
- dbVar Common SV: added subtracks for Lee, Abel, and Byrska-Bishop (the three nstd186 source
  studies that were missing from our Curated Common track), and for the
  American/East Asian/South Asian/Other populations.
- dbVar Conflict SV: description page refreshed; subtrack longLabel clarified.
- dbVar Somatic SV (new): single subtrack pulling somatic_sv.bb from the dbVar hub. Default hidden.
- dbVar Other SV (new): residual bucket for dbVar SVs not classified as common, somatic, or
  clinical, split into Healthy and Phenotype subtracks. Default hidden. NCBI sometimes calls this
  "presumed normal"; the description page notes the equivalence. mergeSpannedItems on for the dense
  subtracks (normal_healthy ~5.6M items, normal_phenotype ~410K, somatic_sv ~67K).
- ClinVar SVs are not duplicated; description pages cross-link to the existing ClinVar track
  instead.

Description pages: rewrite dbVarCommon.html and dbVarCurated.html, refresh dbVarConflict.html, add
dbVarSomatic.html and dbVarOther.html. Retire the unused dbVar_common.html. Methods links now point
at NCBI's dbVar Overview rather than the FTP directory listing.

searchTable termRegex widened to ^[den]ssv[0-9]+ so dssv* accessions in normal_healthy resolve.

Otto: stage downloads to release/\${db}.new/, validate per file (size floor and 10% itemCount delta
vs the current live copy), then atomically swap via directory rename with a one-cycle .prev
rollback. On validation failure, leave .new/ in place for human inspection and exit non-zero so the
wrapper emails. On no-op runs the wrapper now stays silent. checkNstd175.sh's "update done" message
moved inside the update branch so silence is honoured.
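
For illustration only, the per-file validation described above (a size floor plus a 10% itemCount delta against the live copy) might look roughly like the following Python sketch. The actual otto script is not shown in this commit excerpt, so every name here (`FileStats`, `validate`, `SIZE_FLOOR_BYTES`, `MAX_ITEM_DELTA`) and the floor value are assumptions, not the real implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical thresholds: the commit message states only "size floor" and
# "10% itemCount delta"; the concrete floor value below is an assumption.
SIZE_FLOOR_BYTES = 1024
MAX_ITEM_DELTA = 0.10


@dataclass
class FileStats:
    """Size and item count for one staged or live file (illustrative type)."""
    size_bytes: int
    item_count: int


def validate(new: FileStats, live: Optional[FileStats]) -> List[str]:
    """Return the list of failure reasons; an empty list means the file passes."""
    problems = []
    # Size floor: catches truncated or empty downloads.
    if new.size_bytes < SIZE_FLOOR_BYTES:
        problems.append(f"size {new.size_bytes} below floor {SIZE_FLOOR_BYTES}")
    # Item-count delta: only checked when a live copy exists to compare against.
    if live is not None and live.item_count > 0:
        delta = abs(new.item_count - live.item_count) / live.item_count
        if delta > MAX_ITEM_DELTA:
            problems.append(f"itemCount changed by {delta:.1%}")
    return problems
```

A caller would presumably run this for each staged file and perform the directory swap only when every list comes back empty; the staging, rename, and email steps are outside this sketch.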
New-file detection (via a knownFiles.txt manifest) emails when NCBI adds a file we don't yet
expose. knownFiles.txt itself lives only at the deployment path under
/hive/data/outside/otto/dbVar/, not in the tree.

diff --git src/hg/makeDb/trackDb/human/dbVarCurated.html src/hg/makeDb/trackDb/human/dbVarCurated.html
index c34af448450..a90e666c4fb 100644
--- src/hg/makeDb/trackDb/human/dbVarCurated.html
+++ src/hg/makeDb/trackDb/human/dbVarCurated.html
@@ -1,99 +1,130 @@
 <h2>Description</h2>
 <p>
-The tracks listed here contain data from the
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/content/var_summary/#nstd186">
-nstd186 (NCBI Curated Common Structural Variants)</a> study. This is a collection of structural
-variants (SV) originally submitted to dbVar which are part of a study with at least 100 samples and
-have an allele frequency of >=0.01 in at least one population. The complete dataset is imported
-from these common-population studies:
+This super-track groups structural variant (SV) tracks from
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/">dbVar</a>, NCBI's archive of human
+genomic structural variation. The data are mirrored from the
+<a target=_blank href="https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/">NCBI dbVar track
+hub</a>.
 </p>
 <p>
-<b>gnomAD Structural Variants</b>
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd166/">(nstd166)</a>:
- Catalog of SVs detected from the sequencing of the complete genome of 10,847 unrelated
-individuals from the GnomAD v2.1 release.</p>
-<p>
-<b>1000 Genomes Consortium Phase 3 Integrated SV</b>
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/estd219/">(estd219)</a>:
- Structural variants of the 1000 Genomes project Phase 3 as reported in a separate article
-specifically dedicated to the analysis of SVs. Many of these data are identical to those reported
-in the <a target=_blank
-href="https://www.ncbi.nlm.nih.gov/dbvar/studies/estd214/">estd214</a> study.</p>
+There are four track collections in this super-track:
+</p>
+<ul>
+<li><a href="/cgi-bin/hgTrackUi?g=dbVar_common"><b>NCBI dbVar Curated Common Structural Variants
+(dbVar Common SV)</b></a>: Copy-number and other variants from the
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186</a> study
+(NCBI Curated Common Structural Variants), split into subtracks by source study and by population.</li>
+<li><a href="/cgi-bin/hgTrackUi?g=dbVar_conflict"><b>NCBI dbVar Curated Conflict Variants
+(dbVar Conflict SV)</b></a>: Variants from nstd186 that overlap clinical variants in
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd102/">nstd102</a>
+(Clinical Structural Variants).</li>
+<li><a href="/cgi-bin/hgTrackUi?g=dbVar_somatic"><b>NCBI dbVar Somatic Structural Variants
+(dbVar Somatic SV)</b></a>: SVs with somatic origin, aggregated across six dbVar studies
+including COSMIC.</li>
+<li><a href="/cgi-bin/hgTrackUi?g=dbVar_other"><b>NCBI dbVar Other Structural Variants
+(dbVar Other SV)</b></a>: SVs in dbVar with no reported phenotype, or with phenotype but not
+clinical/somatic, excluding variants already present in the Common and Somatic tracks or in ClinVar.
+NCBI sometimes refers to this category as <em>presumed normal</em> SVs.</li>
+</ul>
+
 <p>
-<b>DECIPHER Common CNVs</b>
-<a href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd183/">(nstd183)</a>:
-Consensus set of common population CNVs selected from high-resolution controls sets where frequency
-information is available.
+Clinical structural variants from dbVar study nstd102 are not duplicated here; they are available
+in our dedicated <a href="/cgi-bin/hgTrackUi?g=clinvar">ClinVar</a> track (subtrack
+<em>ClinVar CNVs</em>), which pulls from the same underlying ClinVar XML release.
 </p>
+<h2>Source Studies in nstd186 (Common SV)</h2>
 <p>
-There are two tracks in this collection:
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186</a> is a
+curated collection of SVs from studies with at least 100 samples and allele frequency >= 0.01
+in at least one population. It aggregates data from six source studies:
+</p>
 <ul>
-<li><a href="/cgi-bin/hgTrackUi?g=dbVar_common">
-<b>NCBI dbVar Curated Common Structural Variants (dbVar Common SV)</b></a>: Shows copy number
-variants calls (variants >=50 nucleotides) from the <a target=_blank
-href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186</a> study.</li>
-<li><a href="/cgi-bin/hgTrackUi?g=dbVar_conflict">
-<b>NCBI dbVar Curated Conflict Variants (dbVar Conflict SV)</b></a>: Shows copy number
-variants from <a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd186/">nstd186</a>
-(NCBI Curated Common Structural Variants) that overlap with
-<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd102/">nstd102</a> (Clinical
-Structural Variants).</li>
+<li><b>1000 Genomes Consortium Phase 3 Integrated SV</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/estd219/">estd219</a>),
+added 2016</li>
+<li><b>gnomAD Structural Variants</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd166/">nstd166</a>),
+added 2019 — SVs from the sequencing of 10,847 unrelated individuals (gnomAD v2.1)</li>
+<li><b>DECIPHER Consensus CNVs</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd183/">nstd183</a>),
+added 2020</li>
+<li><b>Lee et al. 2020</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd194/">nstd194</a>),
+added 2021</li>
+<li><b>Abel et al. 2020</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd200/">nstd200</a>),
+added 2021</li>
+<li><b>Byrska-Bishop et al. 2022</b>
+(<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd206/">nstd206</a>),
+added 2022 — high-coverage WGS of the expanded 1000 Genomes sample set</li>
 </ul>
+
+<p>
+Variants must be of a qualifying structural variant type (deletions, duplications, insertions,
+copy number variants, and mobile element variants). For the latest statistics and version
+history, see the
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/">nstd186 summary
+page</a> at NCBI.
 </p>
 <h2>Display Conventions</h2>
 <p>
-These tracks are multi-view composite tracks that contain multiple data types (views). Each view
-within a track has separate display controls, as described
-<a href="../../goldenPath/help/multiView.html">here</a>. Some dbVar tracks
-contain multiple subtracks, corresponding to subsets of data. If a track contains many subtracks,
-only some subtracks will be displayed by default. The user can select which subtracks are displayed
-via the display controls on the track details page.
+These tracks are composite tracks that contain multiple subtracks. Each subtrack has its own
+display controls, as described <a href="../../goldenPath/help/multiView.html">here</a>. Items are
+colored by variant type using the dbVar color scheme
+(<a target="_blank" href="https://www.ncbi.nlm.nih.gov/dbvar/content/overview/">dbVar Overview</a>):
+</p>
+<table>
+<thead><tr>
+<th style="border-bottom: 2px solid #6678B1;">Color</th>
+<th style="border-bottom: 2px solid #6678B1;">Variant Type(s)</th>
+</tr></thead>
+<tbody>
+<tr><th bgcolor="#ff0000"></th><td>deletion, copy number loss</td></tr>
+<tr><th bgcolor="#0000ff"></th><td>duplication, copy number gain, insertion</td></tr>
+<tr><th bgcolor="#4f1c73"></th><td>copy number variation</td></tr>
+</tbody>
+</table>
+<p>
+Some composites display additional colors for less common variant types. Refer to each composite
+track's description page for the full legend.
 </p>
 <h2>Data Access</h2>
 <p>
 The raw data can be explored interactively with the <a href="hgTables">Table Browser</a>, or
 the <a href="hgIntegrator">Data Integrator</a>. For automated analysis, the data may be queried
 from our <a href="../../goldenPath/help/api.html">REST API</a>.
-<p>The data can also be found directly from the <a target=_blank
-href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/#data_access">dbVar
-nstd186 data access</a>, as well as in the
-<a href="hgHubConnect?hubUrl=
-https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/hub.txt&hgHub_do_redirect=on">
-dbVar Track Hub</a>, where additional subtracks are included. For questions about
-dbVar track data, please contact <A HREF="mailto:dbvar@ncbi.
-nlm.
-nih.
-gov">
-dbvar@ncbi.
-nlm.
-nih.
-gov</A>.
-<!-- above address is dbvar at ncbi.nlm.nih.gov -->
 </p>
-
+<p>
+The data can also be found directly at the
+<a target=_blank href="https://www.ncbi.nlm.nih.gov/dbvar/content/common_summary/#data_access">dbVar
+nstd186 data access</a> page, or in the
+<a href="hgHubConnect?hubUrl=https://ftp.ncbi.nlm.nih.gov/pub/dbVar/sandbox/dbvarhub/hub.txt&hgHub_do_redirect=on">dbVar
+Track Hub</a>, where additional subtracks (e.g., population-exclusive variants, ClinVar SVs) are
+available. For questions about dbVar track data, please contact
+<A HREF="mailto:dbvar@ncbi.nlm.nih.gov">dbvar@ncbi.nlm.nih.gov</A>.
+<!-- above address is dbvar at ncbi.nlm.nih.gov -->
 </p>
 <h2>Credits</h2>
 <p>
-Thanks to the dbVAR team at NCBI, especially John Lopez and Timothy Hefferon for technical
+Thanks to the dbVar team at NCBI, especially John Lopez and Timothy Hefferon for technical
 coordination and consultation, and to Christopher Lee, Anna Benet-Pages, and Daniel Schmelter of
-the Genome Browser team for engineering the track display.</p>
+the Genome Browser team for engineering the track display.
+</p>
 <h2>References</h2>
-
 <p>
 Lappalainen I, Lopez J, Skipper L, Hefferon T, Spalding JD, Garner J, Chen C, Maguire M, Corbett M,
 Zhou G <em>et al</em>.
 <a href="https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gks1213" target="_blank">
 DbVar and DGVa: public archives for genomic structural variation</a>.
 <em>Nucleic Acids Res</em>. 2013 Jan;41(Database issue):D936-41.
-PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23193291" target="_blank">23193291</a>; PMC: <a
-href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531204/" target="_blank">PMC3531204</a>
+PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/23193291" target="_blank">23193291</a>;
+PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531204/" target="_blank">PMC3531204</a>
 </p>
-