6b0d68657267f1e02c47d4224ea62446bbbb2ba0 max Fri May 22 06:55:52 2026 -0700
small non-AI changes to the html docs pages of the long-read SV tracks

diff --git src/hg/makeDb/trackDb/human/dbVarNr.html src/hg/makeDb/trackDb/human/dbVarNr.html
new file mode 100644
index 00000000000..7e45efdd262
--- /dev/null
+++ src/hg/makeDb/trackDb/human/dbVarNr.html
@@ -0,0 +1,175 @@
+

Description

+ +

+This track shows the full non-redundant (NR) structural variant catalog +curated by NCBI +dbVar: deletions, duplications, and insertions aggregated across more than +150 studies (e.g. 1000 Genomes Phase 3, Simons Genome Diversity Project, +ClinGen, ClinVar) into a single consolidated set per variant type. In the +source release, each type (DEL, DUP, INS) is distributed separately; for this +track all three are merged into one bigBed so they can be filtered and +browsed together. As of the current build there are ~4.6 million records +(2.3M deletions, 0.6M duplications, 1.7M insertions).

+ +

+Each record represents a unique genomic placement. When multiple +submitted structural variants (ssv/nsv) have the same coordinates on the +reference, dbVar collapses them into one NR record, and the record's +variantCount field counts how many were merged. Only exact +coordinate matches are collapsed; partially overlapping variants remain +separate records.
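The collapsing rule above can be illustrated with a minimal Python sketch. It is not dbVar's actual pipeline: the tuple layout, the helper name `collapse_exact`, and the example accessions are hypothetical, and keying on the SV type in addition to the coordinates is an assumption based on each type being distributed as its own catalog.

```python
from collections import defaultdict


def collapse_exact(calls):
    """Collapse calls that share identical coordinates and SV type.

    `calls` is an iterable of (chrom, start, end, sv_type, variant_id)
    tuples. This layout is hypothetical, chosen only for illustration.
    """
    groups = defaultdict(list)
    for chrom, start, end, sv_type, variant_id in calls:
        # Only an exact match on every key component merges two calls.
        groups[(chrom, start, end, sv_type)].append(variant_id)
    return [
        {
            "chrom": chrom,
            "start": start,
            "end": end,
            "type": sv_type,
            "variants": ids,
            "variantCount": len(ids),
        }
        for (chrom, start, end, sv_type), ids in sorted(groups.items())
    ]


# Made-up calls: the first two share coordinates, the third only overlaps.
calls = [
    ("chr1", 1000, 2000, "DEL", "nssv1"),
    ("chr1", 1000, 2000, "DEL", "nssv2"),
    ("chr1", 1500, 2500, "DEL", "nssv3"),
]
records = collapse_exact(calls)
```

Here the two identical placements become one record with a `variantCount` of 2, while the partially overlapping call stays a separate record.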

+ +

What merges into each type

+ + + +

Subsets

+ +

+dbVar ships three overlapping, clinically oriented subsets of each NR +catalog, and each record here is tagged with its memberships via the +subsets field:

+ +

+Most NR records are neither common nor curated as pathogenic/somatic; +their subsets field is empty. A record can belong to multiple +subsets simultaneously (e.g. both common and pathogenic) +when different studies contribute different calls at the same +placement.

+ +

Length fields and bin sizes

+ +

Each record carries two numeric length fields:

+ + +

+On top of svLen, dbVar also pre-bins each record into one of +three reference-span buckets stored in the binSize column:

+ +

+Use the numeric svLen filter for arbitrary length cutoffs and +the categorical binSize filter for the standard buckets. The +bed score is derived from binSize +(small = 100, medium = 500, large = 1000) so dense-mode shading +emphasises larger events.
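As a rough sketch of how the two length filters and the score relate, the Python below maps a bin label to the scores quoted above and applies a numeric cutoff. The exact strings stored in `binSize`, the dictionary layout of a record, and the helper names `bed_score` and `keep` are assumptions made for illustration; the example records are invented.

```python
# Scores as stated in the track description. The exact strings stored in
# binSize are an assumption; lower-case labels are used here.
BIN_SCORE = {"small": 100, "medium": 500, "large": 1000}


def bed_score(bin_size):
    """Return the BED score associated with a binSize label."""
    return BIN_SCORE[bin_size.strip().lower()]


def keep(record, min_sv_len=0, bins=None):
    """Apply a numeric svLen cutoff and an optional categorical bin filter."""
    if record["svLen"] < min_sv_len:
        return False
    return bins is None or record["binSize"].strip().lower() in bins


# Invented records, for illustration only.
records = [
    {"name": "exampleA", "svLen": 120, "binSize": "small"},
    {"name": "exampleB", "svLen": 75000, "binSize": "large"},
]
kept = [r["name"] for r in records if keep(r, min_sv_len=1000)]
scores = [bed_score(r["binSize"]) for r in records]
```

The numeric cutoff accepts any threshold, whereas the categorical filter can only select whole buckets.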

+ +

Display conventions

+ +

Items are colored by SV type:

+ + +

The item label is the first dbVar variant ID for the record (an +nssv*, nsv*, or essv* accession). When a +placement merges multiple IDs, the full list is stored in the +variants field on the details page and linked to the dbVar +variant page. Similarly, when an NR record aggregates calls from +multiple studies/methods/platforms, those columns are +semicolon-separated lists.
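When working with exported rows, the semicolon-separated columns can be split into Python lists as sketched below. The helper name `split_list`, the dictionary keys, and the example values are hypothetical; only the semicolon delimiter comes from the description above.

```python
def split_list(field):
    """Split a semicolon-separated column into a list of trimmed values."""
    return [item.strip() for item in field.split(";") if item.strip()]


# Invented row; the key names are assumptions, not the track's schema.
row = {"studies": "studyA;studyB", "methods": "Sequencing; Microarray"}
studies = split_list(row["studies"])
methods = split_list(row["methods"])
```

An empty column yields an empty list rather than a list containing one empty string.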

+ +

Filters

+ +

The track configuration page exposes these filters:

+ + +

Data Access

+ +

+The data can be explored interactively in table format with the +Table Browser or the +Data Integrator, and accessed +programmatically through our API, using the track name +dbVarNr.
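A minimal sketch of a programmatic query follows, assuming the UCSC REST API's `getData/track` endpoint at `api.genome.ucsc.edu` and the `hg38` assembly (inferred from the download path of the bigBed). The helper names, the region, and the choice to return the parsed JSON untouched are illustrative; the layout of the response is not described here, so the sketch does not index into it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the UCSC REST API.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_url(chrom, start, end, genome="hg38", track="dbVarNr"):
    """Build a getData/track URL for one genomic region."""
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return API_BASE + "?" + urlencode(params)


def fetch_region(chrom, start, end):
    """Download and parse the JSON reply (requires network access)."""
    with urlopen(build_url(chrom, start, end)) as response:
        return json.load(response)


url = build_url("chr1", 1000000, 2000000)
```

Calling `fetch_region("chr1", 1000000, 2000000)` would perform the request. Keeping queries to small regions seems prudent given the size of the track.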

+ +

The bigBed is available from our download server at + +hgdownload.soe.ucsc.edu/gbdb/hg38/bbi/dbVar/nr.bb. The upstream source +TSV / BED / BEDPE files (released monthly) are available from the + +NCBI dbVar GitHub repository and the + +dbVar FTP site.

+ +

Credits

+ +

+Thanks to the NCBI dbVar team for curating, merging, and releasing +the non-redundant structural-variant datasets on a monthly cadence.

+ +

References

+ +

+Lappalainen I, Lopez J, Skipper L, Hefferon T, Spalding JD, Garner J, +Chen C, Maguire M, Corbett M, Zhou G, Paschall J, Ananiev V, Flicek P, +Church DM. + +dbVar and DGVa: public archives for genomic structural variation. +Nucleic Acids Res. 2013 Jan;41(Database issue):D936-D941. +PMID: 23193291

+ +

NCBI dbVar: Human Non-Redundant Reference Datasets to Help +Interpret Structural Variants. NCBI Insights, 27 Sep 2018. + +ncbiinsights.ncbi.nlm.nih.gov.

+ +

Phan L, Jin Y, Zhang H, Qiang W, Shekhtman E, Shao D, et al. +ALFA: Allele Frequency Aggregator. In: + +NCBI Handbook.