6b0d68657267f1e02c47d4224ea62446bbbb2ba0 max Fri May 22 06:55:52 2026 -0700
small non-AI changes to the html docs pages of the long-read SV tracks

diff --git src/hg/makeDb/trackDb/human/dbVarNr.html src/hg/makeDb/trackDb/human/dbVarNr.html
new file mode 100644
index 00000000000..7e45efdd262
--- /dev/null
+++ src/hg/makeDb/trackDb/human/dbVarNr.html
@@ -0,0 +1,175 @@
+
+This track shows the full non-redundant (NR) structural variant catalog +curated by NCBI +dbVar: deletions, duplications, and insertions aggregated across more than +150 studies (e.g. 1000 Genomes Phase 3, Simons Genome Diversity Project, +ClinGen, ClinVar) into a single consolidated set per variant type. In the +source release, each type (DEL, DUP, INS) is distributed separately; for this +track all three are merged into one bigBed so they can be filtered and +browsed together. As of the current build there are ~4.6 million records +(2.3M deletions, 0.6M duplications, 1.7M insertions).
+ ++Each record represents a unique genomic placement. When multiple +submitted structural variants (ssv/nsv) have the same coordinates on the +reference, dbVar collapses them into one NR record and the record's +variantCount field counts how many were merged. Only exact +coordinate matches are collapsed; partial overlaps keep separate rows.
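The exact-match collapsing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not dbVar's actual pipeline: the input tuple layout, the function name `collapse_exact`, and the placeholder IDs are hypothetical. Grouping is done within a variant type because the NR sets are built per type.

```python
from collections import defaultdict


def collapse_exact(calls):
    """Collapse calls that share identical (chrom, start, end, sv_type).

    `calls` is an iterable of (chrom, start, end, sv_type, variant_id)
    tuples. This layout is hypothetical, not the dbVar file format.
    """
    groups = defaultdict(list)
    for chrom, start, end, sv_type, variant_id in calls:
        groups[(chrom, start, end, sv_type)].append(variant_id)
    # One output record per unique placement; variantCount is the group size.
    return [
        {"chrom": c, "start": s, "end": e, "type": t,
         "variants": ids, "variantCount": len(ids)}
        for (c, s, e, t), ids in groups.items()
    ]


# Placeholder IDs for illustration only.
calls = [
    ("chr1", 1000, 2000, "DEL", "nssv1"),
    ("chr1", 1000, 2000, "DEL", "nssv2"),  # same placement: merged
    ("chr1", 1000, 2001, "DEL", "nssv3"),  # partial overlap: kept separate
]
records = collapse_exact(calls)
```

In this sketch the first two calls become one record with a `variantCount` of 2, while the third, which differs by a single base at its end coordinate, keeps its own row.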
+ ++dbVar ships three overlapping, clinically oriented subsets of each NR +catalog, and each record here is tagged with its memberships via the +subsets field:
++Most NR records are neither common nor curated as pathogenic/somatic; +their subsets field is empty. A record can belong to multiple +subsets simultaneously (e.g. both common and pathogenic) +when different studies contribute different calls at the same +placement.
+ +Each record carries two numeric length fields:
++On top of svLen, dbVar also pre-bins each record into one of +three reference-span buckets stored in the binSize column:
++Use the numeric svLen filter for arbitrary length cutoffs and +the categorical binSize filter for the standard buckets. The +bed score is derived from binSize +(small = 100, medium = 500, large = 1000) so dense-mode shading +emphasises larger events.
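As a rough illustration of the two filtering styles and the score mapping stated above, here is a minimal Python sketch. The score values come from the description; the assumption that `binSize` holds the literal labels `small`, `medium`, and `large`, the dict-based record layout, and both function names are hypothetical.

```python
# Score mapping as stated in the track description.
BIN_SIZE_SCORE = {"small": 100, "medium": 500, "large": 1000}


def score_for_bin(bin_size):
    """Return the BED score for a binSize label (labels are assumed)."""
    return BIN_SIZE_SCORE[bin_size]


def filter_by_sv_len(records, min_len=0, max_len=None):
    """Keep records whose svLen lies in [min_len, max_len].

    `records` is a list of dicts with an integer "svLen" key; this
    layout is hypothetical and used only for the example.
    """
    return [
        r for r in records
        if r["svLen"] >= min_len and (max_len is None or r["svLen"] <= max_len)
    ]


example = [{"svLen": 80}, {"svLen": 5000}, {"svLen": 250000}]
kept = filter_by_sv_len(example, min_len=1000, max_len=100000)
```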
+ +Items are colored by SV type:
+The item label is the first dbVar variant ID for the record (an +nssv*, nsv*, or essv* accession). When a +placement merges multiple IDs, the full list is stored in the +variants field on the details page and linked to the dbVar +variant page. Similarly, when an NR record aggregates calls from +multiple studies/methods/platforms, those columns are +semicolon-separated lists.
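When working with exported rows, the semicolon-separated columns can be split into lists. The helper below is a minimal sketch; the function name and the sample value are hypothetical, and it assumes items contain no embedded semicolons.

```python
def split_list_field(value):
    """Split a semicolon-separated field into a list of trimmed items.

    Empty items (for example from a trailing semicolon) are dropped.
    """
    return [item.strip() for item in value.split(";") if item.strip()]


# Placeholder IDs for illustration only.
ids = split_list_field("nssv1;nssv2; nssv3")
```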
+ +The track configuration page exposes these filters:
++The data can be explored interactively in table format with the +Table Browser or the +Data Integrator, and accessed +programmatically through our API +using the track name dbVarNr.
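A programmatic query could look roughly like the following Python sketch, which builds a request URL for the UCSC REST API `getData/track` endpoint. The `hg38` assembly is taken from the download path given on this page; the region coordinates and function names are illustrative assumptions, and the layout of the returned JSON is not shown here and should be checked against the API documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_url(chrom, start, end, genome="hg38", track="dbVarNr"):
    """Build a getData/track URL for one genomic region."""
    params = {"genome": genome, "track": track,
              "chrom": chrom, "start": start, "end": end}
    return f"{API_BASE}?{urlencode(params)}"


def fetch_region(chrom, start, end):
    """Fetch and parse the JSON response (requires network access)."""
    with urlopen(build_track_url(chrom, start, end)) as response:
        return json.load(response)


# Illustrative region; no request is sent until fetch_region() is called.
url = build_track_url("chr1", 1000000, 1100000)
```

Keeping the region small is advisable, since the track holds several million records.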
+ +The bigBed is available from our download server at + +hgdownload.soe.ucsc.edu/gbdb/hg38/bbi/dbVar/nr.bb. The upstream source +TSV / BED / BEDPE files (released monthly) are available from the + +NCBI dbVar GitHub repository and the + +dbVar FTP site.
+ ++Thanks to the NCBI dbVar team for curating, merging, and releasing +the non-redundant structural-variant datasets on a monthly cadence.
+ ++Lappalainen I, Lopez J, Skipper L, Hefferon T, Spalding JD, Garner J, +Chen C, Maguire M, Corbett M, Zhou G, Paschall J, Ananiev V, Flicek P, +Church DM. + +dbVar and DGVa: public archives for genomic structural variation. +Nucleic Acids Res. 2013 Jan;41(Database issue):D936-D941. +PMID: 23193291
+ +NCBI dbVar: Human Non-Redundant Reference Datasets to Help +Interpret Structural Variants. NCBI Insights, 27 Sep 2018. + +ncbiinsights.ncbi.nlm.nih.gov.
+ +Phan L, Jin Y, Zhang H, Qiang W, Shekhtman E, Shao D, et al. +ALFA: Allele Frequency Aggregator. In: + +NCBI Handbook.