dc1e0e76dbe49861bd0ebe8db64e27f587737794 max Mon Mar 30 15:40:03 2026 -0700
adding two more phased variants tracks, refs #37306

diff --git src/hg/makeDb/trackDb/human/han945SvVcf.html src/hg/makeDb/trackDb/human/han945SvVcf.html
new file mode 100644
index 00000000000..a24eed8e0a4
--- /dev/null
+++ src/hg/makeDb/trackDb/human/han945SvVcf.html
@@ -0,0 +1,64 @@
+

Description

+

+This track shows per-sample genotypes for 111,288 structural variants (SVs)
+from 945 Han Chinese individuals, displayed as a VCF track. It is a companion
+to the Han 945 SVs bigBed track, which
+shows the same variants with summary statistics and filters.
+

+

+The VCF format allows the genome browser to display a genotype matrix showing
+which of the 945 individuals carry each structural variant.
+

+ +

Display Conventions and Configuration

+

+Each variant is shown with per-sample genotypes: 0/1 indicates the sample
+carries the SV, 0/0 indicates it does not. The genotype coloring follows
+standard VCF display conventions.
+

+

+Samples are labeled Sample_001 through Sample_945, as the original data
+release does not include individual sample identifiers.
+

+ +

Methods

+

+The original VCF from Gong et al. is a site-only file (no sample columns)
+produced by merging per-sample SV calls with SURVIVOR v1.0.6. SURVIVOR
+records which samples support each SV in the INFO/SUPP_VEC field,
+a binary string of length 945, where each position represents one sample
+and '1' indicates that sample's caller reported the SV.
+
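To make the SUPP_VEC encoding concrete, here is a minimal Python sketch that pulls the vector out of an INFO string and lists the supporting samples. The function names and the example INFO string are hypothetical and are not taken from the track's build scripts; the only assumption carried over from the text is that SUPP_VEC is a string of '0' and '1' characters, one per sample.

```python
def parse_supp_vec(info: str) -> str:
    """Return the SUPP_VEC value from a VCF INFO string (hypothetical helper)."""
    for field in info.split(";"):
        key, sep, value = field.partition("=")
        if key == "SUPP_VEC" and sep:
            return value
    raise ValueError("INFO field has no SUPP_VEC entry")


def supporting_samples(supp_vec: str) -> list[int]:
    """Return 0-based indices of the samples whose position holds '1'."""
    return [i for i, bit in enumerate(supp_vec) if bit == "1"]


if __name__ == "__main__":
    # Toy INFO string with 5 samples instead of 945, for illustration only.
    info = "SUPP=2;SUPP_VEC=01010;SVTYPE=DEL"
    vec = parse_supp_vec(info)
    print(vec, supporting_samples(vec))
```

In the real file each vector would have 945 positions, so checking the length against the expected sample count is a cheap sanity test before any further processing.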

+

+To reconstruct per-sample genotypes, the SUPP_VEC was expanded into 945
+sample columns with a GT (genotype) FORMAT field. Samples with a '1' in
+SUPP_VEC were assigned genotype 0/1 (heterozygous carrier); samples with
+'0' were assigned 0/0 (homozygous reference). This is a simplification:
+the original per-sample callers may have reported homozygous alternate (1/1)
+genotypes for some individuals, but this information is not preserved in the
+SURVIVOR merge. The conversion was performed with the script
+lrSvHan945SuppVecToVcf.py.
+
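The expansion described above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the contents of lrSvHan945SuppVecToVcf.py: the function names are hypothetical, the input is assumed to be a tab-separated site-only VCF record with exactly eight columns, and SUPP_VEC is assumed to appear as a `SUPP_VEC=...` entry in INFO.

```python
def sample_names(n_samples: int = 945) -> list[str]:
    """Placeholder sample labels Sample_001 .. Sample_945, as used by the track."""
    return [f"Sample_{i:03d}" for i in range(1, n_samples + 1)]


def expand_record(line: str, n_samples: int = 945) -> str:
    """Append a GT FORMAT column and one genotype per sample to a site-only record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 8:
        raise ValueError("expected a site-only VCF record with 8 columns")
    # Assumption: INFO (column 8) carries SUPP_VEC as a key=value entry.
    info = dict(f.partition("=")[::2] for f in fields[7].split(";"))
    supp_vec = info.get("SUPP_VEC", "")
    if len(supp_vec) != n_samples:
        raise ValueError("SUPP_VEC length does not match the sample count")
    # '1' -> 0/1 (carrier), '0' -> 0/0; zygosity is not recoverable from SUPP_VEC.
    genotypes = ["0/1" if bit == "1" else "0/0" for bit in supp_vec]
    return "\t".join(fields + ["GT"] + genotypes)


if __name__ == "__main__":
    # Toy record with 3 samples instead of 945.
    record = "chr1\t100\tsv1\tN\t<DEL>\t.\tPASS\tSUPP=1;SUPP_VEC=010"
    print("\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
                     "INFO", "FORMAT"] + sample_names(3)))
    print(expand_record(record, n_samples=3))
```

A complete conversion would also need to add a `##FORMAT=<ID=GT,...>` line to the VCF header and pass the remaining header lines through unchanged; that part is omitted here.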

+ +

Data Access

+

+The source VCF was downloaded from the OMIX repository (accession OED00945268)
+at the National Genomics Data Center (NGDC).
+

+ +

Credits

+

+Thanks to Gong et al. for making their structural variant calls publicly available.
+

+ +

References

+ +

+Gong J, Sun H, Wang K, Zhao Y, Huang Y, Chen Q, Qiao H, Gao Y, Zhao J, Ling Y et al.
+
+Long-read sequencing of 945 Han individuals identifies structural variants associated with
+phenotypic diversity and disease susceptibility.
+Nat Commun. 2025 Feb 10;16(1):1494.
+PMID: 39929826; PMC: PMC11811171
+