ef61e73fc416622d8557ec2439df2344a1cc80c3 max Tue Jun 9 15:10:01 2026 -0700
lrSv: replace HPRC v2.0 pangenome SV track with v2.1 (hprc2v21Sv)

Drop the v2.0 wave-decomposed hprc2Sv track and add hprc2v21Sv built from
the HPRC v2.1 minigraph-cactus raw vg deconstruct VCFs (gref95.ro), on both
hg38 (GRCh38 path, 596,063 SVs) and hs1 (T2T-CHM13 path, 608,435 SVs). The
v2.1 files lack per-allele TYPE/LEN, so the new converter classifies INS/DEL
by parsimony-trimming REF/ALT and the net length change. The v2.0 build
recipe, converter and schema are kept but commented out in the makeDocs in
case wave-decomposed VCFs are released again, refs #36258

diff --git src/hg/makeDb/trackDb/human/hprc2Sv.html src/hg/makeDb/trackDb/human/hprc2Sv.html
deleted file mode 100644
index 11a002d914b..00000000000
--- src/hg/makeDb/trackDb/human/hprc2Sv.html
+++ /dev/null
@@ -1,118 +0,0 @@
-

Description

-

-This track shows structural variants (SVs) derived from the Human Pangenome
-Reference Consortium (HPRC) release-2 pangenome graph. The graph was built
-with minigraph-cactus from PacBio HiFi haplotype-resolved assemblies of 233
-samples (including T2T-CHM13 and the diverse 1000 Genomes Project sample
-set). HPRC releases one VCF per reference path (GRCh38 and T2T-CHM13);
-we display both natively on the corresponding UCSC assembly (hg38 and hs1).
-Variants were extracted from the graph with vg deconstruct and
-decomposed into atomic alleles with vcfwave (WFA2-lib).
-

-

-The hg38 track contains 1,483,114 SV-sized alleles (length ≥ 50 bp) split
-by type: 1,106,190 insertions, 192,597 deletions, 178,178 complex alleles
-and 6,149 inversions. The hs1 track is built from the parallel T2T-CHM13
-wave VCF. Each row carries the allele count, allele frequency, number of
-samples with data and the snarl-nesting level of the variant in the
-pangenome decomposition tree.
-

- -

Display Conventions and Configuration

-

-Items are colored by SV type:
-

-

-

-Insertions are placed at the insertion site with a width of 1 bp; deletions,
-complex alleles and inversions span the affected reference interval.
-Filters are available for SV type, SV length, allele frequency and snarl
-level (0 = top-level bubble; higher values are nested within parent
-bubbles).
-

- -

Methods

-

-HPRC release-2 is an open data release (not yet accompanied by a formal
-peer-reviewed publication) built from PacBio HiFi haplotype-resolved
-assemblies of 233 samples, including T2T-CHM13 and a diverse 1000 Genomes
-Project panel. The pangenome graph was built with Minigraph-Cactus against
-both GRCh38 and T2T-CHM13 reference paths; variants were extracted from
-the graph with vg deconstruct and then decomposed into atomic
-alleles with vcfwave / WFA2-lib, yielding per-allele TYPE and LEN
-fields. For this track, each ALT in the wave VCF was emitted as its own
-BED row, retaining alleles with |LEN| ≥ 50 bp or the INV flag;
-allele counts, frequencies, sample counts and snarl levels are taken
-directly from the per-allele INFO fields. On hg38 this yields 1,483,114
-SV-sized alleles (1,106,190 insertions, 192,597 deletions, 178,178 complex
-alleles and 6,149 inversions); the hs1 track is built from the parallel
-T2T-CHM13 wave VCF. Sample-list and assembly provenance for the graph are
-maintained at HPRC in
-
-hprc_intermediate_assembly/alignments_v2.0.csv.
-

-

-The HPRC v2.0 Minigraph-Cactus graph and wave-decomposed VCFs were
-downloaded from the HPRC S3 release bucket:
-
-hprc-v2.0-mc-grch38.wave.vcf.gz (hg38) and
-
-hprc-v2.0-mc-chm13.wave.vcf.gz (hs1).
-

-

-The step-by-step build commands (download, format conversion, bigBed build)
-are recorded in the UCSC makeDoc for this track container:
-
-doc/hg38/lrSv.txt and
-
-doc/hs1/lrSv.txt. The conversion scripts and autoSql schemas live in
-
-makeDb/scripts/lrSv.
-

- -

Data Access

-

-The data can be explored interactively in table format with the
-Table Browser or the
-Data Integrator, and accessed
-programmatically through our API,
-track=hprc2Sv.
-

-

-The bigBed is available from our download server for both assemblies:
-

-Example: bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/hprc2.bb -chrom=chr21 -start=0 -end=100000000 stdout. -

-

-The original pangenome graph and the wave-decomposed VCF are available
-from the HPRC public S3 bucket, as linked from the
-HPRC
-release-2 announcement.
-

- -

Credits

-

-Thanks to the Human Pangenome Reference Consortium for building and
-publicly releasing the release-2 minigraph-cactus pangenome.
-

- -

References

-

-HPRC release-2 data is not yet described in a formal peer-reviewed
-publication. See the Human Pangenome Project release announcement
-for background and data-access details:
-
-HPRC data release 2.
-