6b0d68657267f1e02c47d4224ea62446bbbb2ba0
max
Fri May 22 06:55:52 2026 -0700
small non-AI changes to the html docs pages of the long-read SV tracks

diff --git src/hg/makeDb/trackDb/human/aprSv.html src/hg/makeDb/trackDb/human/aprSv.html
index ac90c471788..107e2e2ef62 100644
--- src/hg/makeDb/trackDb/human/aprSv.html
+++ src/hg/makeDb/trackDb/human/aprSv.html
@@ -1,62 +1,62 @@
 <h2>Description</h2>
 <p>
-This track displays structural variants (SVs) — deletions, insertions, and
-complex substitutions of at least 50 bp — from the Arabic Pangenome
+This track displays structural variants (SVs), at least 50 bp long
+(deletions, insertions, and complex substitutions), from the Arabic Pangenome
 Reference (APR), a pangenome graph built from 53 UAE-resident Arab individuals drawn from eight countries (UAE, Saudi Arabia, Oman, Jordan, Egypt, Morocco, Syria, Yemen). Each bubble in the graph that contains an SV-sized alternative allele is shown as a single variant site, with allele counts aggregated across the 53 samples (the GRCh38 reference haplotype, present as an extra sample column in the source VCF, is excluded from the aggregation).</p>
 <p>
 The APR pangenome was built on the T2T-CHM13v2 reference.
 Variants are shown natively on the <b>hs1</b> browser and lifted to <b>hg38</b> using the UCSC <tt>hs1ToHg38.over.chain.gz</tt> chain; variants that do not lift cleanly (often in T2T-added euchromatic sequence) are omitted from the hg38 version of the track.</p>
-<h2>Display conventions</h2>
+<h2>Display Conventions and Configuration</h2>
 <p>Items are colored by SV type:</p>
 <ul>
  <li><span style="background-color:rgb(0,0,200);color:white;padding:1px 6px">INS</span> insertion (net ALT longer by ≥50 bp)</li>
  <li><span style="background-color:rgb(200,0,0);color:white;padding:1px 6px">DEL</span> deletion (net REF longer by ≥50 bp)</li>
  <li><span style="background-color:rgb(230,140,0);color:white;padding:1px 6px">CPX</span> complex substitution (similar-length REF and ALT but at least one ≥50 bp)</li>
  <li><span style="background-color:rgb(120,120,120);color:white;padding:1px 6px">MIXED</span> snarl whose alt alleles belong to different classes</li>
 </ul>
 <p>Each item spans from the start of REF to its end on the reference. The name field is the graph snarl ID (e.g. <tt><951452<1012008</tt>), which identifies the variant site in the APR pangenome graph.</p>
-<h2>Per-site alt-allele aggregation</h2>
+<h2>Per-site Alt-allele Aggregation</h2>
 <p>
 The source VCF is multi-allelic: a single graph snarl appears as one row with a comma-separated ALT list.
 For this track, each ALT is classified individually using the 50 bp threshold, and the row is emitted as a single bed item with:</p>
 <ul>
- <li><b>svType</b> — the common class, or <tt>MIXED</tt> if alts disagree;</li>
- <li><b>svLen</b> — reference span (chromEnd - chromStart);</li>
- <li><b>insLen</b> — maximum inserted-sequence length across passing INS alts (0 otherwise);</li>
- <li><b>AC</b> — sum of per-alt allele counts (AC) that passed;</li>
- <li><b>numAlts</b> — number of alt alleles that passed the 50 bp filter.</li>
+ <li><b>svType</b>: the common class, or <tt>MIXED</tt> if alts disagree;</li>
+ <li><b>svLen</b>: reference span (chromEnd - chromStart);</li>
+ <li><b>insLen</b>: maximum inserted-sequence length across passing INS alts (0 otherwise);</li>
+ <li><b>AC</b>: sum of per-alt allele counts (AC) that passed;</li>
+ <li><b>numAlts</b>: number of alt alleles that passed the 50 bp filter.</li>
 </ul>
 <p>Rows whose alts are all smaller than 50 bp are not shown.</p>
 <h2>Methods</h2>
 <p>
 Nassir et al. 2025 built the Arabic Pangenome Reference (APR) from 53 UAE-resident Arab individuals drawn from eight countries, sequenced with ~35x PacBio HiFi on Sequel IIe/Revio (30-h movies), ~54x Oxford Nanopore ultralong reads on R10.4.1 PromethION flow cells (96-h runs), and ~65x Hi-C (Illumina NovaSeq 6000). Haplotype-phased de novo assemblies were produced with hifiasm v0.19.5 (primary) and Verkko v1.3.1 (for comparison), with a median N50 of 124 Mb. The pangenome graph was built with Minigraph-Cactus seeded on T2T-CHM13v2 and augmented with GRCh38, and SVs were extracted by graph deconstruction. The released decomposed