989d891c0c0a500e584d55f2f368b52f2abe5f1d max Tue Apr 21 06:25:48 2026 -0700
kwanhoSv: flag as preliminary, ask users to contact authors

Mark the Kim et al. 2026 PD brain long-read SV subtrack as preliminary in its
shortLabel and longLabel, and add a prominent warning banner at the top of the
description page telling users to contact the authors (ASAP / Kim lab) before
using the data, since the callset will be updated before publication. refs #36258

diff --git src/hg/makeDb/trackDb/human/kwanhoSv.html src/hg/makeDb/trackDb/human/kwanhoSv.html
index d888c3a6f52..4c641c20175 100644
--- src/hg/makeDb/trackDb/human/kwanhoSv.html
+++ src/hg/makeDb/trackDb/human/kwanhoSv.html
@@ -1,106 +1,115 @@
 <h2>Description</h2>
+
+<p style="background-color:#fff3cd; border-left:4px solid #856404; padding:10px 14px; margin-bottom:18px;">
+<b>Preliminary data.</b> This callset is a pre-publication release that will be
+updated before the final publication. Before using these data for analysis or
+in a paper, please contact the authors at the Aligning Science Across
+Parkinson's (ASAP) consortium / the Kim lab to check for the latest version
+and for guidance on appropriate use.
+</p>
+
 <p>
 This track shows structural variants (SVs) identified by PacBio HiFi long-read
 whole-genome sequencing of 100 post-mortem human brain samples, split across
 three diagnostic groups: Parkinson's disease (PD), incidental Lewy body disease
 (ILBD) and healthy controls (HC). The high-confidence catalog contains 74,552
 SVs: 34,056 insertions, 29,545 deletions, 9,707 duplications and 1,244
 inversions.
 </p>
 <p>
 The dataset accompanies Kim et al. (2026), which combines the long-read SV
 catalog with single-nucleus RNA-seq from the same donors to identify SVs
 associated with cell-type-specific gene expression, including variants near
 genes nominated as causal targets of PD GWAS loci.
 </p>
 <h2>Display Conventions and Configuration</h2>
 <p>
 Items are colored by SV type:
 <ul>
 <li><span style="color: rgb(200,0,0);">Deletions (DEL)</span> - red</li>
 <li><span style="color: rgb(0,0,200);">Insertions (INS)</span> - blue</li>
 <li><span style="color: rgb(0,160,0);">Duplications (DUP)</span> - green</li>
 <li><span style="color: rgb(230,140,0);">Inversions (INV)</span> - orange</li>
 </ul>
 </p>
 <p>
 Insertions are placed at the insertion site with a width of 1 bp; deletions,
 duplications and inversions span the affected reference interval. Filters are
 available for SV type, SV length, variant quality and allele frequencies in
 each of the three cohorts (PD, HC, ILBD), as well as the case-minus-control
 carrier-rate differential.
 </p>
 <p>
 The detail page shows, for each variant:
 <ul>
 <li><b>Cohort support vector</b>: three-bit flag (PD/ILBD/HC) indicating which cohorts include at least one carrier.</li>
 <li><b>Carrier rates</b>: fraction of cases (PD+ILBD) and controls (HC) carrying the variant, and the case-minus-control differential.</li>
 <li><b>Per-cohort AF / AC / AN</b>: alternate allele frequency, alternate allele count, and total called alleles in PD, HC and ILBD samples.</li>
 <li><b>Carrier lists</b>: sample IDs carrying the variant in each cohort.</li>
 <li><b>Nearby SNP context</b>: number of SNPs nearby and the number in linkage disequilibrium with the SV (from the paper's LD analyses).</li>
 <li><b>Read support</b>: average mapping quality and average supporting reads per sample at the variant site.</li>
 </ul>
 </p>
 <h2>Methods</h2>
 <p>
 Long-read whole-genome sequencing was performed on 100 post-mortem brain
 samples (35 PD, 31 ILBD, 34 HC) with PacBio HiFi chemistry. Per-sample SV calls
 from multiple callers were merged into a joint callset; the high-confidence
 filtered catalog released in Supplementary Table 13 (<tt>media-13.txt</tt>) of
 the Kim et al. 2026 preprint is used directly here.
 Per-cohort allele frequencies, Hardy-Weinberg statistics and case / control
 carrier rates are reported in the source table; the track exposes the allele
 counts and the case-control differential as filterable fields. The paper also
 integrates single-nucleus RNA-seq from two brain regions of the same donors to
 test SV-expression associations in specific cell types, but that layer is not
 shown in this track.
 </p>
 <h2>Data Access</h2>
 <p>
 The data can be explored interactively in table format with the
 <a href="../cgi-bin/hgTables">Table Browser</a> or the
 <a href="../cgi-bin/hgIntegrator">Data Integrator</a>, and accessed
 programmatically through our <a href="https://api.genome.ucsc.edu">API</a>,
 track=<i>kwanhoSv</i>.
 </p>
 <p>
 The bigBed is available from
 <a href="http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/" target="_blank">our download server</a>
 as <tt>kwanho.bb</tt>. Example:
 <tt>bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/lrSv/kwanho.bb -chrom=chr21 -start=0 -end=100000000 stdout</tt>.
 </p>
 <p>
 The full supplementary data for the paper (including <tt>media-13.txt</tt>) is
 available from the Kim et al. 2026 preprint.
 </p>
 <h2>Credits</h2>
 <p>
 Thanks to Kim, Levin and colleagues at the Aligning Science Across Parkinson's
 (ASAP) Collaborative Research Network, the Broad Institute, Yale School of
 Medicine, Banner Sun Health Research Institute and their collaborators for
 releasing this dataset.
 </p>
 <h2>References</h2>
 <p>
 Kim K, Lin Z, Simmons SK, Parker J, Kearney M, Liao Z, Haywood N, Zhang J, Cline MP, Tuncali I <em>et al</em>.
 <a href="https://doi.org/10.64898/2026.03.20.713192" target="_blank">
 Integrating Long-Read Structural Variant Analysis with single-nucleus RNA-seq to Elucidate Gene Expression Effects in Disease</a>.
 <em>bioRxiv</em>. 2026 Mar 23;.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/41929179" target="_blank">41929179</a>;
 PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13041997/" target="_blank">PMC13041997</a>
 </p>
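
The programmatic access mentioned under Data Access might look roughly like the following. This is a minimal Python sketch, not part of the commit. It assumes the track is on the hg38 assembly (inferred from the gbdb/hg38 download path) and that it is served through the UCSC REST API's /getData/track endpoint under the name kwanhoSv given in the page. The helper names build_track_url and fetch_region are hypothetical, and nothing is assumed about the field names in the returned JSON.

```python
import json
import urllib.request
from urllib.parse import urlencode

# API host linked from the Data Access section of the description page.
API_BASE = "https://api.genome.ucsc.edu"


def build_track_url(chrom, start, end, genome="hg38", track="kwanhoSv"):
    """Return a /getData/track URL for one region of the track.

    genome="hg38" is an assumption inferred from the download path;
    track="kwanhoSv" is the track name given in the description page.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}/getData/track?{urlencode(params)}"


def fetch_region(chrom, start, end):
    """Fetch the region as parsed JSON (hypothetical helper, needs network access)."""
    with urllib.request.urlopen(build_track_url(chrom, start, end), timeout=30) as resp:
        return json.load(resp)
```

Calling `fetch_region("chr21", 0, 100000)` would request the start of chr21. Because the callset is flagged as preliminary, results fetched this way may change when the authors update the data.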