06a482a2120d4d85c7c34fb5038213e07f595554
max
Tue Apr 21 15:00:21 2026 -0700
lrSv: add tommoJpCnv short-read CNV comparator (multiWig)
ToMMo 48KJPN-CNV Frequency Panel: copy-number variation frequencies
from short-read whole-genome sequencing of 48,874 Japanese individuals
(jMorp 20230828 release, GATK CNV germline workflow at 1 kb
resolution). Published as a companion short-read comparator to the
long-read tommoJpSv track.
Rendered as a multiWig container with two bigWig subtracks (transparent
overlay): tommoJpCnvLoss.bw counts samples at CN<2 per bin (red) and
tommoJpCnvGain.bw counts samples at CN>2 per bin (green). Values are
absolute carrier counts out of 48,874. The track covers 2,006,905 bins
with at least one CNV carrier; bins that are wholly CN=2 are omitted.
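The per-bin counting described above can be sketched as follows. This is a
minimal illustration, not the code in lrSvTommoJpCnvVcfToBedGraph.py: it
assumes each bin is already available as a list of per-sample integer copy
numbers, whereas the real converter reads the jMorp VCF, whose fields are not
shown here. The function name bins_to_bedgraphs and the Bin tuple layout are
hypothetical. Writing a 0 in one subtrack for a bin that has carriers only in
the other is also an assumption.

```python
from typing import Iterable, List, Tuple

# Hypothetical input layout: chrom, 0-based start, end, per-sample copy numbers.
Bin = Tuple[str, int, int, List[int]]


def bins_to_bedgraphs(bins: Iterable[Bin]) -> Tuple[List[str], List[str]]:
    """Return (loss_lines, gain_lines) as bedGraph text lines."""
    loss_lines: List[str] = []
    gain_lines: List[str] = []
    for chrom, start, end, copy_numbers in bins:
        n_loss = sum(1 for cn in copy_numbers if cn < 2)  # carriers at CN<2
        n_gain = sum(1 for cn in copy_numbers if cn > 2)  # carriers at CN>2
        if n_loss == 0 and n_gain == 0:
            continue  # bin is wholly CN=2, so it is omitted from both files
        loss_lines.append(f"{chrom}\t{start}\t{end}\t{n_loss}")
        gain_lines.append(f"{chrom}\t{start}\t{end}\t{n_gain}")
    return loss_lines, gain_lines


# Toy data: three 1 kb bins, four samples each.
example_bins = [
    ("chr1", 0, 1000, [2, 2, 2, 2]),
    ("chr1", 1000, 2000, [1, 2, 3, 0]),
    ("chr1", 2000, 3000, [2, 4, 2, 2]),
]
loss, gain = bins_to_bedgraphs(example_bins)
```

The values are raw carrier counts rather than frequencies, matching the
"absolute carrier counts" convention stated above; dividing by the cohort
size would give a frequency instead.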
Files:
- trackDb/human/lrSv.ra: new tommoJpCnv multiWig container
- trackDb/human/tommoJpCnv.html: new doc page
- trackDb/human/lrSv.html: summary-table row + per-track blurb
- scripts/lrSv/lrSvTommoJpCnvVcfToBedGraph.py: VCF -> two bedGraphs
- doc/hg38/lrSv.txt: wget, converter invocation, bigWig build steps
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context)
-Oxford Nanopore sequencing was performed on genomic DNA extracted from activated
-T lymphocytes of 333 individuals (111 trios) from the Tohoku Medical Megabank
-(ToMMo) cohort. SV calling was performed with Sniffles on each sample, and
-calls were merged across individuals with SURVIVOR v1.0.6 using a maximum
-distance of 1 kbp. Allele frequencies were computed from 222 unrelated parents
-(excluding offspring to avoid double-counting). Mendelian error rates were
-calculated by checking transmission consistency within each trio family.
+Otsuki et al. 2022 extracted high-molecular-weight genomic DNA from activated
+T lymphocytes of 333 individuals (111 parent-offspring trios) from the Tohoku
+Medical Megabank (ToMMo) BirThree cohort and performed Oxford Nanopore
+whole-genome sequencing on PromethION instruments with R9.4.1 flow cells
+(SQK-LSK109 libraries, Guppy v4.2.2 high-accuracy base-calling). After QC,
+median per-sample sequencing coverage was 22.2x with a read N50 of 25.8 kb.
+Reads were aligned to GRCh38 with LRA, SVs were called per sample with
+CuteSV
+v1.0.9 (-min_sv_length 50), and per-sample calls were merged with
+SURVIVOR
+v1.0.6 (1000 bp distance, type-match, no length-match) into a nonredundant
+panel of 74,201 autosomal SVs (37,981 deletions and 36,220 insertions).
+Over 95% of the SVs were concordant with Mendelian inheritance in the 111
+trio families; allele frequencies in this track are computed from the 222
+unrelated parents to avoid double-counting.
+
+
+The site-only VCF tommo-JSV1-20211208-GRCh38-without-genotype-count.vcf.gz
+was downloaded from the jMorp JSV1 dataset page,
+
+tommo-jsv1-20211208-af.
+
+
+The step-by-step build commands (download, format conversion, bigBed build)
+are recorded in the UCSC makeDoc for this track container:
+
+doc/hg38/lrSv.txt. The conversion scripts and autoSql schemas live in
+
+makeDb/scripts/lrSv.
Source data is available from the jMorp downloads page (ToMMo Japanese Multi Omics Reference Panel).
Thanks to the Tohoku Medical Megabank Organization for making their structural variant calls publicly available through the jMorp data portal.