bac95a147f49cd331052e597006e04b3deee40fc
max
  Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups

Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.

Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up unstaged in the
working tree.

refs #36258

diff --git src/hg/makeDb/trackDb/human/tommoJpSv.html src/hg/makeDb/trackDb/human/tommoJpSv.html
index 10c3117337e..91d181cf572 100644
--- src/hg/makeDb/trackDb/human/tommoJpSv.html
+++ src/hg/makeDb/trackDb/human/tommoJpSv.html
@@ -1,103 +1,132 @@
 <h2>Description</h2>
 <p>
 This track shows structural variants (SVs) identified by Oxford Nanopore long-read
 sequencing of 333 Japanese individuals from the Tohoku Medical Megabank (ToMMo)
 project. The 333 individuals form 111 parent-offspring trios, enabling
 Mendelian consistency checks on the SV calls. Activated T lymphocytes were used
 as a source of high-molecular-weight DNA for nanopore sequencing at a median
 coverage of 22.2x with an N50 read length of 25.8 kb.
 </p>
 <p>
 The dataset contains 74,201 SVs (37,981 deletions and 36,220 insertions),
 merged across individuals using SURVIVOR v1.0.6. Over 95% of the SVs are
 concordant with Mendelian inheritance in the trio families.
 </p>
 
 <h2>Display Conventions and Configuration</h2>
 <p>
 Items are colored by SV type:
 <ul>
 <li><span style="color: rgb(200,0,0);">Deletions (DEL)</span> - red</li>
 <li><span style="color: rgb(0,0,200);">Insertions (INS)</span> - blue</li>
 </ul>
 </p>
 <p>
 Filters are available for SV type, SV length, and allele frequency.
 For insertions, the item is placed at the insertion site with a width of 1 bp;
 for deletions, the item spans the deleted region.
 </p>
 <p>
 The detail page for each item shows:
 <ul>
 <li><b>Allele Frequency</b>: fraction of alleles carrying this variant
 (based on 444 alleles from 222 unrelated parents)</li>
 <li><b>Allele Count / Allele Number</b>: number of variant alleles and
 total alleles genotyped</li>
 <li><b>Mendelian Error Rate</b>: fraction of trio families showing
 inheritance errors for this variant</li>
 <li><b>Families with Errors / Families Genotyped</b>: number of families
 with Mendelian errors and total families with complete genotype calls</li>
 </ul>
 </p>
 
 <h2>Methods</h2>
 <p>
 Otsuki et al. 2022 extracted high-molecular-weight genomic DNA from activated
 T lymphocytes of 333 individuals (111 parent-offspring trios) from the Tohoku
 Medical Megabank (ToMMo) BirThree cohort and performed Oxford Nanopore
 whole-genome sequencing on PromethION instruments with R9.4.1 flow cells
 (SQK-LSK109 libraries, Guppy v4.2.2 high-accuracy base-calling). After QC,
 median per-sample sequencing coverage was 22.2x with a read N50 of 25.8 kb.
 Reads were aligned to GRCh38 with LRA, SVs were called per sample with
 <a href="https://github.com/tjiangHIT/cuteSV" target="_blank">CuteSV</a>
 v1.0.9 (<tt>-min_sv_length 50</tt>), and per-sample calls were merged with
 <a href="https://github.com/fritzsedlazeck/SURVIVOR" target="_blank">SURVIVOR</a>
 v1.0.6 (1000 bp distance, type-match, no length-match) into a nonredundant
 panel of 74,201 autosomal SVs (37,981 deletions and 36,220 insertions).
 Over 95% of the SVs were concordant with Mendelian inheritance in the 111
 trio families; allele frequencies in this track are computed from the 222
 unrelated parents to avoid double-counting.
 </p>
 <p>
 The site-only VCF <tt>tommo-JSV1-20211208-GRCh38-without-genotype-count.vcf.gz</tt>
-was downloaded from the jMorp JSV1 dataset page,
-<a href="https://jmorp.megabank.tohoku.ac.jp/datasets/tommo-jsv1-20211208-af" target="_blank">
+was downloaded from the jMorp JSV1 download page,
+<a href="https://jmorp.megabank.tohoku.ac.jp/downloads/tommo-jsv1-20211208-af" target="_blank">
 tommo-jsv1-20211208-af</a>.
 </p>
 <p>
 The step-by-step build commands (download, format conversion, bigBed build)
 are recorded in the UCSC makeDoc for this track container:
 <a href="https://github.com/ucscGenomeBrowser/kent/blob/master/src/hg/makeDb/doc/hg38/lrSv.txt" target="_blank">
 doc/hg38/lrSv.txt</a>. The conversion scripts and autoSql schemas live in
 <a href="https://github.com/ucscGenomeBrowser/kent/tree/master/src/hg/makeDb/scripts/lrSv" target="_blank">
 makeDb/scripts/lrSv</a>.
 </p>
 
 <h2>Data Access</h2>
 <p>
 Source data is available from the
-<a href="https://jmorp.megabank.tohoku.ac.jp/downloads"
-   target="_blank">jMorp downloads page</a> (ToMMo Japanese Multi Omics Reference Panel).
+<a href="https://jmorp.megabank.tohoku.ac.jp/downloads/tommo-jsv1-20211208-af"
+   target="_blank">tommo-jsv1-20211208-af download page</a> on the jMorp
+portal (ToMMo Japanese Multi Omics Reference Panel).
+</p>
+
+<h2>Conditions of Use</h2>
+<p>
+The information in the ToMMo jMorp database is provided only to persons
+who agree to jMorp's
+<a href="https://jmorp.megabank.tohoku.ac.jp/help/conditions-of-use" target="_blank">
+Conditions of Use</a>. By using these data, you are deemed to have read
+and understood those conditions and to agree to the following obligations:
+<ul>
+<li>Do not attempt to identify or contact any person who provided specimens
+used to construct the information.</li>
+<li>Request permission from dist [AT] megabank [DOT] tohoku [DOT] ac [DOT] jp
+prior to using the data for commercial purposes.</li>
+<li>Notify dist [AT] megabank [DOT] tohoku [DOT] ac [DOT] jp when providing
+re-edited data to any third party.</li>
+<li>Cite the jMorp paper in publications that report analyses based on
+these data: Tadaka S, Kawashima J, Hishinuma E, <i>et al.</i>,
+&ldquo;jMorp: Japanese Multi-Omics Reference Panel update report 2023&rdquo;,
+<i>Nucleic Acids Research</i>, 2023 Nov 1,
+<a href="https://doi.org/10.1093/nar/gkad978" target="_blank">doi:10.1093/nar/gkad978</a>;
+please also refer to the per-dataset citation notes linked from that page.</li>
+</ul>
+The copyright in the information and the database is owned by ToMMo. If a
+dataset-specific contact or Data Transfer Agreement is attached to a given
+dataset, follow it in preference to these generic conditions.
+Questions should be directed to
+tommo-jmorp [AT] grp [DOT] tohoku [DOT] ac [DOT] jp.
 </p>
 
 <h2>Credits</h2>
 <p>
 Thanks to the Tohoku Medical Megabank Organization for making their structural
 variant calls publicly available through the jMorp data portal.
 </p>
 
 <h2>References</h2>
 
 
 
 <p>
 Otsuki A, Okamura Y, Ishida N, Tadaka S, Takayama J, Kumada K, Kawashima J, Taguchi K, Minegishi N,
 Kuriyama S <em>et al</em>.
 <a href="https://doi.org/10.1038/s42003-022-03953-1" target="_blank">
 Construction of a trio-based structural variation panel utilizing activated T lymphocytes and long-
 read sequencing technology</a>.
 <em>Commun Biol</em>. 2022 Sep 20;5(1):991.
 PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/36127505" target="_blank">36127505</a>; PMC: <a
 href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9489684/" target="_blank">PMC9489684</a>
 </p>