All File Changes
v497_base to v498_preview (2026-04-20 to 2026-04-27) v498
- confs/hgwbeta.hg.conf
- lines changed 6, context: html, text, full: html, text
27d5efcd5e56f195907b370dbd8b2e151c700a64 Sun Apr 26 01:11:17 2026 -0700
Installing updated hg.conf files from UCSC servers
- confs/hgwdev.hg.conf
- lines changed 9, context: html, text, full: html, text
27d5efcd5e56f195907b370dbd8b2e151c700a64 Sun Apr 26 01:11:17 2026 -0700
Installing updated hg.conf files from UCSC servers
- src/hg/hgGene/hgGene.c
- lines changed 9, context: html, text, full: html, text
3670b9f2a58342dcc0629bad9d221a9d5e4ae5ff Thu Apr 23 14:39:37 2026 -0700
hgGene: render all sections and the Methods page for QuickLift knownGene clicks. refs #36370
- synonym.c: wrap the coord-qualified kgProteinID lookup in emptyForNull.
After quickLiftGenePred() rewrites curGeneChrom/Start/End to destination
coords, the same-transaction lookup against the source assembly's
knownGene table misses and sqlGetField returns NULL; strstr(NULL, "-")
on the next line segfaulted, truncating the page at <B>Protein: </B>
so GeneReviews, Methods, and any later sections never rendered.
- hgGene.c doKgMethod: when the track is quickLifted and the hub trackDb
has no html, refetch the trackDb from quickLiftDb using
trackHubSkipHubName(tableName). The Methods "Click here" link was
landing on a page whose entire body was printf("%s", NULL) => "(null)".
Verified on hgwdev-braney with Gerardo's repro (hg38 -> hs1 QuickLift,
chr7:156982676-156996015 window, MNX1 / ENST00000469500.5): all 13
sections now render and Methods serves the full GENCODE description.
No change on the native hg38 path.
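The synonym.c crash above comes down to passing a NULL lookup result to strstr, which is undefined behavior. A minimal sketch of the guard, assuming a hypothetical emptyIfNull helper standing in for the kent emptyForNull, and assuming the "-" test is meant to detect an isoform suffix (the commit message does not say what the dash means):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kent emptyForNull helper:
 * map a NULL string to "" so string functions can take it safely. */
static const char *emptyIfNull(const char *s)
{
    return (s != NULL) ? s : "";
}

/* Return 1 if the looked-up protein ID contains '-', 0 otherwise.
 * A NULL lookup result (no matching row) is treated as "no dash"
 * instead of being handed to strstr directly. */
int proteinIdHasDash(const char *lookupResult)
{
    const char *id = emptyIfNull(lookupResult);
    return strstr(id, "-") != NULL;
}
```

With the guard in place a missed lookup degrades to an empty string, so the rest of the page can keep rendering.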
- src/hg/hgGene/synonym.c
- lines changed 1, context: html, text, full: html, text
3670b9f2a58342dcc0629bad9d221a9d5e4ae5ff Thu Apr 23 14:39:37 2026 -0700
hgGene: render all sections and the Methods page for QuickLift knownGene clicks. refs #36370
- src/hg/hgHubConnect/hooks/pre-finish.c
- lines changed 50, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
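The per-hub serialization described above can be illustrated with an advisory lock taken around each hub.txt mutation. This is only a sketch: appendLineLocked is a hypothetical name, and whether the real hook locks hub.txt itself or a separate lock file is not stated in the commit message.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Append one line to path while holding an exclusive advisory lock.
 * Concurrent callers block in flock() and run one at a time.
 * Returns 0 on success, -1 on failure. */
int appendLineLocked(const char *path, const char *line)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) != 0)
        {
        close(fd);
        return -1;
        }
    int ok = 0;
    size_t len = strlen(line);
    if (write(fd, line, len) != (ssize_t)len || write(fd, "\n", 1) != 1)
        ok = -1;
    flock(fd, LOCK_UN);   /* released explicitly; close() would also drop it */
    close(fd);
    return ok;
}
```

Because each writer holds the lock for its whole mutation, parallel uploads can finish in any order without interleaving partial stanzas.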
- src/hg/hgHubConnect/trackHubWizard.c
- lines changed 1, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
- src/hg/hgTrackUi/hgTrackUi.c
- lines changed 17, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- lines changed 21, context: html, text, full: html, text
66ea6cb4eaf2036e464be55b295765ed5105a0fb Wed Apr 22 09:42:25 2026 -0700
hgTrackUi/hui: render filter UI on supertrack configuration pages
Supertracks (group of tracks with superTrack on, no data of their own)
previously had no way to expose a shared filter: their trackDb stanza
can declare filter.* / filterByRange.* / filterValues.*, but those
settings were never drawn on the supertrack's hgTrackUi page. So users
had to open each subtrack's own configuration page and set the same
filter there, and the "lrSv.filter.svLen" cart namespace went unused.
This change wires that up:
- hgTrackUi.c (superTrackUi): after listing the subtracks, if the
supertrack tdb declares any filter.* settings call scoreCfgUi() to
render the standard filter UI. Cart variables land under the
supertrack's own name (e.g. "lrSv.filter.svLen.min"), and subtracks
already inherit them through cartOptionalStringClosestToHome()
walking tdb->parent. Subtrack-level values continue to override.
- hui.c:
- buildFilterBy() / filterByValues(): tolerate a NULL autoSql object,
so supertracks (which have no data table) don't errAbort when they
declare filterValues.* of virtual aggregated fields. Missing-field
errAbort still fires in the normal subtrack case.
- scoreCfgUi() / cfgByCfgType(): when called with title == NULL (the
supertrack filter path), suppress the default "<p>" separator and
the "<BR>" between the title bar and the filter block; the caller
renders its own section heading.
- asForTdb(): handle conn == NULL by returning NULL rather than
crashing, since supertrack filter rendering has no associated
sqlConnection.
refs #37426
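The inheritance described above (supertrack-level cart variables picked up by subtracks, with subtrack-level values overriding) can be sketched as a walk up the parent chain. The structs and the closestToHome function below are simplified, hypothetical stand-ins, not the kent cart or trackDb types:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for trackDb and the cart. */
struct miniTdb
    {
    const char *track;          /* track name, e.g. "lrSv" */
    struct miniTdb *parent;     /* enclosing supertrack, or NULL */
    };

struct cartVar
    {
    const char *name;           /* e.g. "lrSv.filter.svLen.min" */
    const char *value;
    };

/* Look up "<track>.<suffix>" starting at tdb and walking up the parent
 * chain; the first hit (closest to the subtrack) wins. */
const char *closestToHome(const struct cartVar *cart, int cartSize,
                          const struct miniTdb *tdb, const char *suffix)
{
    char name[256];
    for (; tdb != NULL; tdb = tdb->parent)
        {
        snprintf(name, sizeof(name), "%s.%s", tdb->track, suffix);
        for (int i = 0; i < cartSize; i++)
            if (strcmp(cart[i].name, name) == 0)
                return cart[i].value;
        }
    return NULL;
}
```

A subtrack with no setting of its own falls through to the supertrack's value, while a value stored under the subtrack's name is found first and overrides it.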
- src/hg/hgTracks/simpleTracks.c
- lines changed 1, context: html, text, full: html, text
8c8823d9a87b6046094577a808dda4ed4172b416 Thu Apr 23 11:09:48 2026 -0700
Make quickLifted GENCODE tracks usable: click details, position links, coloring. refs #36059
- Re-vet wgEncodeGencode* in trackHub.c:isVetted() so the quickLift hub
no longer emits `avoidHandler on` for GENCODE children. Without this,
hgc bypasses findNameBasedHandler and doGencodeGene is never reached,
so the details page falls through to the generic showGenePos output.
- In gencodeClick.c doGencodeGene/helpers, use quickLiftDb for the
MySQL connection (the attrs/tag/pubMed/refSeq/uniProt tables live in
the source assembly), use trackHubSkipHubName(tdb->track) for the
genePred table name and the Basic/Comp/PseudoGene/PolyA prefix
dispatch checks, and teach isGrcHuman/isGrcH37Native to consult the
source db — otherwise the page errAborted with
"BUG: gencodeClick on wrong database: hub_NNNNN_hs1".
- Lift the Position values (transcript + gene bounds, 2-way pseudo,
PolyA) through the quickLift chain before printing, so the details
page shows destination-assembly coords and the Position link lands
at a window that actually contains the gene.
- htcDnaNearGene (hgc.c): when quickLifted, lift seqName/winStart/winEnd
back to source coords before hgSeqItemsInRange. The prior code
pointed the query at the source-assembly table but still passed the
destination window, so the "Genomic Sequence from assembly" flow
returned "No results returned from query." No longer reachable from
the GENCODE details page (doGencodeGene writes its own Sequences
table) but still hit by other quickLifted genePred tracks.
- simpleTracks.c genePredItemClassColor: the hTableExists() guard was
checking `database` (the destination hub db like hub_NNNNN_hs1) even
though the MySQL conn was already routed to liftDb; the lookup was
skipped and every item fell back to tg->ixColor, rendering all
black. Check `db` instead so the gClass_* palette (coding / nonCoding
/ pseudo / problem) is applied.
Verified end-to-end on hgwdev-braney: Gerardo's SHH test case
(ENST00000297261.7) and Basic/Comp/PseudoGene/PolyA subtracks; no
regression on the non-quickLifted hg38 click.
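Several of the fixes above depend on comparing the native table name rather than the hub-prefixed one. A rough sketch of that prefix stripping, assuming names of the form hub_<id>_<name>; skipHubPrefix is a hypothetical illustration, not the kent trackHubSkipHubName source:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical re-implementation: skip a leading "hub_<id>_" prefix.
 * Names without that prefix (and NULL) are returned unchanged. */
const char *skipHubPrefix(const char *name)
{
    if (name == NULL || strncmp(name, "hub_", 4) != 0)
        return name;
    const char *under = strchr(name + 4, '_');
    return (under != NULL) ? under + 1 : name;
}
```

Prefix-based dispatch checks (for example a startsWith test on "wgEncodeGencode") can then run on the returned pointer instead of the raw hub track name.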
- lines changed 4, context: html, text, full: html, text
5fa73cd9b411b3147227a4a31f6df7b24638dd00 Fri Apr 24 05:37:11 2026 -0700
adding exon length to mouseover, refs #37439
- src/hg/hgc/bigBedClick.c
- lines changed 1, context: html, text, full: html, text
c13d7b1af23d1e1a6f015953536e2d74a467fb16 Wed Apr 22 17:25:40 2026 -0700
hgc bigBed click: skip intervals that fail to quickLift remap instead of errAborting with an out-of-bounds read. The errAbort message printed fields[3] on a stack-allocated fields[bedSize+seq1Seq2Fields]; for a bigBed3 (e.g. GIAB problematicRegions) that's one past the end, producing garbled binary text in the warning dialog. More importantly, for bedSize==3 there is no name filter before the remap, so every interval in the window was remapped and any single failure aborted the whole page. Match the hgTracks behavior (bigBedTrack.c: continue on NULL) so unmappable items are silently dropped and the clicked item still renders. refs #36335
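The two ideas in this fix, dropping unmappable items rather than aborting and never reading a name field that a bed3 row does not have, can be sketched as follows. The interval type, the toy remapInterval function, and itemName are all hypothetical and exist only for illustration:

```c
#include <stddef.h>

/* Hypothetical interval type for illustration only. */
struct interval
    {
    int start, end;
    };

/* Toy remap: fails (returns NULL) for intervals starting at a negative
 * coordinate, otherwise shifts the interval by offset into out. */
static struct interval *remapInterval(const struct interval *in, int offset,
                                      struct interval *out)
{
    if (in->start < 0)
        return NULL;
    out->start = in->start + offset;
    out->end = in->end + offset;
    return out;
}

/* Remap every interval, dropping the ones that fail instead of aborting.
 * Returns the number of intervals written to out. */
int remapAll(const struct interval *in, int count, int offset,
             struct interval *out)
{
    int kept = 0;
    for (int i = 0; i < count; i++)
        {
        if (remapInterval(&in[i], offset, &out[kept]) == NULL)
            continue;   /* unmappable: skip silently */
        kept++;
        }
    return kept;
}

/* Return the name field only when the row actually has one (bed4+);
 * a bed3 row yields "" rather than a read past the end of fields. */
const char *itemName(char **fields, int fieldCount)
{
    return (fieldCount > 3) ? fields[3] : "";
}
```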
- src/hg/hgc/gencodeClick.c
- lines changed 64, context: html, text, full: html, text
8c8823d9a87b6046094577a808dda4ed4172b416 Thu Apr 23 11:09:48 2026 -0700
Make quickLifted GENCODE tracks usable: click details, position links, coloring. refs #36059
- src/hg/hgc/geneReviewsClick.c
- lines changed 6, context: html, text, full: html, text
7120a354ec4870b3bc2112c384bdb0af837ed44e Thu Apr 23 14:12:36 2026 -0700
hgc: make quickLifted NCBI RefSeq / UCSC RefSeq click details work. refs #36125
- trackHub.c isVetted(): accept refGene, ncbiRefSeq* subtracks, and ncbiOrtho
so the generated quickLift hub.txt doesn't emit 'avoidHandler on' for them.
Without this, hgc skips findNameBasedHandler and the detail click falls
through to genericClickHandler, showing none of the rich RefSeq fields.
- hgc.c findNameBasedHandler: when the tdb has 'quickLifted on', strip the
hub_NNN_ prefix from the dispatch table before comparisons so refGene /
ncbiRefSeq* match their native handlers.
- doNcbiRefSeq / doRefGene: route hAllocConn to quickLiftDb (source), use
trackHubSkipHubName for track/table name comparisons, and pass srcDb into
replaceInUrl, hTableExists, AceView / MGIid / jaxOrtholog / hg-prefix
checks -- otherwise these paths errAborted with "Unknown database
hub_NNN_<db>" on a quickLifted click.
- printRefSeqInfo / prRefGeneInfo / gbCdnaGetVersion: use sqlGetDatabase(conn)
rather than the global `database` for host-db checks.
- Predicted Protein / Predicted mRNA links for quickLifted tracks now point
at htcTranslatedPredMRna / htcGeneMrna so the sequences come from the
destination genome at the lifted exon coordinates, rather than the
NCBI-authored refPep / seqNcbiRefSeq extFiles on the source. Native
behavior is unchanged.
- showGenePos: fetch via quickLiftSql + calcLiftOverGenePreds when the track
is quickLifted so the Position line is in destination coordinates. The
swapped chainHash comes from the track's quickLiftUrl bigChain, matching
the pattern getGenePredForPositionSql already uses.
- Suppress the mRNA/Genomic Alignments block and its trailing <hr> for
quickLifted refGene / ncbiRefSeq* clicks. The PSLs come back in source
coords and the htcCdnaAli links don't line up with the destination window.
- getAlignmentsTName: use sqlGetDatabase(conn) rather than global `database`
so split-table resolution looks up the right db.
- printGeneCards: take a db argument; callers pass srcDb so the GeneCards
link renders on quickLifted pages (the "startsWith hg" check was matching
hub_NNN_hs1 as false).
- geneReviewsClick.c prGRShortRefGene: take a struct sqlConnection * rather
than opening its own on `database` (hub_NNN_hs1 -> Unknown database abort).
Callers in prRefGeneInfo and doOmimGene2 pass their existing source-db
conn, so the Related GeneReviews line now shows on both native and
quickLifted refGene pages.
Verified end-to-end on hgwdev-braney with hg38 -> hs1 quickLift of the
refSeqComposite: ncbiRefSeqCurated and refGene clicks on SHH now render
the full doNcbiRefSeq / doRefGene details (RefSeq / Status / Description
/ Synonyms / OMIM / Protein / HGNC / Entrez Gene / GeneCards / AceView /
Summary / Position / Gene Symbol); Predicted mRNA and Predicted Protein
links return destination-derived sequences; Genomic Sequence reaches the
Get-DNA-in-window page; Ctrl/Cmd+drag zoom no longer ends up with an
HTML error page inside the zoom dialog. Non-quickLifted hg38 refSeq
details are unchanged (alignments still shown, GeneReviews still present).
The duplicate-subtracks symptom from the ticket's step (i) is not
addressed here -- it's an architectural consequence of the shared
`refSeqComposite` cart key applying to both the destination's native
composite and the quickLift hub's composite, and needs cart-scoping
work in quickLift v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
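The printGeneCards change above is an instance of passing the database to check as an argument instead of reading a global. A small sketch of why that matters for a hub-prefixed database name; the function names here are hypothetical and the real signatures may differ:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical check: does this database name start with "hg"? */
static int startsWithHg(const char *db)
{
    return db != NULL && strncmp(db, "hg", 2) == 0;
}

/* Pick the database to test: the source assembly when quickLifted,
 * otherwise the browser's current database. */
const char *dbForChecks(int quickLifted, const char *database,
                        const char *srcDb)
{
    return (quickLifted && srcDb != NULL) ? srcDb : database;
}

/* Return 1 if the GeneCards link should be shown. */
int showGeneCards(int quickLifted, const char *database, const char *srcDb)
{
    return startsWithHg(dbForChecks(quickLifted, database, srcDb));
}
```

Checked against the destination hub database alone, a name like hub_NNN_hs1 fails the prefix test; checked against the source database (hg38 in the verification above), it passes.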
- src/hg/hgc/hgc.c
- lines changed 11, context: html, text, full: html, text
8c8823d9a87b6046094577a808dda4ed4172b416 Thu Apr 23 11:09:48 2026 -0700
Make quickLifted GENCODE tracks usable: click details, position links, coloring. refs #36059
- lines changed 126, context: html, text, full: html, text
7120a354ec4870b3bc2112c384bdb0af837ed44e Thu Apr 23 14:12:36 2026 -0700
hgc: make quickLifted NCBI RefSeq / UCSC RefSeq click details work. refs #36125
- src/hg/hgc/hgc.h
- lines changed 1, context: html, text, full: html, text
7120a354ec4870b3bc2112c384bdb0af837ed44e Thu Apr 23 14:12:36 2026 -0700
hgc: make quickLifted NCBI RefSeq / UCSC RefSeq click details work. refs #36125
- src/hg/htdocs/goldenPath/help/bigBed.html
- lines changed 2, context: html, text, full: html, text
65091fe6f6487c23d650a144e947fc1c582d3f40 Tue Apr 21 02:16:16 2026 -0700
abelSv: move under lrSv supertrack as short-read comparison subtrack
Move the Abel et al. 2020 CCDG 17,795-genome SV callset from a
top-level hg38 track to a subtrack of the lrSv supertrack (parallel
to onekg3202Sr) and relabel shortLabel/longLabel to flag Illumina
short-read provenance. The same bigBed is now visible on hg38 in
the long-read SV browsing context. Also:
- Clarify abelSv.html variant counts: 738,624 upstream unique SVs
across both callsets, 737,998 after B37->hg38 liftOver (626
unmapped). B38=458,106, B37lift=279,892.
- lrSv.html: fix triple-slash https:/// in the Ebert et al. Science
reference URL.
- bigBed.html: add closing </li> on the extra-fields pipe-separator
bullet and tighten a comma in the same sentence.
refs #36258, refs #37376
- src/hg/htdocs/goldenPath/help/bigMaf.html
- lines changed 4, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
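For the 'hg38.chrY' style master src mentioned above, the chromosome is whatever follows the "<db>." prefix. The sketch below shows one straightforward way to do that split with an explicit prefix comparison; srcChromForDb is a hypothetical name and this is not the utility's actual code:

```c
#include <stddef.h>
#include <string.h>

/* Return the chromosome part of a MAF src such as "hg38.chrY",
 * or NULL if src does not begin with "<db>.". */
const char *srcChromForDb(const char *src, const char *db)
{
    size_t dbLen = strlen(db);
    if (strncmp(src, db, dbLen) != 0 || src[dbLen] != '.')
        return NULL;
    return src + dbLen + 1;
}
```

Writing the mismatch test directly (compare, then require the dot) keeps the sense of the check visible, which is where an inverted comparison is easiest to spot.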
- src/hg/htdocs/goldenPath/help/facetedComposite.html
- lines changed 250, context: html, text, full: html, text
4a6170904fe3901af94b7cf0494e9e991e40115e Wed Apr 22 03:39:05 2026 -0700
First pass at a faceted composite help doc, refs #36320
- src/hg/htdocs/goldenPath/help/trackDb/changes.html
- lines changed 7, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- lines changed 1, context: html, text, full: html, text
a470786ea3c54bfcfac6192a0aaacdb23c02a819 Thu Apr 23 08:20:45 2026 -0700
Fixing small typo in the trackDb changes doc, refs #36320
- src/hg/htdocs/goldenPath/help/trackDb/trackDbDoc.css
- lines changed 12, context: html, text, full: html, text
75a1bac8b8addd47d90c5061d9167c581677be5b Thu Apr 23 08:41:25 2026 -0700
Fix for trackDb docs anchors landing under the javascript search box.
Now they're pushed down a bit below it (more on narrow displays). No ticket
- src/hg/htdocs/goldenPath/help/trackDb/trackDbDoc.html
- lines changed 3, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- src/hg/htdocs/goldenPath/help/trackDb/trackDbHub.v3.html
- lines changed 3, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- src/hg/htdocs/goldenPath/help/trackDb/trackDbLibrary.shtml
- lines changed 2, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
- lines changed 7, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- lines changed 1, context: html, text, full: html, text
ac8d49f37257e983488864f0bb6712a9260e10c1 Fri Apr 24 14:05:10 2026 -0700
Fix incorrect skipFields syntax in trackDbLibrary.shtml to match the comma-separated form used in the example and parsed by hgc.c. refs #36203
- src/hg/htdocs/goldenPath/newsarch.html
- lines changed 35, context: html, text, full: html, text
3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737
Script: added a fourth rule to genePredNmdEsc. Coding exons longer than
400 bp (excluding the last coding exon, which is already covered by the
50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and
NCBI RefSeq bigBed files.
trackDb:
- nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode
default visibility to dense so the track is visible in cart-reset
views, changed all four NMDetective subtracks from "visibility full"
to "visibility hide", updated pennantIcon to the Apr. 22, 2026
release date and anchor.
- nmd.html: mention long internal exons in the overview description,
update the rule count from three to four.
- nmdEscTranscripts.html: add the long-exon rule to the rule list and
color legend (gold, #FFD700), expand the Background section with
mechanisms for the intronless, start-proximal, and long-exon rules,
correct the 50 bp rule description to include the entire last coding
exon, fix Lindeboom 2016 author initials (RG -> RGH).
News:
- newsarch.html: add the 2026-04-22 NMD Escape news entry covering all
four rules, with acknowledgements to Guido Neidhardt and Andreas
Lahner for suggesting the track and the Decipher Genome Browser team
for inspiring the visualization.
- indexNews.html: add the front-page news link.
makedoc:
- nmd.txt: dated note for the Rule 4 rebuild.
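The long-exon rule described in this commit can be illustrated with a minimal sketch. It is not the genePredNmdEsc implementation: the function name, the tuple representation of exons, and the choice to flag the whole exon are assumptions made for illustration. Only the 400 bp threshold and the exclusion of the last coding exon come from the commit message.

```python
from typing import List, Tuple

# (start, end) of the coding portion of an exon, 0-based half-open.
# This representation is an assumption, not the script's data model.
Exon = Tuple[int, int]

LONG_EXON_MIN = 400  # "Coding exons longer than 400 bp" per the commit message


def long_exon_escape_regions(coding_exons: List[Exon]) -> List[Exon]:
    """Return the coding exons flagged by the long-exon rule (Rule 4).

    coding_exons must be listed in transcript order (5' to 3').
    The last coding exon is skipped because the 50 bp rule already covers it.
    """
    return [
        (start, end)
        for start, end in coding_exons[:-1]  # drop the last coding exon
        if end - start > LONG_EXON_MIN       # strictly longer than 400 bp
    ]
```

In this sketch an exon of exactly 400 bp is not flagged, because the commit says "longer than 400 bp".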
- lines changed 2, context: html, text, full: html, text
d23d0116ff17b126a498c8d02bdef578d0ab1b53 Wed Apr 22 12:51:20 2026 -0700
Update NMD Escape newsarch entry to match shipped Rule 2 definition. refs #33737
Rule 2 is no longer the 'intronless transcript rule' after the round 4
gate refinement (single coding exon AND no 3'UTR intron). Updated the
newsarch entry to match.
- lines changed 16, context: html, text, full: html, text
395a8efc6994c18a3b0bdfcee82217ff9d78b739 Wed Apr 22 12:54:59 2026 -0700
Expand NMD Escape newsarch rules into sub-bullets. refs #33737
Break the four-rule summary into individual sub-bullets under the
ruleset line so each rule is visible at a glance.
- src/hg/htdocs/inc/hgMyData.html
- lines changed 4, context: html, text, full: html, text
4f03efa12fa7a52cad6b78f24d295ff5d80405c0 Thu Apr 23 12:14:00 2026 -0700
Try to make it more obvious that clicking the 'view' button next to a track file in hubspace connects the whole hub. Add a banner above the table indicating this and with a link that connects the entire hub, refs Max/Baihe email
- src/hg/htdocs/indexNews.html
- lines changed 12, context: html, text, full: html, text
3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737
Script: added a fourth rule to genePredNmdEsc. Coding exons longer than
400 bp (excluding the last coding exon, which is already covered by the
50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and
NCBI RefSeq bigBed files.
trackDb:
- nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode
default visibility to dense so the track is visible in cart-reset
views, changed all four NMDetective subtracks from "visibility full"
to "visibility hide", updated pennantIcon to the Apr. 22, 2026
release date and anchor.
- nmd.html: mention long internal exons in the overview description,
update the rule count from three to four.
- nmdEscTranscripts.html: add the long-exon rule to the rule list and
color legend (gold, #FFD700), expand the Background section with
mechanisms for the intronless, start-proximal, and long-exon rules,
correct the 50 bp rule description to include the entire last coding
exon, fix Lindeboom 2016 author initials (RG -> RGH).
News:
- newsarch.html: add the 2026-04-22 NMD Escape news entry covering all
four rules, with acknowledgements to Guido Neidhardt and Andreas
Lahner for suggesting the track and the Decipher Genome Browser team
for inspiring the visualization.
- indexNews.html: add the front-page news link.
makedoc:
- nmd.txt: dated note for the Rule 4 rebuild.
- src/hg/htdocs/style/hgMyData.css
- lines changed 8, context: html, text, full: html, text
d40ace87860e49440a8ffcd09a4a68a682ad07ec Thu Apr 23 12:57:12 2026 -0700
add nowrap rules to more settings in the hubSpace data table to prevent row heights from growing when a filename is too long for the current window size and forces an adjustment by Data Tables. adjust columns when the window size grows dynamically. this should ensure the view in buttons are always the same size regardless of screen width or table content, refs Max/Baihe email
- src/hg/htdocs/style/makefile
- lines changed 1, context: html, text, full: html, text
7d049679cd9efc4dacc25fc0f181e85d7221d48b Tue Apr 21 12:44:36 2026 -0700
adding liftRequest.js refs #31811
- src/hg/hubApi/liftOver.c
- lines changed 1, context: html, text, full: html, text
908c2b249f2cea98192862b2ce49af1a43e48838 Tue Apr 21 12:37:24 2026 -0700
updated ottoRequest table schema for more general requests refs #31811
- lines changed 6, context: html, text, full: html, text
83f8234095a0b5e6edb8aaaeabb8e78443830ea3 Thu Apr 23 13:48:43 2026 -0700
turn off the email notification from here, let the otto cron job do the email refs #31811
- lines changed 1, context: html, text, full: html, text
b3a76833aa84a341e86d0bacef2ff1f6b9b44851 Thu Apr 23 14:30:24 2026 -0700
rename workflowId column to buildDir and expand its size refs #31811
- lines changed 1, context: html, text, full: html, text
1cbaa9ddcb7a13929a2217db78d6b36ff738fe15 Fri Apr 24 10:52:34 2026 -0700
doneStatus can simply be status refs #31811
- src/hg/inc/hubSpace.h
- lines changed 2, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
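The per-hub flock serialization mentioned above can be sketched as follows. This is an illustration of the general technique only: the function name, the separate ".lock" file, and the stanza layout are hypothetical, and the actual hubSpace code is not Python.

```python
import fcntl


def append_track_stanza(hub_txt_path: str, stanza: str) -> None:
    """Append one stanza to hub.txt while holding an exclusive lock.

    Parallel uploads that call this for the same hub take turns, so the
    file is never written by two uploads at once, whatever order they
    arrive in.
    """
    lock_path = hub_txt_path + ".lock"  # hypothetical lock-file name
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            with open(hub_txt_path, "a") as hub_txt:
                # One blank line after each stanza keeps stanzas separated.
                hub_txt.write(stanza.rstrip("\n") + "\n\n")
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Locking a sidecar file rather than hub.txt itself is one possible design choice; it keeps the lock valid even if hub.txt is replaced by a rename.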
- src/hg/inc/ottoRequest.h
- lines changed 5, context: html, text, full: html, text
435e92d0ad49164d1b5b156e3fb5ef5ff7b19041 Tue Apr 21 12:29:50 2026 -0700
make the ottoRequest table a bit more generic so it can handle both liftOver and assembly requests refs #31811
- lines changed 1, context: html, text, full: html, text
fcfb91470c294d3b991dbd0d24cba67de9ed65cb Tue Apr 21 14:20:57 2026 -0700
create ottoRequest cron watch script refs #31811
- lines changed 2, context: html, text, full: html, text
b3a76833aa84a341e86d0bacef2ff1f6b9b44851 Thu Apr 23 14:30:24 2026 -0700
rename workflowId column to buildDir and expand its size refs #31811
- lines changed 1, context: html, text, full: html, text
1cbaa9ddcb7a13929a2217db78d6b36ff738fe15 Fri Apr 24 10:52:34 2026 -0700
doneStatus can simply be status refs #31811
- src/hg/inc/quickLift.h
- lines changed 7, context: html, text, full: html, text
81d00f3eec6ea6978c9a71ed6a48c84a0bd0c987 Wed Apr 22 14:51:08 2026 -0700
hgFind: remap bigBed search hits from source to destination coords when the track is quickLifted. Previously a search for e.g. "BRCA2" on a quickLifted hub (hg38 tracks displayed on HG02257.pat) returned hits at hg38 chr13 coordinates; clicking the result errored with "Sorry, couldn't locate chr13:... in <dest>". Adds quickLiftLiftPos() in hg/lib/quickLift.c, which reads the source->dest liftOverChainFile and calls liftOverRemapRange. Called from bigBedIntervalListToHgPositions in hg/lib/bigBedFind.c whenever tdb has quickLiftUrl/quickLiftDb; hits that don't map through the chain are dropped. refs #36340
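The remap-or-drop behaviour described in this commit can be sketched with a deliberately simplified model. Here a chain is reduced to a list of ungapped, same-strand blocks, which is an assumption; the real quickLiftLiftPos() and liftOverRemapRange work on full chain files and handle gaps and strand. All names below are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One ungapped aligned block (a simplification of a chain)."""
    src_chrom: str
    src_start: int
    src_end: int
    dst_chrom: str
    dst_start: int


Hit = Tuple[str, int, int]  # (chrom, start, end)


def lift_pos(blocks: List[Block], hit: Hit) -> Optional[Hit]:
    """Lift a source-assembly hit to destination coords, or return None."""
    chrom, start, end = hit
    for b in blocks:
        # Only hits fully contained in one block are lifted in this sketch.
        if b.src_chrom == chrom and b.src_start <= start and end <= b.src_end:
            offset = b.dst_start - b.src_start
            return (b.dst_chrom, start + offset, end + offset)
    return None


def remap_hits(blocks: List[Block], hits: List[Hit]) -> List[Hit]:
    """Keep only the hits that map through the blocks, in destination coords."""
    lifted = (lift_pos(blocks, h) for h in hits)
    return [h for h in lifted if h is not None]
```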
- src/hg/inc/trackHub.h
- lines changed 5, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/inc/userdata.h
- lines changed 31, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 2, context: html, text, full: html, text
ae1db8bd101ae572e4382748dc4b19b176449fa0 Thu Apr 23 09:22:05 2026 -0700
Fix hubSpace quota calculation to use quota x (1024 x 1024 x 1024) rather than quota x 10^9, refs #37425
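The size of the discrepancy this fix addresses is easy to work out. The sketch below assumes the quota is stored as a whole number of gigabyte-sized units; the function and constant names are hypothetical.

```python
GIB = 1024 * 1024 * 1024  # binary multiplier used after the fix
GB_DECIMAL = 10 ** 9      # decimal multiplier used before the fix


def quota_bytes(quota_units: int) -> int:
    """Convert a quota expressed in gigabyte-sized units to bytes (binary)."""
    return quota_units * GIB


def shortfall_bytes(quota_units: int) -> int:
    """Bytes that the decimal multiplier undercounted for the same quota."""
    return quota_units * (GIB - GB_DECIMAL)
```

For a quota of 10 units, the binary form gives 10737418240 bytes, which is 737418240 bytes more than the decimal form.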
- src/hg/js/facetedComposite.js
- lines changed 33, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- lines changed 25, context: html, text, full: html, text
72de5a10dd13d893dd4111ac8acfdc29579760e8 Sat Apr 25 10:18:50 2026 -0700
Fixing a conflict in faceted composites between the use of primaryKey values as
data elements (for subtrack names) and the features around linking out (id|label stuff). refs #36320
- src/hg/js/hgMyData.js
- lines changed 455, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 36, context: html, text, full: html, text
0db69910b2561b37d888bdff69895117eea55175 Thu Apr 23 12:00:09 2026 -0700
Make file names in hubSpace links to view/download the files. Add a copy icon next to each file that copies the url for easy linking to hub.txts, refs Max/Baihe email
- lines changed 34, context: html, text, full: html, text
4f03efa12fa7a52cad6b78f24d295ff5d80405c0 Thu Apr 23 12:14:00 2026 -0700
Try to make it more obvious that clicking the 'view' button next to a track file in hubspace connects the whole hub. Add a banner above the table indicating this and with a link that connects the entire hub, refs Max/Baihe email
- lines changed 24, context: html, text, full: html, text
d40ace87860e49440a8ffcd09a4a68a682ad07ec Thu Apr 23 12:57:12 2026 -0700
add nowrap rules to more settings in the hubSpace data table to prevent row heights from growing when a filename is too long for the current window size and forces an adjustment by Data Tables. adjust columns when the window size grows dynamically. this should ensure the view in buttons are always the same size regardless of screen width or table content, refs Max/Baihe email
- src/hg/js/hgTracks.js
- lines changed 13, context: html, text, full: html, text
d5580b384522b44b2d374446d33304a1adf2e13a Mon Apr 20 14:05:06 2026 -0700
hgTracks: wrap setInHistory calls in try/catch, refs #37367
- src/hg/js/hui.js
- lines changed 33, context: html, text, full: html, text
8e279300b1726f355767b467bc699685e90d487b Tue Apr 21 06:27:16 2026 -0700
composite hgTrackUi: fix btn_minus_all to hide parent, remember last vis, refs #37182
Two fixes around composite visibility on hgTrackUi:
1) The global [-] button (btn_minus_all) unchecked all subtracks but left
the composite visibility dropdown unchanged. The earlier fix in
814472876ea covered only the matrix-corner [-] (_matSetMatrixCheckBoxes);
the global button goes through matSubCBsCheck, which is now updated to
the same pattern.
2) [-] / [+] and individual subCB clicks now remember the prior visibility
via a data-last-viz attribute on the dropdown. Hiding stashes the
current value; re-showing restores it (falling back to pack/dense if
nothing saved) instead of always jumping to pack.
Factored into two helpers in hui.js: hideCompositeSaveVis() and an
extended exposeAll(). subCfg.onUserCbChange (individual subCB/matCB
path) delegates to the same helpers so behavior is uniform.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
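The save-and-restore behaviour in fix 2 can be modelled as a small state machine. This is a language-neutral sketch written in Python, not the hui.js code: the class and method names are hypothetical, and the single "pack" fallback is an assumption, since the commit only says "pack/dense".

```python
from typing import Optional


class CompositeVis:
    """Models the composite visibility dropdown and its saved value."""

    def __init__(self, vis: str = "hide") -> None:
        self.vis = vis
        self.last_vis: Optional[str] = None  # stands in for data-last-viz

    def hide(self) -> None:
        """Hide the composite, stashing the current visibility first."""
        if self.vis != "hide":
            self.last_vis = self.vis  # never overwrite the stash with "hide"
        self.vis = "hide"

    def expose(self, fallback: str = "pack") -> None:
        """Re-show the composite, restoring the stashed visibility if any."""
        if self.vis == "hide":
            self.vis = self.last_vis or fallback
```

A composite set to "full" that is hidden and then re-shown returns to "full" rather than jumping to "pack".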
- src/hg/js/liftRequest.js
- lines changed 26, context: html, text, full: html, text
113937fa86799fcf930f93aade0355a698088982 Tue Apr 21 12:45:53 2026 -0700
fixups to avoid jshint errors refs #31811
- src/hg/js/makefile
- lines changed 1, context: html, text, full: html, text
3e3270b639bae3a1dee7a4a69ffda4a6298c81cf Tue Apr 21 12:44:10 2026 -0700
adding liftRequest.js refs #31811
- src/hg/js/subCfg.js
- lines changed 16, context: html, text, full: html, text
8e279300b1726f355767b467bc699685e90d487b Tue Apr 21 06:27:16 2026 -0700
composite hgTrackUi: fix btn_minus_all to hide parent, remember last vis, refs #37182
Two fixes around composite visibility on hgTrackUi:
1) The global [-] button (btn_minus_all) unchecked all subtracks but left
the composite visibility dropdown unchanged. The earlier fix in
814472876ea covered only the matrix-corner [-] (_matSetMatrixCheckBoxes);
the global button goes through matSubCBsCheck, which is now updated to
the same pattern.
2) [-] / [+] and individual subCB clicks now remember the prior visibility
via a data-last-viz attribute on the dropdown. Hiding stashes the
current value; re-showing restores it (falling back to pack/dense if
nothing saved) instead of always jumping to pack.
Factored into two helpers in hui.js: hideCompositeSaveVis() and an
extended exposeAll(). subCfg.onUserCbChange (individual subCB/matCB
path) delegates to the same helpers so behavior is uniform.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/js/utils.js
- lines changed 1, context: html, text, full: html, text
9776e92bfd35a04bcb3d1bcfc165213f249ed89e Tue Apr 21 12:03:30 2026 -0700
mdbSearch javascript was constructing faulty URLs for hgApi, breaking
the ENCODE track search (and probably file search too), refs #37421
- src/hg/lib/bigBedFind.c
- lines changed 27, context: html, text, full: html, text
81d00f3eec6ea6978c9a71ed6a48c84a0bd0c987 Wed Apr 22 14:51:08 2026 -0700
hgFind: remap bigBed search hits from source to destination coords when the track is quickLifted. Previously a search for e.g. "BRCA2" on a quickLifted hub (hg38 tracks displayed on HG02257.pat) returned hits at hg38 chr13 coordinates; clicking the result errored with "Sorry, couldn't locate chr13:... in <dest>". Adds quickLiftLiftPos() in hg/lib/quickLift.c, which reads the source->dest liftOverChainFile and calls liftOverRemapRange. Called from bigBedIntervalListToHgPositions in hg/lib/bigBedFind.c whenever tdb has quickLiftUrl/quickLiftDb; hits that don't map through the chain are dropped. refs #36340
- src/hg/lib/hubConnect.c
- lines changed 31, context: html, text, full: html, text
6d37e3250000dee12e193601c4cdbae21e4d099b Mon Apr 20 14:59:37 2026 -0700
make assumesHub session portability work for assembly hubs: when the current db is hub_<id>_<genome> and the hub's id differs from the local hubStatus id, remap the db cart variable and any db-keyed cart vars (e.g. position.<db>) along with the hub_<id>_* track settings that were already being renamed. refs #34986
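The renaming described here can be sketched on a plain dictionary standing in for the cart. The function name and the dictionary representation are assumptions; the real logic lives in hubConnect.c and operates on the cart structure.

```python
from typing import Dict


def remap_hub_cart_vars(cart: Dict[str, str], old_id: int, new_id: int) -> Dict[str, str]:
    """Return a copy of the cart with hub_<old_id>_ renamed to hub_<new_id>_.

    Covers three cases from the commit message:
      - track settings named hub_<id>_*
      - db-keyed variables such as position.<db>
      - the value of the db variable itself
    """
    old_prefix = f"hub_{old_id}_"
    new_prefix = f"hub_{new_id}_"
    remapped = {}
    for name, value in cart.items():
        new_name = name.replace(old_prefix, new_prefix)
        # Only the db variable carries the hub id in its value in this sketch.
        new_value = value.replace(old_prefix, new_prefix) if name == "db" else value
        remapped[new_name] = new_value
    return remapped
```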
- src/hg/lib/hubSpace.as
- lines changed 1, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/lib/hubSpace.c
- lines changed 33, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/lib/hubSpace.sql
- lines changed 1, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/lib/hubSpaceQuotas.c
- lines changed 3, context: html, text, full: html, text
ae1db8bd101ae572e4382748dc4b19b176449fa0 Thu Apr 23 09:22:05 2026 -0700
Fix hubSpace quota calculation to use quota x (1024 x 1024 x 1024) rather than quota x 10^9, refs #37425
- src/hg/lib/hui.c
- lines changed 39, context: html, text, full: html, text
5aed2d465f12c3b82dbb6ae78b3bf5529fd0a2ba Mon Apr 20 17:13:12 2026 -0700
hgc/hgTrackUi: populate quickLift source track description when the source assembly is a hub or GenArk. getTrackHtml() had an empty TODO branch for trackHubDatabase(db) || isGenArk(db); wire it up to open the source hub (via the connected hub or genarkUrl), load its trackDb list, find the track by bare name, and call trackHubAddDescription to fill tdb->html. refs #37389
- lines changed 22, context: html, text, full: html, text
66ea6cb4eaf2036e464be55b295765ed5105a0fb Wed Apr 22 09:42:25 2026 -0700
hgTrackUi/hui: render filter UI on supertrack configuration pages
Supertracks (group of tracks with superTrack on, no data of their own)
previously had no way to expose a shared filter: their trackDb stanza
can declare filter.* / filterByRange.* / filterValues.*, but those
settings were never drawn on the supertrack's hgTrackUi page. So users
had to open each subtrack's own configuration page and set the same
filter there, and the "lrSv.filter.svLen" cart namespace went unused.
This change wires that up:
- hgTrackUi.c (superTrackUi): after listing the subtracks, if the
supertrack tdb declares any filter.* settings call scoreCfgUi() to
render the standard filter UI. Cart variables land under the
supertrack's own name (e.g. "lrSv.filter.svLen.min"), and subtracks
already inherit them through cartOptionalStringClosestToHome()
walking tdb->parent. Subtrack-level values continue to override.
- hui.c:
- buildFilterBy() / filterByValues(): tolerate a NULL autoSql object,
so supertracks (which have no data table) don't errAbort when they
declare filterValues.* of virtual aggregated fields. Missing-field
errAbort still fires in the normal subtrack case.
- scoreCfgUi() / cfgByCfgType(): when called with title == NULL (the
supertrack filter path), suppress the default "<p>" separator and
the "<BR>" between the title bar and the filter block; the caller
renders its own section heading.
- asForTdb(): handle conn == NULL by returning NULL rather than
crashing, since supertrack filter rendering has no associated
sqlConnection.
refs #37426
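The inheritance that makes a supertrack-level filter apply to its subtracks can be sketched as a walk up the parent chain. This models the idea behind cartOptionalStringClosestToHome() but is not its implementation; the Track class, the function name, and the subtrack name used in the usage note are hypothetical.

```python
from typing import Dict, Optional


class Track:
    """Minimal stand-in for a trackDb entry with a parent pointer."""

    def __init__(self, name: str, parent: Optional["Track"] = None) -> None:
        self.name = name
        self.parent = parent


def closest_to_home(cart: Dict[str, str], track: Track, suffix: str) -> Optional[str]:
    """Look up <track>.<suffix>, then each ancestor's, nearest first."""
    node: Optional[Track] = track
    while node is not None:
        key = f"{node.name}.{suffix}"
        if key in cart:
            return cart[key]  # a subtrack-level value wins over the parent's
        node = node.parent
    return None
```

With a supertrack lrSv and a hypothetical subtrack lrSvChild, a cart holding only "lrSv.filter.svLen.min" yields that value for the subtrack, while a cart that also holds "lrSvChild.filter.svLen.min" yields the subtrack's own value.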
- src/hg/lib/joiner.c
- lines changed 11, context: html, text, full: html, text
abc896118b1b7aaf66ff3ade939f34edf49ac97a Thu Apr 23 13:56:57 2026 -0700
switch gencode all.joiner to use macros instead of growing every release #37436
- src/hg/lib/ottoRequest.as
- lines changed 4, context: html, text, full: html, text
435e92d0ad49164d1b5b156e3fb5ef5ff7b19041 Tue Apr 21 12:29:50 2026 -0700
make the ottoRequest table a bit more generic so it can handle both liftOver and assembly requests refs #31811
- lines changed 1, context: html, text, full: html, text
fcfb91470c294d3b991dbd0d24cba67de9ed65cb Tue Apr 21 14:20:57 2026 -0700
create ottoRequest cron watch script refs #31811
- lines changed 2, context: html, text, full: html, text
b3a76833aa84a341e86d0bacef2ff1f6b9b44851 Thu Apr 23 14:30:24 2026 -0700
rename workflowId column to buildDir and expand its size refs #31811
- lines changed 1, context: html, text, full: html, text
1cbaa9ddcb7a13929a2217db78d6b36ff738fe15 Fri Apr 24 10:52:34 2026 -0700
doneStatus can simply be status refs #31811
- src/hg/lib/ottoRequest.c
- lines changed 51, context: html, text, full: html, text
435e92d0ad49164d1b5b156e3fb5ef5ff7b19041 Tue Apr 21 12:29:50 2026 -0700
make the ottoRequest table a bit more generic so it can handle both liftOver and assembly requests refs #31811
- lines changed 9, context: html, text, full: html, text
b3a76833aa84a341e86d0bacef2ff1f6b9b44851 Thu Apr 23 14:30:24 2026 -0700
rename workflowId column to buildDir and expand its size refs #31811
- lines changed 8, context: html, text, full: html, text
1cbaa9ddcb7a13929a2217db78d6b36ff738fe15 Fri Apr 24 10:52:34 2026 -0700
doneStatus can simply be status refs #31811
- src/hg/lib/ottoRequest.sql
- lines changed 5, context: html, text, full: html, text
435e92d0ad49164d1b5b156e3fb5ef5ff7b19041 Tue Apr 21 12:29:50 2026 -0700
make the ottoRequest table a bit more generic so it can handle both liftOver and assembly requests refs #31811
- lines changed 1, context: html, text, full: html, text
fcfb91470c294d3b991dbd0d24cba67de9ed65cb Tue Apr 21 14:20:57 2026 -0700
create ottoRequest cron watch script refs #31811
- lines changed 2, context: html, text, full: html, text
b3a76833aa84a341e86d0bacef2ff1f6b9b44851 Thu Apr 23 14:30:24 2026 -0700
rename workflowId column to buildDir and expand its size refs #31811
- lines changed 2, context: html, text, full: html, text
1cbaa9ddcb7a13929a2217db78d6b36ff738fe15 Fri Apr 24 10:52:34 2026 -0700
doneStatus can simply be status refs #31811
- src/hg/lib/quickLift.c
- lines changed 36, context: html, text, full: html, text
81d00f3eec6ea6978c9a71ed6a48c84a0bd0c987 Wed Apr 22 14:51:08 2026 -0700
hgFind: remap bigBed search hits from source to destination coords when the track is quickLifted. Previously a search for e.g. "BRCA2" on a quickLifted hub (hg38 tracks displayed on HG02257.pat) returned hits at hg38 chr13 coordinates; clicking the result errored with "Sorry, couldn't locate chr13:... in <dest>". Adds quickLiftLiftPos() in hg/lib/quickLift.c, which reads the source->dest liftOverChainFile and calls liftOverRemapRange. Called from bigBedIntervalListToHgPositions in hg/lib/bigBedFind.c whenever tdb has quickLiftUrl/quickLiftDb; hits that don't map through the chain are dropped. refs #36340
- src/hg/lib/trackHub.c
- lines changed 19, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 2, context: html, text, full: html, text
8c8823d9a87b6046094577a808dda4ed4172b416 Thu Apr 23 11:09:48 2026 -0700
Make quickLifted GENCODE tracks usable: click details, position links, coloring. refs #36059
- Re-vet wgEncodeGencode* in trackHub.c:isVetted() so the quickLift hub
no longer emits `avoidHandler on` for GENCODE children. Without this,
hgc bypasses findNameBasedHandler and doGencodeGene is never reached,
so the details page falls through to the generic showGenePos output.
- In gencodeClick.c doGencodeGene/helpers, use quickLiftDb for the
MySQL connection (the attrs/tag/pubMed/refSeq/uniProt tables live in
the source assembly), use trackHubSkipHubName(tdb->track) for the
genePred table name and the Basic/Comp/PseudoGene/PolyA prefix
dispatch checks, and teach isGrcHuman/isGrcH37Native to consult the
source db — otherwise the page errAborted with
"BUG: gencodeClick on wrong database: hub_NNNNN_hs1".
- Lift the Position values (transcript + gene bounds, 2-way pseudo,
PolyA) through the quickLift chain before printing, so the details
page shows destination-assembly coords and the Position link lands
at a window that actually contains the gene.
- htcDnaNearGene (hgc.c): when quickLifted, lift seqName/winStart/winEnd
back to source coords before hgSeqItemsInRange. The prior code
pointed the query at the source-assembly table but still passed the
destination window, so the "Genomic Sequence from assembly" flow
returned "No results returned from query." No longer reachable from
the GENCODE details page (doGencodeGene writes its own Sequences
table) but still hit by other quickLifted genePred tracks.
- simpleTracks.c genePredItemClassColor: the hTableExists() guard was
checking `database` (the destination hub db like hub_NNNNN_hs1) even
though the MySQL conn was already routed to liftDb; the lookup was
skipped and every item fell back to tg->ixColor, rendering all
black. Check `db` instead so the gClass_* palette (coding / nonCoding
/ pseudo / problem) is applied.
Verified end-to-end on hgwdev-braney: Gerardo's SHH test case
(ENST00000297261.7) and Basic/Comp/PseudoGene/PolyA subtracks; no
regression on the non-quickLifted hg38 click.
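Several of the fixes above depend on comparing a bare table name such as wgEncodeGencodeBasicV39 against a hub-qualified name of the form hub_NNNNN_<name>. A minimal sketch of that prefix stripping follows. It approximates what trackHubSkipHubName is used for in this commit and is not a port of it; the function name and the example track names are hypothetical.

```python
import re

# hub_<digits>_ followed by the bare name; anything else is left untouched.
_HUB_PREFIX = re.compile(r"^hub_\d+_(.+)$")


def skip_hub_name(name: str) -> str:
    """Strip a leading hub_NNN_ prefix, returning the bare track name."""
    match = _HUB_PREFIX.match(name)
    return match.group(1) if match else name


def is_gencode_track(track: str) -> bool:
    """Prefix check that works for native and hub-qualified track names."""
    return skip_hub_name(track).startswith("wgEncodeGencode")
```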
- lines changed 6, context: html, text, full: html, text
7120a354ec4870b3bc2112c384bdb0af837ed44e Thu Apr 23 14:12:36 2026 -0700
hgc: make quickLifted NCBI RefSeq / UCSC RefSeq click details work. refs #36125
- trackHub.c isVetted(): accept refGene, ncbiRefSeq* subtracks, and ncbiOrtho
so the generated quickLift hub.txt doesn't emit 'avoidHandler on' for them.
Without this, hgc skips findNameBasedHandler and the detail click falls
through to genericClickHandler, showing none of the rich RefSeq fields.
- hgc.c findNameBasedHandler: when the tdb has 'quickLifted on', strip the
hub_NNN_ prefix from the dispatch table before comparisons so refGene /
ncbiRefSeq* match their native handlers.
- doNcbiRefSeq / doRefGene: route hAllocConn to quickLiftDb (source), use
trackHubSkipHubName for track/table name comparisons, and pass srcDb into
replaceInUrl, hTableExists, AceView / MGIid / jaxOrtholog / hg-prefix
checks -- otherwise these paths errAborted with "Unknown database
hub_NNN_<db>" on a quickLifted click.
- printRefSeqInfo / prRefGeneInfo / gbCdnaGetVersion: use sqlGetDatabase(conn)
rather than the global `database` for host-db checks.
- Predicted Protein / Predicted mRNA links for quickLifted tracks now point
at htcTranslatedPredMRna / htcGeneMrna so the sequences come from the
destination genome at the lifted exon coordinates, rather than the
NCBI-authored refPep / seqNcbiRefSeq extFiles on the source. Native
behavior is unchanged.
- showGenePos: fetch via quickLiftSql + calcLiftOverGenePreds when the track
is quickLifted so the Position line is in destination coordinates. The
swapped chainHash comes from the track's quickLiftUrl bigChain, matching
the pattern getGenePredForPositionSql already uses.
- Suppress the mRNA/Genomic Alignments block and its trailing <hr> for
quickLifted refGene / ncbiRefSeq* clicks. The PSLs come back in source
coords and the htcCdnaAli links don't line up with the destination window.
- getAlignmentsTName: use sqlGetDatabase(conn) rather than global `database`
so split-table resolution looks up the right db.
- printGeneCards: take a db argument; callers pass srcDb so the GeneCards
link renders on quickLifted pages (the "startsWith hg" check was matching
hub_NNN_hs1 as false).
- geneReviewsClick.c prGRShortRefGene: take a struct sqlConnection * rather
than opening its own on `database` (hub_NNN_hs1 -> Unknown database abort).
Callers in prRefGeneInfo and doOmimGene2 pass their existing source-db
conn, so the Related GeneReviews line now shows on both native and
quickLifted refGene pages.
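
The prefix handling described in the bullets above can be illustrated with a small Python sketch. This is not the kent C code: `skip_hub_name` and `is_human_db` are hypothetical stand-ins, and the `hub_<digits>_` pattern is an assumption based on names such as hub_NNNNN_hs1 in this message.

```python
import re

# Assumed prefix shape: "hub_", a numeric hub id, then "_".
_HUB_PREFIX = re.compile(r"^hub_\d+_")


def skip_hub_name(name: str) -> str:
    """Return name without a leading hub_<digits>_ prefix, if present."""
    return _HUB_PREFIX.sub("", name, count=1)


def is_human_db(db: str) -> bool:
    """Prefix test of the kind the GeneCards link depended on."""
    return db.startswith("hg")
```

With these helpers, `is_human_db("hub_12345_hs1")` is false while `is_human_db("hg38")` is true, which is why passing the source db rather than the hub db makes the prefix check succeed.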
Verified end-to-end on hgwdev-braney with hg38 -> hs1 quickLift of the
refSeqComposite: ncbiRefSeqCurated and refGene clicks on SHH now render
the full doNcbiRefSeq / doRefGene details (RefSeq / Status / Description
/ Synonyms / OMIM / Protein / HGNC / Entrez Gene / GeneCards / AceView /
Summary / Position / Gene Symbol); Predicted mRNA and Predicted Protein
links return destination-derived sequences; Genomic Sequence reaches the
Get-DNA-in-window page; Ctrl/Cmd+drag zoom no longer ends up with an
HTML error page inside the zoom dialog. Non-quickLifted hg38 refSeq
details are unchanged (alignments still shown, GeneReviews still present).
The duplicate-subtracks symptom from the ticket's step (i) is not
addressed here -- it's an architectural consequence of the shared
`refSeqComposite` cart key applying to both the destination's native
composite and the quickLift hub's composite, and needs cart-scoping
work in quickLift v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/lib/userdata.c
- lines changed 295, context: html, text, full: html, text
404d5bb6d8c0418d5f06535ef470e36c35d2a237 Thu Apr 16 15:57:56 2026 -0700
Add assembly hub support to hubSpace.
Users can upload a .2bit to create an assembly hub, optionally alongside
their own *.hub.txt (prefix names like araTha1.hub.txt are recognized)
and sibling track files. Uploads run in parallel; hub.txt mutations are
serialized per-hub via flock so arrival order does not matter.
- hubSpace table gains a hubType column ('trackHub' or 'assemblyHub');
ON DUPLICATE KEY UPDATE excludes it so a re-upload cannot revert an
upgraded hub.
- writeHubText can now emit an assembly stanza derived from the 2bit;
upgradeHubTxtForAssembly promotes an existing plain hub.txt in place
when a 2bit arrives after tracks.
- pre-finish decides synthesize vs upgrade vs leave-alone from server
state (existing rows, hub.txt on disk) plus a single client flag
(batchHasHubTxt); client-supplied hubType is no longer trusted.
- Client UI adds 2bit as a file type, locks the genome field when the
hub is authoritative (drilled-in or batch hub.txt), defaults new
uploads to an existing assembly hub at top level, and routes
hgTracks URLs through 'genome=' vs 'db=' by hubType.
- Fix pre-existing nested-path bug in hubPathFromParentDir
(*firstSlash = 0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
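
A minimal Python sketch of per-hub serialization with flock, assuming a Unix host. The lock file name `.hub.lock` and the helper names are hypothetical; the real hubSpace code is C and may lock a different file.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def hub_lock(hub_dir):
    """Hold an exclusive advisory lock for one hub directory."""
    lock_path = os.path.join(hub_dir, ".hub.lock")  # hypothetical lock file name
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def append_track_stanza(hub_dir, stanza):
    """Append one stanza to hub.txt while holding the per-hub lock."""
    with hub_lock(hub_dir):
        with open(os.path.join(hub_dir, "hub.txt"), "a") as out:
            out.write(stanza.rstrip("\n") + "\n\n")
```

Because each writer takes the lock before touching hub.txt, parallel uploads can finish in any order without interleaving partial stanzas.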
- src/hg/logCrawl/parseQuickLiftLogs/parseQuickLiftLogs
- lines changed 211, context: html, text, full: html, text
bd2a9f1bb1cd2b0f8881a4ab4e5385455e8e69aa Thu Apr 23 11:26:06 2026 -0700
add parseQuickLiftLogs to kent/src
- src/hg/makeDb/doc/asmHubs/lastzRuns.txt
- lines changed 1428, context: html, text, full: html, text
7d1bd72b73f54dbacbb39445fbb6a1a3b0227433 Tue Apr 21 13:55:18 2026 -0700
Adding liftOver runs for liftOver files a user requested, refs #37380
- src/hg/makeDb/doc/bacteriaAsmHub/bacteria.orderList.tsv
- lines changed 18, context: html, text, full: html, text
597461b5be887ae783de796a596034cc2d946448 Mon Apr 20 16:09:41 2026 -0700
adding assemblies per user requests
- src/hg/makeDb/doc/danRer11/choriCloneEnds.txt
- lines changed 36, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/doc/danRer11/ncbiCloneEndsCH1073.txt
- lines changed 34, context: html, text, full: html, text
8faeb3cba60c7cb842bc17c17a57c9b53ef1b478 Tue Apr 21 02:51:32 2026 -0700
ncbiCloneEndsCH1073: add NCBI CH1073 BAC library clone end placements track on danRer11, refs #35059
210,777 unique-concordant clone-insert placements from NCBI's CH1073
(RZPD-1073 / DanioKey) library clone report. Separate from the existing
bacEndPairsLift (danRer4 -> danRer11 UCSC-BLAT lift), which is left in place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 34, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- src/hg/makeDb/doc/fishAsmHub/fish.orderList.tsv
- lines changed 2, context: html, text, full: html, text
597461b5be887ae783de796a596034cc2d946448 Mon Apr 20 16:09:41 2026 -0700
adding assemblies per user requests
- src/hg/makeDb/doc/hg19.gencode.txt
- lines changed 6, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command that runs the whole import in a single step
- src/hg/makeDb/doc/hg38/abelSv.txt
- lines changed 37, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- src/hg/makeDb/doc/hg38/gencode.txt
- lines changed 13, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command that runs the whole import in a single step
- src/hg/makeDb/doc/hg38/gnomadMpc.txt
- lines changed 46, context: html, text, full: html, text
f30798ae5d11e88e0ab7eb2bcab634e253fd0675 Thu Apr 23 10:36:40 2026 -0700
Add gnomAD MPC v4.1.1 track to hg38.
New composite track under the gnomAD container showing per-variant
MPC (Missense deleteriousness Prediction by Constraint) scores from
gnomAD v4.1.1. Four bigWigs provide per-base scores (one per ALT
nucleotide); a companion bigBed carries the ~250K multi-transcript
variants with a per-transcript breakdown. Included via 'alpha' for
QA review. refs #37434
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/doc/hg38/lrSv.txt
- lines changed 20, context: html, text, full: html, text
5e4ca58df1b5bfe554fe5cc3309a39736ca256ee Tue Apr 21 08:08:52 2026 -0700
cpc1Sv: restrict to the 58 CPC samples, drop HPRC-specific SVs
Rewrite lrSvCpc1VcfToBed.py to identify the 58 CPC sample columns by
name prefix (HIFI032* or RY*), recompute AC/AN/NS from those GT
columns only, and skip any snarl that no CPC sample carries. The
HPRC portion is already represented elsewhere in lrSv, so this keeps
the track population-consistent with its label.
Rebuild results: 46,092 snarl sites on hs1 (down from 97,205 when
combined with HPRC), 36,030 lifted to hg38 (down from 81,261;
10,062 unmapped). Updates cpc1Sv.html, lrSv.ra labels, and the
makeDoc.
refs #36258
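
The sample restriction described above can be sketched in Python as follows. This is an illustration, not lrSvCpc1VcfToBed.py itself: the function names are hypothetical, and AC here counts every non-reference allele together, which is a simplification for multi-allelic sites.

```python
def cpc_columns(sample_names, prefixes=("HIFI032", "RY")):
    """Indices of sample columns whose names start with a CPC prefix."""
    return [i for i, name in enumerate(sample_names) if name.startswith(prefixes)]


def recompute_counts(genotypes, columns):
    """Return (AC, AN, NS) computed over the chosen columns only.

    genotypes: GT strings such as "0|1", "1/1" or "./.", one per sample.
    AC counts non-reference alleles, AN counts called alleles, and NS
    counts samples with at least one called allele.
    """
    ac = an = ns = 0
    for i in columns:
        alleles = [a for a in genotypes[i].replace("|", "/").split("/") if a != "."]
        if alleles:
            ns += 1
        an += len(alleles)
        ac += sum(1 for a in alleles if a != "0")
    return ac, an, ns
```

A site whose recomputed AC is 0 is one that no selected sample carries, which corresponds to the snarls the commit says are skipped.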
- lines changed 3, context: html, text, full: html, text
8a5a466f5e13a020954014cdefc81400072db516 Tue Apr 21 08:29:55 2026 -0700
lrSv: add hprc2 hs1 subtrack using T2T-CHM13 wave VCF, refs #36258
The HPRC release-2 pangenome publishes a wave-decomposed VCF against
both GRCh38 and T2T-CHM13. We already had the GRCh38 version as the
hprc2Sv subtrack on hg38; this adds the parallel T2T-CHM13 build under
/gbdb/hs1/lrSv/hprc2.bb. The existing trackDb stanza (bigDataUrl
/gbdb/$D/lrSv/hprc2.bb) picks it up on hs1 without changes.
1,451,269 SV rows kept (937,425 INS, 360,960 DEL, 147,898 COMPLEX,
4,986 INV) using the existing lrSvHprc2VcfToBed.py converter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 26, context: html, text, full: html, text
06a482a2120d4d85c7c34fb5038213e07f595554 Tue Apr 21 15:00:21 2026 -0700
lrSv: add tommoJpCnv short-read CNV comparator (multiWig)
ToMMo 48KJPN-CNV Frequency Panel: copy-number variation frequencies
from short-read whole-genome sequencing of 48,874 Japanese individuals
(jMorp 20230828 release, GATK CNV germline workflow at 1 kb
resolution). Published as a companion short-read comparator to the
long-read tommoJpSv track.
Rendered as a multiWig container with two bigWig subtracks (transparent
overlay): tommoJpCnvLoss.bw counts samples at CN<2 per bin (red) and
tommoJpCnvGain.bw counts samples at CN>2 per bin (green). Values are
absolute carrier counts out of 48,874. 2,006,905 bins with at least one
CNV carrier; bins that are wholly CN=2 are omitted.
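
The loss/gain counting can be sketched as below. The per-sample copy-number input is an assumption made for illustration (the source panel may ship aggregated frequencies instead), and `carrier_counts` / `bin_rows` are hypothetical names, not functions from the converter script.

```python
def carrier_counts(copy_numbers):
    """Return (loss, gain) carrier counts for one bin."""
    cns = list(copy_numbers)
    loss = sum(1 for cn in cns if cn < 2)  # samples below diploid
    gain = sum(1 for cn in cns if cn > 2)  # samples above diploid
    return loss, gain


def bin_rows(chrom, start, end, copy_numbers):
    """Return (loss_line, gain_line) bedGraph rows, or None for an all-CN=2 bin."""
    loss, gain = carrier_counts(copy_numbers)
    if loss == 0 and gain == 0:
        return None
    return (f"{chrom}\t{start}\t{end}\t{loss}", f"{chrom}\t{start}\t{end}\t{gain}")
```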
Files:
- trackDb/human/lrSv.ra: new tommoJpCnv multiWig container
- trackDb/human/tommoJpCnv.html: new doc page
- trackDb/human/lrSv.html: summary-table row + per-track blurb
- scripts/lrSv/lrSvTommoJpCnvVcfToBedGraph.py: VCF -> two bedGraphs
- doc/hg38/lrSv.txt: wget, converter invocation, bigWig build steps
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 21, context: html, text, full: html, text
1732661494ece5e645a9522f15a0f5922b035d1a Wed Apr 22 08:57:11 2026 -0700
colorsDbSv: rebuild from pbsv+Jasmine source VCFs with richer AS
Rebuild the CoLoRSdb SV bigBeds for hg38 and hs1 from the upstream
pbsv+Jasmine VCFs that the CoLoRSdb project distributes directly.
The previous bigBed stored AF as a string (breaking the numeric
filter) and lacked insLen (causing a "filter on field insLen not in
AS file" error under the supertrack-level filter). The new build:
- stores AF as a float
- adds a derived insLen column (alt-ref length delta for INS, 0
otherwise) so the shared lrSv insLen filter applies
- keeps every INFO field from the source (SVTYPE, SVLEN, END, AC,
AN, NS, AC_Hom, AC_Het, AC_Hemi, AF, HWE, ExcHet, nhomalt) plus
REF/ALT
- uses the canonical svName(TYPE, featLen, AC) label via lrSvCommon
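
A short Python sketch of the two derived fields listed above, assuming sequence-resolved REF/ALT alleles (a symbolic ALT such as <INS> would need separate handling). The function names are hypothetical and not taken from the build script.

```python
def ins_len(svtype, ref, alt):
    """Length delta (ALT minus REF) for insertions, 0 for other SV types."""
    if svtype != "INS":
        return 0
    return len(alt) - len(ref)


def parse_af(value):
    """Parse an INFO AF value as a float so numeric filters can compare it."""
    return float(value)
```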
Record counts match the source VCFs: 426,239 on hg38 (59 MB) and
839,714 on hs1 (87 MB). /gbdb symlinks unchanged. The trackDb
colorsDbSv stanza is updated to reference the new AS field names
(acHom/acHet/acHemi, AF, AN) and to add the insLen filter. Also
fixes a nearby `version 1.1` -> `dataVersion 1.1` typo in
lrSv1kgOnt that was failing the tagTypes check.
refs #36258
- lines changed 48, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
- src/hg/makeDb/doc/hg38/mpra.txt
- lines changed 93, context: html, text, full: html, text
888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
QA fixes for MPRA superTrack. refs #37359
Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb
but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing
hgTrackDb -strict to silently drop the subtrack.
Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8
in user-visible string fields (curly quotes, primes, NBSP mojibake) that
the browser does not transcode, eliminating ~246k non-ASCII occurrences
across 42% of rows; and change safe_float / pval_to_score to write NaN
and return score 0 for NA / out-of-range p-values instead of 0.0 and
score 1000 (previously inflated untested variants to the top of
score-sorted views).
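
The NA handling described above might look like the following sketch. Only the behavior stated in the message is taken from the source (NaN for unparseable values, score 0 for NA or out-of-range p-values); the -log10(p) scaling and the treatment of p = 0 as out of range are placeholder assumptions, not the formula in mpravardbToBed.py.

```python
import math


def safe_float(text):
    """Parse a float; return NaN for NA, empty, or unparseable values."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan


def pval_to_score(pval, scale=100.0):
    """Map a p-value to a 0-1000 BED score; 0 when the p-value is unusable.

    The -log10(p) * scale mapping is a placeholder, not the script's formula.
    """
    if math.isnan(pval) or not (0.0 < pval <= 1.0):
        return 0
    return min(1000, int(round(-math.log10(pval) * scale)))
```

With this shape, an untested variant gets score 0 and sorts to the bottom of a score-sorted view instead of the top.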
trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous
type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant
mouseOverField, align parent mpra on, add filterValues for
cell_line/assay/cellLine and filterByRange sliders for percentile_rank /
fdr / log2FC, add labelFields and maxWindowToDraw.
Description pages: add cross-species disclosure (mouse reporter cells
used to assay human sequences), update mpraVarDb header to post-liftOver
count 239,028 with Studies-table footnote, fix mpraVarDb.html
download-server paths, soften imprecise "51 MPRA experiments" claim in
mpra.html and mprabase.html.
relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.
Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
- src/hg/makeDb/doc/hg38/nmd.txt
- lines changed 3, context: html, text, full: html, text
3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737
Script: added a fourth rule to genePredNmdEsc. Coding exons longer than
400 bp (excluding the last coding exon, which is already covered by the
50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and
NCBI RefSeq bigBed files.
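
As a rough illustration of the long-exon rule as stated here, a Python sketch under these assumptions: exon blocks are already clipped to the CDS, coordinates are 0-based half-open, and `long_exon_regions` is a hypothetical name rather than a function in genePredNmdEsc.

```python
def long_exon_regions(cds_exons, strand, min_len=400):
    """Return coding-exon blocks longer than min_len, excluding the last one.

    cds_exons: list of (start, end) CDS-clipped blocks in genomic order.
    "Last" means 3'-most in transcript orientation: the final block on the
    plus strand, the first block on the minus strand.
    """
    if not cds_exons:
        return []
    last = cds_exons[-1] if strand == "+" else cds_exons[0]
    return [(s, e) for (s, e) in cds_exons if (s, e) != last and e - s > min_len]
```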
trackDb:
- nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode
default visibility to dense so the track is visible in cart-reset
views, changed all four NMDetective subtracks from "visibility full"
to "visibility hide", updated pennantIcon to the Apr. 22, 2026
release date and anchor.
- nmd.html: mention long internal exons in the overview description,
update the rule count from three to four.
- nmdEscTranscripts.html: add the long-exon rule to the rule list and
color legend (gold, #FFD700), expand the Background section with
mechanisms for the intronless, start-proximal, and long-exon rules,
correct the 50 bp rule description to include the entire last coding
exon, fix Lindeboom 2016 author initials (RG -> RGH).
News:
- newsarch.html: add the 2026-04-22 NMD Escape news entry covering all
four rules, with acknowledgements to Guido Neidhardt and Andreas
Lahner for suggesting the track and the Decipher Genome Browser team
for inspiring the visualization.
- indexNews.html: add the front-page news link.
makedoc:
- nmd.txt: dated note for the Rule 4 rebuild.
- lines changed 22, context: html, text, full: html, text
4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737
Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".
Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated
and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt
both tracks.
Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.
Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.
QA cleanups: non-ASCII prime char replaced with the &prime; entity, mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.
Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 14, context: html, text, full: html, text
34d2eee845f5f45e571d1e153c632683b8a93f75 Tue Apr 21 16:17:53 2026 -0700
Refine NMD Escape Rule 2 gate to "single coding exon and no 3'UTR intron". refs #33737
Previously Rule 2 required exonCount==1 (truly intronless). This
overcorrected for single-CDS-exon transcripts whose only introns are in
the 5'UTR: biologically these have no EJC downstream of the stop codon
(5'UTR EJCs are cleared by the scanning 40S or sit upstream of the
terminating ribosome) and are NMD-immune, but the code pushed them to
Rules 1/3 under a less accurate "last coding exon" label.
New gate: len(cdsExons) == 1 AND no exon-exon junction strictly
downstream of the stop codon (strand-aware). Transcripts with a single
coding exon but a 3'UTR intron correctly stay in Rules 1/3 because that
intron deposits an EJC that can trigger NMD.
3,113 RefSeq Curated and 10,790 Gencode V49 transcripts move into Rule
2. 140 RefSeq and 1,135 Gencode single-CDS-exon transcripts with 3'UTR
introns correctly remain in Rules 1/3. Description page and makedoc
updated.
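
The refined gate can be sketched in Python as follows. This is a sketch, not the code in genePredNmdEsc: coordinates are assumed 0-based half-open, `is_rule2` is a hypothetical name, and "junction downstream of the stop" is approximated as "some exon lies wholly beyond the CDS on the 3' side".

```python
def is_rule2(exons, cds_start, cds_end, strand):
    """True for one coding exon with no exon-exon junction past the stop codon.

    exons: list of (start, end) in genomic order.
    cds_start, cds_end: genomic CDS bounds with cds_start < cds_end.
    """
    cds_exons = [(max(s, cds_start), min(e, cds_end))
                 for s, e in exons if s < cds_end and e > cds_start]
    if len(cds_exons) != 1:
        return False
    if strand == "+":
        # Plus strand: an exon starting at or after cds_end implies a 3'UTR intron.
        return not any(s >= cds_end for s, _ in exons)
    # Minus strand: downstream means lower genomic coordinates.
    return not any(e <= cds_start for _, e in exons)
```

A transcript whose only intron is in the 5'UTR passes the gate, while one with a 3'UTR intron does not, matching the two cases the message distinguishes.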
- lines changed 51, context: html, text, full: html, text
0151d00a4a1d73a78c35f6158c6c936ff338faeb Fri Apr 24 10:37:34 2026 -0700
NMD Escape: MANE subtrack, Rule 1 bug fix, transcript filter. refs #33737
- Add nmdEscMane subtrack (MANE Select Plus Clinical 1.5), built from
/gbdb/hg38/mane/mane.bb. Reuses nmdEscTranscripts.html.
- Fix Rule 1: measure 50 bp upstream of the transcript's last splice
junction (including 3'UTR introns) rather than stripping 3'UTR from
the exon list first. The old logic painted the entire last CDS exon
as NMD-escape whenever the transcript had only one CDS exon, even
when a 3'UTR intron sat far past the stop codon (e.g. NBDY: 207 bp
of CDS over-painted for a junction 2.6 kb past the stop).
- Add --rule1-mode {cds,mrna} (default cds): cds counts only CDS bp
on the walk-back (paints up to 50 bp of CDS matching the rule label
literally); mrna counts mRNA bp and clips to CDS (tracks the 55 bp
rule literature). Documented in makeDoc.
- Rule 4: when a 3'UTR intron exists, the last CDS-containing exon
has a downstream EJC and is now eligible for the long-exon rule.
- Mouseover lists contributing transcript accessions when 1-3 items
collapse into a region; falls back to a count above that.
- Add filterText/filterType/filterLabel on all three escape subtracks
so a user can narrow the display to one transcript.
- genePredNmdEsc: --gene-sym-field (default 17 for Gencode; pass 18
for MANE, whose HGNC symbol lives in bigGenePred geneName2).
- Add findShortTxLongUtrIntron.py helper for finding MANE transcripts
with long UTR introns (used to pick NMD edge-case test cases).
Post-fix collapsed-region counts (--rule1-mode=cds):
MANE 1.5: 67,752
Gencode V49: 233,375
RefSeq Curated: 112,356
- lines changed 12, context: html, text, full: html, text
3a62ea7e9a8cb3503586a0a78570331308c9bc58 Mon Apr 27 02:23:00 2026 -0700
NMD Escape MANE: expose NM_ accession via labelFields. refs #33737
Per QA, the MANE subtrack now shows the NCBI RefSeq accession by default
instead of the HGNC gene symbol, with the ENST and gene symbol still
selectable via labelFields.
- genePredNmdEsc: new --ncbi-id-field N option (default -1 = unused).
When set, the named bigGenePred column is captured per-transcript and
written into a new ncbiIds output column. For MANE pass 21.
- genePredNmdEsc: new --no-collapse option. By default, regions with
identical (chrom, start, end, rule) from multiple transcripts collapse
into one row with comma-separated lists. With --no-collapse the script
emits one row per (transcript, region). Used for MANE so each
label-field column holds a single value: the 74 MANE Plus Clinical
genes (e.g. LMNA) get two rows per region instead of one row with a
two-element list.
- nmdEscCollapsed.as: add lstring ncbiIds column. Schema is now bed9+3.
- nmd.ra (nmdEscMane only): labelFields ncbiIds,name,transcripts;
defaultLabelFields ncbiIds; labelSeparator " / ". Gencode and RefSeq
subtracks unchanged - they default to the gene symbol (name column)
and have an empty ncbiIds column.
- doc/hg38/nmd.txt: bump all three bedToBigBed invocations to bed9+3
and document the --ncbi-id-field 21 + --no-collapse invocation for
MANE.
Counts: MANE 68,028 (--no-collapse); Gencode 233,375; RefSeq 112,356.
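
The difference between the default collapse and --no-collapse can be shown with a small Python sketch. The tuple layout and the name `emit_rows` are illustrative assumptions, not the script's internal structures.

```python
from collections import defaultdict


def emit_rows(regions, collapse=True):
    """Return output rows for (chrom, start, end, rule, transcript) tuples.

    With collapse=True, identical (chrom, start, end, rule) keys merge into
    one row with a comma-separated transcript list; otherwise each input
    tuple yields its own row.
    """
    regions = list(regions)
    if not collapse:
        return regions
    grouped = defaultdict(list)
    for chrom, start, end, rule, tx in regions:
        grouped[(chrom, start, end, rule)].append(tx)
    return [key + (",".join(txs),) for key, txs in grouped.items()]
```

With two transcripts sharing one region, the collapsed form yields a single row listing both, and the uncollapsed form yields two rows, each holding a single value per label-field column.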
- src/hg/makeDb/doc/hg38/promoterAi.txt
- lines changed 19, context: html, text, full: html, text
f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
QA fixes for PromoterAI track. refs #37278
Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D
paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID
40440429), corrected the score-direction wording (negative = under-expression,
positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access
source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover
blurb to match mouseOverFunction noAverage behavior.
Converter and AS: the overlap bigBed now carries the real per-transcript strand
from the source TSV (was hardcoded '+'), with a new strands column in the AS, and
the name field concatenates unique gene symbols so bidirectional-promoter items
read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is
meaningful. Rewrote the converter to stream (sorted input), which drops peak
memory from ~40 GB to a few MB.
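
A sketch of the score and name derivations described in this paragraph. The rounding choice and the first-seen ordering of gene symbols are assumptions; `bed_score` and `join_unique` are hypothetical names, not functions from the converter.

```python
def bed_score(promoter_ai):
    """Scale |score| from the -1..+1 range to an integer BED score in 0..1000."""
    return min(1000, int(round(abs(promoter_ai) * 1000)))


def join_unique(symbols):
    """Comma-join gene symbols, dropping repeats and keeping first-seen order."""
    return ",".join(dict.fromkeys(symbols))
```

Taking the absolute value means a strongly negative score (under-expression) and a strongly positive one (over-expression) both pass a high scoreFilter threshold.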
trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable
without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the
bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw
red (over-expression) above zero and blue (under-expression) below, matching
the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap
subtrack.
Makedoc updated to describe the streaming pipeline, the new strands column,
and the rebuild workflow.
- lines changed 11, context: html, text, full: html, text
6c567fd9a03e87610681a43d2183ebb43547d1ad Fri Apr 24 17:58:57 2026 -0700
PromoterAI: review followups. refs #37278
Move /gbdb/hg38/promoterAi/ to /gbdb/hg38/_promoterAi/ to match the
underscore-prefix exclusion rule for hgdownload sync (same pattern as
PrimateAI-3D under refs #37274). bigDataUrls and the makedoc updated.
Bump bigWig maxHeightPixels from 128:20:8 to 128:40:8 -- the peer-track
default of 20 is too cramped for a signed -1..+1 score.
Description page: drop the wrong primateai3d.basespace.illumina.com link
in Data Access; PromoterAI is not on BaseSpace, it's distributed via the
license agreement on the GitHub page (a download link is emailed after
submission). Reword Data Access and Methods accordingly.
Description page: add Illumina's recommended interpretation thresholds
(|score| >= 0.1, >= 0.2, >= 0.5) from the PromoterAI GitHub README, with
a note that higher cutoffs select smaller, higher-confidence sets.
- src/hg/makeDb/doc/hg38/srSv.txt
- lines changed 99, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
- src/hg/makeDb/doc/hg38/varFreqs.txt
- lines changed 25, context: html, text, full: html, text
366afa4a74c46ec6fb2b667a2902a873feec40cf Mon Apr 20 23:00:05 2026 -0700
varFreqsAll: rebuild combined bigBed to include GA4K and CoLoRSdb
Regenerate the All Databases Combined track with the two long-read
PacBio subtracks (GA4K 552 samples and CoLoRSdb v1.2.0 1,027 samples)
that were added to varFreqs since the March build. Source count rises
from 21 to 23 databases; final bigBed is 37.7 GB with 1.17B records
and 113 fields. Updates varFreqs.ra filterValues.sources and per-
database AF/AC filters for the two new sources, and databases.tsv
+ varFreqs.txt (build notes).
refs #36642
- lines changed 23, context: html, text, full: html, text
695f40f9d6139a4df393522c067f1702aff8d3bd Wed Apr 22 03:13:39 2026 -0700
varFreqs: add SVatalog 101 short-read SNV frequencies subtrack
SNV/indel allele frequencies from the 101-sample GWAS SVatalog cohort
(Chirmade et al. 2026, Heredity, PMID 41203876), called from 10X
Genomics linked short-read WGS with GATK HaplotypeCaller v4.0.0.0 and
phased with SHAPEIT v4.2.0. Sibling of the lrSv chirmade101Sv
structural-variant track, which is built from the same 101 samples.
8,814,835 autosomal + chrX sites. Source release ships only AF; AC and
AN are synthesized in the emitted VCF as AC=round(AF*202) and AN=202
(2*101 diploid), with the gnomAD v3.1 non-Finnish European AF and dbSNP
rsID passed through as GNOMAD_NFE_AF and RSID info fields. VCF is
bgzipped + tabix-indexed (172 MB + 1.6 MB .tbi).
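
The AC/AN synthesis stated above amounts to the following arithmetic, sketched in Python. The function name is hypothetical; note that Python's round() rounds exact halves to the nearest even integer, which may differ from the rounding used in svatalogFreqToVcf.py.

```python
def synthesize_ac_an(af, n_samples=101):
    """Return (AC, AN) from an allele frequency, assuming diploid samples."""
    an = 2 * n_samples           # 202 for the 101-sample cohort
    return round(af * an), an    # AC = round(AF * 202)
```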
Files:
- scripts/varFreqs/svatalogFreqToVcf.py (new): per-chrom allele-freq
TSV -> single VCF with hg38 ##contig header
- trackDb/human/varFreqs.ra: new svatalogSnv vcfTabix subtrack
- trackDb/human/svatalogSnv.html (new): doc page
- trackDb/human/varFreqs.html: new row in Available Datasets table
- doc/hg38/varFreqs.txt: wget-free build block (input files were
downloaded manually from Zenodo 13367574)
Note: the All Databases Combined varFreqs bigBed has NOT been rebuilt
to include this new source yet; a subsequent merge pass will add it.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
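Illustrative note on the AC/AN synthesis described above: a minimal sketch, assuming the conversion works as the commit message states (AN = 2 * 101, AC = round(AF * AN)). The function name and defaults are hypothetical and are not taken from svatalogFreqToVcf.py. Python's built-in round() rounds exact halves to the nearest even integer; whether the real script does the same is not stated.

```python
def synthesize_ac_an(af: float, n_samples: int = 101, ploidy: int = 2) -> tuple[int, int]:
    """Derive allele count and allele number from an allele frequency.

    Hypothetical helper: the source release ships only AF, so AN is fixed
    at ploidy * n_samples and AC is AF scaled to that total.
    """
    an = ploidy * n_samples   # 2 * 101 = 202 for the SVatalog cohort
    ac = round(af * an)       # built-in round(): exact halves go to the even integer
    return ac, an


if __name__ == "__main__":
    for af in (0.0, 0.5, 1.0):
        print(af, synthesize_ac_an(af))
```

Because AC is reconstructed from a rounded frequency, it should be read as an estimate rather than an observed count.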
- src/hg/makeDb/doc/hs1/lrSv.txt
- lines changed 30, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
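Illustrative note on the "explodes multi-allelic rows, filters to SV-sized or INV" step: a minimal sketch, not the code in lrSvHprc2VcfToBed.py. The 50 bp threshold, the function name, and the per-allele inversion flags are assumptions; the commit message does not say how the script defines "SV-sized" or how it detects inversions.

```python
MIN_SV_LEN = 50  # assumption: a commonly used SV size cutoff, not confirmed by the commit


def explode_and_filter(ref, alts, inv_flags, min_sv_len=MIN_SV_LEN):
    """Split one multi-allelic record into per-ALT rows and keep SV-sized or INV ones.

    ref       -- REF sequence of the VCF row
    alts      -- list of ALT sequences (one per alternate allele)
    inv_flags -- list of booleans, True where the allele is annotated as an inversion
    Returns a list of (alt, length_delta, is_inv) tuples for the alleles kept.
    """
    kept = []
    for alt, is_inv in zip(alts, inv_flags):
        delta = len(alt) - len(ref)  # positive for insertions, negative for deletions
        if is_inv or abs(delta) >= min_sv_len:
            kept.append((alt, delta, is_inv))
    return kept


if __name__ == "__main__":
    long_ins = "A" + "C" * 60
    rows = explode_and_filter("A", [long_ins, "AT", "T"], [False, False, True])
    for alt, delta, is_inv in rows:
        print(len(alt), delta, is_inv)
```

Inversions are kept regardless of length because they typically change no sequence length, so a size filter alone would drop them.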
- lines changed 21, context: html, text, full: html, text
8a5a466f5e13a020954014cdefc81400072db516 Tue Apr 21 08:29:55 2026 -0700
lrSv: add hprc2 hs1 subtrack using T2T-CHM13 wave VCF, refs #36258
The HPRC release-2 pangenome publishes a wave-decomposed VCF against
both GRCh38 and T2T-CHM13. We already had the GRCh38 version as the
hprc2Sv subtrack on hg38; this adds the parallel T2T-CHM13 build under
/gbdb/hs1/lrSv/hprc2.bb. The existing trackDb stanza (bigDataUrl
/gbdb/$D/lrSv/hprc2.bb) picks it up on hs1 without changes.
1,451,269 SV rows kept (937,425 INS, 360,960 DEL, 147,898 COMPLEX,
4,986 INV) using the existing lrSvHprc2VcfToBed.py converter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/doc/invertebrateAsmHub/invertebrate.orderList.tsv
- lines changed 1, context: html, text, full: html, text
597461b5be887ae783de796a596034cc2d946448 Mon Apr 20 16:09:41 2026 -0700
adding assemblies per user requests
- src/hg/makeDb/doc/mm39/gencode.txt
- lines changed 6, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/doc/plantsAsmHub/plants.orderList.tsv
- lines changed 1, context: html, text, full: html, text
1a9024a64f170308fb0d1adcb548399f960e46c2 Mon Apr 20 23:51:10 2026 -0700
NCBI changed the name of GCA_030272585.1
- src/hg/makeDb/doc/vgp577way/577.acc.sciName.comName.clade.tsv
- lines changed 577, context: html, text, full: html, text
e8b16ae601a09359f4b4fa543d365f3d02bd6e6d Wed Apr 22 13:25:09 2026 -0700
listing of assemblies with names and clade groupings for this 577-way alignment refs #34370
- src/hg/makeDb/doc/vgp577way/linkSizes.sh
- lines changed 54, context: html, text, full: html, text
633cc56e1b1043b0ab6e1fd0e9b90154e2c03c0c Mon Apr 20 13:00:51 2026 -0700
document procedure for import of VGP 577-way maf result refs #34370
- src/hg/makeDb/doc/vgp577way/mafCoverage.pl
- lines changed 185, context: html, text, full: html, text
633cc56e1b1043b0ab6e1fd0e9b90154e2c03c0c Mon Apr 20 13:00:51 2026 -0700
document procedure for import of VGP 577-way maf result refs #34370
- src/hg/makeDb/doc/vgp577way/mkIRowsJL.sh
- lines changed 17, context: html, text, full: html, text
633cc56e1b1043b0ab6e1fd0e9b90154e2c03c0c Mon Apr 20 13:00:51 2026 -0700
document procedure for import of VGP 577-way maf result refs #34370
- src/hg/makeDb/doc/vgp577way/mkNbeds.sh
- lines changed 59, context: html, text, full: html, text
633cc56e1b1043b0ab6e1fd0e9b90154e2c03c0c Mon Apr 20 13:00:51 2026 -0700
document procedure for import of VGP 577-way maf result refs #34370
- src/hg/makeDb/doc/vgp577way/queryCounts.sh
- lines changed 30, context: html, text, full: html, text
633cc56e1b1043b0ab6e1fd0e9b90154e2c03c0c Mon Apr 20 13:00:51 2026 -0700
document procedure for import of VGP 577-way maf result refs #34370
- src/hg/makeDb/doc/vgp577way/sumBlocks.py
- lines changed 29, context: html, text, full: html, text
633cc56e1b1043b0ab6e1fd0e9b90154e2c03c0c Mon Apr 20 13:00:51 2026 -0700
document procedure for import of VGP 577-way maf result refs #34370
- src/hg/makeDb/doc/vgp577way/vgp577way.txt
- lines changed 366, context: html, text, full: html, text
633cc56e1b1043b0ab6e1fd0e9b90154e2c03c0c Mon Apr 20 13:00:51 2026 -0700
document procedure for import of VGP 577-way maf result refs #34370
- lines changed 185, context: html, text, full: html, text
a83601f74d58302315fa0e3c151ff29df6637afb Wed Apr 22 14:39:15 2026 -0700
more general procedure refs #34370
- src/hg/makeDb/outside/gencode/bin/buildGencodeToUcscLift
- lines changed 1, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodeBackMapMetadataIds
- lines changed 2, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodeBuildRelease
- lines changed 274, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodeExonSupportToTable
- lines changed 2, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodeGenerateTrackDbs
- lines changed 128, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodeGxfToAttrs
- lines changed 2, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodeGxfToGenePred
- lines changed 2, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodeJoinerCheck
- lines changed 28, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodeMakeAttrs
- lines changed 2, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodeMakeTracks
- lines changed 2, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/bin/gencodePolyaGxfToGenePred
- lines changed 2, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/gencodeLoad.mk
- lines changed 80, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/outside/gencode/lib/gencode/gencodeTags.py
- lines changed 3, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/schema/all.joiner
- lines changed 8257, context: html, text, full: html, text
abc896118b1b7aaf66ff3ade939f34edf49ac97a Thu Apr 23 13:56:57 2026 -0700
switch gencode all.joiner to use macros instead of growing every release #37436
- lines changed 23, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, MV39, and V50lift37; added a command to do import with a single command
- src/hg/makeDb/scripts/choriCloneEnds/cloneEnds.as
- lines changed 2, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/choriCloneEnds/makeBed.py
- lines changed 0, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/choriCloneEnds/refSeqNames.py
- lines changed 0, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/gnomadMpc/gnomadMpc.as
- lines changed 19, context: html, text, full: html, text
f30798ae5d11e88e0ab7eb2bcab634e253fd0675 Thu Apr 23 10:36:40 2026 -0700
Add gnomAD MPC v4.1.1 track to hg38.
New composite track under the gnomAD container showing per-variant
MPC (Missense deleteriousness Prediction by Constraint) scores from
gnomAD v4.1.1. Four bigWigs provide per-base scores (one per ALT
nucleotide); a companion bigBed carries the ~250K multi-transcript
variants with a per-transcript breakdown. Included via 'alpha' for
QA review. refs #37434
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/gnomadMpc/gnomadMpcToBed.py
- lines changed 97, context: html, text, full: html, text
f30798ae5d11e88e0ab7eb2bcab634e253fd0675 Thu Apr 23 10:36:40 2026 -0700
Add gnomAD MPC v4.1.1 track to hg38.
New composite track under the gnomAD container showing per-variant
MPC (Missense deleteriousness Prediction by Constraint) scores from
gnomAD v4.1.1. Four bigWigs provide per-base scores (one per ALT
nucleotide); a companion bigBed carries the ~250K multi-transcript
variants with a per-transcript breakdown. Included via 'alpha' for
QA review. refs #37434
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/gnomadMpc/gnomadMpcToWig.py
- lines changed 86, context: html, text, full: html, text
f30798ae5d11e88e0ab7eb2bcab634e253fd0675 Thu Apr 23 10:36:40 2026 -0700
Add gnomAD MPC v4.1.1 track to hg38.
New composite track under the gnomAD container showing per-variant
MPC (Missense deleteriousness Prediction by Constraint) scores from
gnomAD v4.1.1. Four bigWigs provide per-base scores (one per ALT
nucleotide); a companion bigBed carries the ~250K multi-transcript
variants with a per-transcript breakdown. Included via 'alpha' for
QA review. refs #37434
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/lrSv/lrSv.as
- lines changed 3, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
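Illustrative note on the "CODE|CODE (Long name)" filterValues syntax: a minimal sketch that builds the comma-separated value string from a code-to-name mapping. Only the DEL and INS labels come from the commit message; the helper name is hypothetical, and any other SV codes would follow the same pattern.

```python
def filter_values(labels: dict[str, str]) -> str:
    """Build a filterValues string in the 'CODE|CODE (Long name)' form.

    The part before '|' is the value stored in the data; the part after it
    is the label shown to the user, with the short code kept up front.
    """
    return ",".join(f"{code}|{code} ({name})" for code, name in labels.items())


if __name__ == "__main__":
    sv_labels = {"DEL": "Deletion", "INS": "Insertion"}
    print("filterValues.svType " + filter_values(sv_labels))
```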
- src/hg/makeDb/scripts/lrSv/lrSv1kg3202Sr.as
- lines changed 30, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/lrSv/lrSv1kg3202SrVcfToBed.py
- lines changed 126, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/lrSv/lrSv1kgOnt.as
- lines changed 5, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSv1kgOntVcfToBed.py
- lines changed 42, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvAou1k.as
- lines changed 3, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvAou1kCsvToBed.py
- lines changed 30, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvApr.as
- lines changed 5, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvAprVcfToBed.py
- lines changed 28, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvChirmade101.as
- lines changed 4, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvChirmade101TsvToBed.py
- lines changed 27, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvColorsDbSv.as
- lines changed 28, context: html, text, full: html, text
1732661494ece5e645a9522f15a0f5922b035d1a Wed Apr 22 08:57:11 2026 -0700
colorsDbSv: rebuild from pbsv+Jasmine source VCFs with richer AS
Rebuild the CoLoRSdb SV bigBeds for hg38 and hs1 from the upstream
pbsv+Jasmine VCFs that the CoLoRSdb project distributes directly.
The previous bigBed stored AF as a string (breaking the numeric
filter) and lacked insLen (causing a "filter on field insLen not in
AS file" error under the supertrack-level filter). The new build:
- stores AF as a float
- adds a derived insLen column (alt-ref length delta for INS, 0
otherwise) so the shared lrSv insLen filter applies
- keeps every INFO field from the source (SVTYPE, SVLEN, END, AC,
AN, NS, AC_Hom, AC_Het, AC_Hemi, AF, HWE, ExcHet, nhomalt) plus
REF/ALT
- uses the canonical svName(TYPE, featLen, AC) label via lrSvCommon
Record counts match the source VCFs: 426,239 on hg38 (59 MB) and
839,714 on hs1 (87 MB). /gbdb symlinks unchanged. The trackDb
colorsDbSv stanza is updated to reference the new AS field names
(acHom/acHet/acHemi, AF, AN) and to add the insLen filter. Also
fixes a nearby `version 1.1` -> `dataVersion 1.1` typo in
lrSv1kgOnt that was failing the tagTypes check.
refs #36258
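Illustrative note on the derived insLen column and the float AF: a minimal sketch of the rule stated above (ALT-minus-REF length for INS, 0 otherwise). Function names are hypothetical and not taken from lrSvColorsDbSvVcfToBed.py, and the sketch assumes ALT is an explicit sequence rather than a symbolic allele such as <INS>.

```python
def ins_len(sv_type: str, ref: str, alt: str) -> int:
    """Return the inserted length for INS records and 0 for every other SV type."""
    if sv_type != "INS":
        return 0
    return len(alt) - len(ref)


def parse_af(raw: str) -> float:
    """Convert the INFO AF text to a float so numeric filters can compare it."""
    return float(raw)


if __name__ == "__main__":
    print(ins_len("INS", "A", "ACGT"))   # insertion of three bases
    print(ins_len("DEL", "ACGT", "A"))   # non-INS types always yield 0
    print(parse_af("0.0125"))
```

Emitting insLen for every record, even as 0, is what lets a filter defined once at the supertrack level apply to this subtrack without the "field not in AS file" error.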
- src/hg/makeDb/scripts/lrSv/lrSvColorsDbSvBuild.sh
- lines changed 37, context: html, text, full: html, text
1732661494ece5e645a9522f15a0f5922b035d1a Wed Apr 22 08:57:11 2026 -0700
colorsDbSv: rebuild from pbsv+Jasmine source VCFs with richer AS
Rebuild the CoLoRSdb SV bigBeds for hg38 and hs1 from the upstream
pbsv+Jasmine VCFs that the CoLoRSdb project distributes directly.
The previous bigBed stored AF as a string (breaking the numeric
filter) and lacked insLen (causing a "filter on field insLen not in
AS file" error under the supertrack-level filter). The new build:
- stores AF as a float
- adds a derived insLen column (alt-ref length delta for INS, 0
otherwise) so the shared lrSv insLen filter applies
- keeps every INFO field from the source (SVTYPE, SVLEN, END, AC,
AN, NS, AC_Hom, AC_Het, AC_Hemi, AF, HWE, ExcHet, nhomalt) plus
REF/ALT
- uses the canonical svName(TYPE, featLen, AC) label via lrSvCommon
Record counts match the source VCFs: 426,239 on hg38 (59 MB) and
839,714 on hs1 (87 MB). /gbdb symlinks unchanged. The trackDb
colorsDbSv stanza is updated to reference the new AS field names
(acHom/acHet/acHemi, AF, AN) and to add the insLen filter. Also
fixes a nearby `version 1.1` -> `dataVersion 1.1` typo in
lrSv1kgOnt that was failing the tagTypes check.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvColorsDbSvVcfToBed.py
- lines changed 174, context: html, text, full: html, text
1732661494ece5e645a9522f15a0f5922b035d1a Wed Apr 22 08:57:11 2026 -0700
colorsDbSv: rebuild from pbsv+Jasmine source VCFs with richer AS
Rebuild the CoLoRSdb SV bigBeds for hg38 and hs1 from the upstream
pbsv+Jasmine VCFs that the CoLoRSdb project distributes directly.
The previous bigBed stored AF as a string (breaking the numeric
filter) and lacked insLen (causing a "filter on field insLen not in
AS file" error under the supertrack-level filter). The new build:
- stores AF as a float
- adds a derived insLen column (alt-ref length delta for INS, 0
otherwise) so the shared lrSv insLen filter applies
- keeps every INFO field from the source (SVTYPE, SVLEN, END, AC,
AN, NS, AC_Hom, AC_Het, AC_Hemi, AF, HWE, ExcHet, nhomalt) plus
REF/ALT
- uses the canonical svName(TYPE, featLen, AC) label via lrSvCommon
Record counts match the source VCFs: 426,239 on hg38 (59 MB) and
839,714 on hs1 (87 MB). /gbdb symlinks unchanged. The trackDb
colorsDbSv stanza is updated to reference the new AS field names
(acHom/acHet/acHemi, AF, AN) and to add the insLen filter. Also
fixes a nearby `version 1.1` -> `dataVersion 1.1` typo in
lrSv1kgOnt that was failing the tagTypes check.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvCommon.py
- lines changed 138, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvCpc1.as
- lines changed 5, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvCpc1VcfToBed.py
- lines changed 95, context: html, text, full: html, text
5e4ca58df1b5bfe554fe5cc3309a39736ca256ee Tue Apr 21 08:08:52 2026 -0700
cpc1Sv: restrict to the 58 CPC samples, drop HPRC-specific SVs
Rewrite lrSvCpc1VcfToBed.py to identify the 58 CPC sample columns by
name prefix (HIFI032* or RY*), recompute AC/AN/NS from those GT
columns only, and skip any snarl that no CPC sample carries. The
HPRC portion is already represented elsewhere in lrSv, so this keeps
the track population-consistent with its label.
Rebuild results: 46,092 snarl sites on hs1 (down from 97,205 when
combined with HPRC), 36,030 lifted to hg38 (down from 81,261;
10,062 unmapped). Updates cpc1Sv.html, lrSv.ra labels, and the
makeDoc.
refs #36258
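
The sample restriction and AC/AN/NS recount described above can be sketched as follows. This is an illustration only, not the code in lrSvCpc1VcfToBed.py: the function names are hypothetical, and it assumes GT is the first colon-separated sub-field of each sample column and that "." marks a missing allele.

```python
# Hypothetical sketch of the CPC-only recount; names are illustrative.
CPC_PREFIXES = ("HIFI032", "RY")  # prefixes named in the commit message


def cpc_columns(sample_names, prefixes=CPC_PREFIXES):
    """Return the indexes of sample columns whose names start with a CPC prefix."""
    return [i for i, name in enumerate(sample_names) if name.startswith(prefixes)]


def recompute_counts(sample_fields, alt_index=1):
    """Recompute (AC, AN, NS) for one ALT allele from per-sample VCF fields.

    AC = number of called alleles equal to alt_index,
    AN = number of called alleles, NS = samples with at least one called allele.
    """
    ac = an = ns = 0
    for field in sample_fields:
        gt = field.split(":")[0]  # assumption: GT is the first sub-field
        alleles = [a for a in gt.replace("|", "/").split("/") if a != "."]
        if not alleles:
            continue  # fully missing genotype: contributes nothing
        ns += 1
        an += len(alleles)
        ac += sum(1 for a in alleles if a == str(alt_index))
    return ac, an, ns


def carried_by_cpc(sample_fields):
    """True when at least one of the given samples carries the ALT allele."""
    return recompute_counts(sample_fields)[0] > 0
```

A site for which carried_by_cpc() is false would be skipped, which corresponds to dropping snarls that no CPC sample carries.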
- lines changed 23, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvDecode.as
- lines changed 3, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvDecodeVcfToBed.py
- lines changed 23, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvGa4kSv.as
- lines changed 3, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvGa4kSvVcfToBed.py
- lines changed 24, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvGustafson.as
- lines changed 19, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 3, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvGustafsonVcfToBed.py
- lines changed 114, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 27, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvHgsvc2.as
- lines changed 40, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 3, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvHgsvc2TsvToBed.py
- lines changed 163, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 18, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvHgsvc3.as
- lines changed 3, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvHgsvc3TsvToBed.py
- lines changed 22, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvHprc2.as
- lines changed 20, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 4, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvHprc2VcfToBed.py
- lines changed 180, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
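
A rough sketch of the "explode multi-allelic rows, filter to SV-sized or INV" step mentioned for lrSvHprc2VcfToBed.py. The 50 bp cutoff, the function names, and the way an inversion is signalled (a sv_type argument) are assumptions; the commit message does not state them.

```python
# Hypothetical sketch; not the committed converter.
MIN_SV_LEN = 50  # assumption: a common SV size cutoff, not stated in the commit


def explode_multiallelic(ref, alt_field):
    """Yield one (ref, alt) pair per allele of a comma-separated VCF ALT field."""
    for alt in alt_field.split(","):
        yield ref, alt


def is_sv(ref, alt, sv_type=None, min_len=MIN_SV_LEN):
    """Keep an allele if it is typed INV or its length change reaches min_len."""
    return sv_type == "INV" or abs(len(alt) - len(ref)) >= min_len


# One row with a 1 bp insertion and a 60 bp insertion: only the second survives.
kept = [(r, a) for r, a in explode_multiallelic("A", "AT," + "A" * 61) if is_sv(r, a)]
```

Inversions are tested separately from the length rule because a balanced inversion leaves REF and ALT the same length.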
- lines changed 33, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvKwanho.as
- lines changed 3, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvKwanhoTsvToBed.py
- lines changed 42, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvTommoJp.as
- lines changed 4, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvTommoJpCnvVcfToBedGraph.py
- lines changed 95, context: html, text, full: html, text
06a482a2120d4d85c7c34fb5038213e07f595554 Tue Apr 21 15:00:21 2026 -0700
lrSv: add tommoJpCnv short-read CNV comparator (multiWig)
ToMMo 48KJPN-CNV Frequency Panel: copy-number variation frequencies
from short-read whole-genome sequencing of 48,874 Japanese individuals
(jMorp 20230828 release, GATK CNV germline workflow at 1 kb
resolution). Published as a companion short-read comparator to the
long-read tommoJpSv track.
Rendered as a multiWig container with two bigWig subtracks (transparent
overlay): tommoJpCnvLoss.bw counts samples at CN<2 per bin (red) and
tommoJpCnvGain.bw counts samples at CN>2 per bin (green). Values are
absolute carrier counts out of 48,874. 2,006,905 bins with at least one
CNV carrier; bins that are wholly CN=2 are omitted.
Files:
- trackDb/human/lrSv.ra: new tommoJpCnv multiWig container
- trackDb/human/tommoJpCnv.html: new doc page
- trackDb/human/lrSv.html: summary-table row + per-track blurb
- scripts/lrSv/lrSvTommoJpCnvVcfToBedGraph.py: VCF -> two bedGraphs
- doc/hg38/lrSv.txt: wget, converter invocation, bigWig build steps
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
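
The per-bin loss/gain counting can be sketched as below. This assumes per-sample copy numbers are available for each bin and a diploid baseline of CN=2; the committed lrSvTommoJpCnvVcfToBedGraph.py may instead read aggregated fields from the panel VCF, so the function names and the input shape are hypothetical.

```python
# Hypothetical sketch of the CN<2 / CN>2 carrier counts per bin.
def carrier_counts(copy_numbers):
    """Return (loss, gain): samples with CN<2 and samples with CN>2 in one bin."""
    loss = sum(1 for cn in copy_numbers if cn < 2)
    gain = sum(1 for cn in copy_numbers if cn > 2)
    return loss, gain


def bedgraph_rows(chrom, start, end, copy_numbers):
    """Return (loss_row, gain_row) bedGraph lines, or None if every sample is CN=2."""
    loss, gain = carrier_counts(copy_numbers)
    if loss == 0 and gain == 0:
        return None  # bins that are wholly CN=2 are omitted
    return (
        f"{chrom}\t{start}\t{end}\t{loss}",
        f"{chrom}\t{start}\t{end}\t{gain}",
    )
```

The two rows would go to separate bedGraph files, one per bigWig subtrack. Whether a bin with carriers of only one kind is also written with a 0 in the other file is an assumption made here.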
- src/hg/makeDb/scripts/lrSv/lrSvTommoJpVcfToBed.py
- lines changed 20, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/lrSv/lrSvVcfToBed.py
- lines changed 29, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/mpravardb/mpravardbToBed.py
- lines changed 53, context: html, text, full: html, text
888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
QA fixes for MPRA superTrack. refs #37359
Fix broken mpraVarDb bigDataUrl: it pointed at /gbdb/hg38/mpra/mpravardb.bb
but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing
hgTrackDb -strict to silently drop the subtrack.
Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: (1) sanitize UTF-8
in user-visible string fields (curly quotes, primes, NBSP mojibake) that
the browser does not transcode, eliminating ~246k non-ASCII occurrences
across 42% of rows; and (2) change safe_float / pval_to_score to write NaN
and return score 0 for NA / out-of-range p-values instead of 0.0 and
score 1000 (the old behavior inflated untested variants to the top of
score-sorted views).
trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous
type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant
mouseOverField, align parent mpra on, add filterValues for
cell_line/assay/cellLine and filterByRange sliders for percentile_rank /
fdr / log2FC, add labelFields and maxWindowToDraw.
Description pages: add cross-species disclosure (mouse reporter cells
used to assay human sequences), update mpraVarDb header to post-liftOver
count 239,028 with Studies-table footnote, fix mpraVarDb.html
download-server paths, soften imprecise "51 MPRA experiments" claim in
mpra.html and mprabase.html.
relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.
Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
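
A minimal sketch of the NA handling described for safe_float / pval_to_score. The function names come from the commit message, but the bodies are assumptions: the -log10 scaling, the cap of 10, and treating p <= 0 as out of range are illustrative choices, not the formula in mpravardbToBed.py.

```python
import math


def safe_float(text):
    """Parse a numeric field; return NaN (not 0.0) for NA or unparsable input."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan


def pval_to_score(p, cap=10.0):
    """Map a p-value to a 0-1000 score; NaN or out-of-range values score 0.

    Assumption: score scales linearly with -log10(p), saturating at `cap`.
    """
    if math.isnan(p) or not (0.0 < p <= 1.0):
        return 0  # untested or invalid: never rank at the top
    return min(1000, int(round(-math.log10(p) / cap * 1000)))
```

With a 0.0 fallback, a missing p-value would look like the most significant value possible, which is how untested variants could reach score 1000; returning NaN keeps "missing" distinct from "zero".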
- src/hg/makeDb/scripts/ncbiCloneEndsCH1073/cloneEnds.as
- lines changed 17, context: html, text, full: html, text
8faeb3cba60c7cb842bc17c17a57c9b53ef1b478 Tue Apr 21 02:51:32 2026 -0700
ncbiCloneEndsCH1073: add NCBI CH1073 BAC library clone end placements track on danRer11, refs #35059
210,777 unique-concordant clone-insert placements from NCBI's CH1073
(RZPD-1073 / DanioKey) library clone report. Separate from the existing
bacEndPairsLift (danRer4 -> danRer11 UCSC-BLAT lift), which is left in place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/ncbiCloneEndsCH1073/makeBed.py
- lines changed 125, context: html, text, full: html, text
8faeb3cba60c7cb842bc17c17a57c9b53ef1b478 Tue Apr 21 02:51:32 2026 -0700
ncbiCloneEndsCH1073: add NCBI CH1073 BAC library clone end placements track on danRer11, refs #35059
210,777 unique-concordant clone-insert placements from NCBI's CH1073
(RZPD-1073 / DanioKey) library clone report. Separate from the existing
bacEndPairsLift (danRer4 -> danRer11 UCSC-BLAT lift), which is left in place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/ncbiCloneEndsCH1073/refSeqNames.py
- lines changed 33, context: html, text, full: html, text
8faeb3cba60c7cb842bc17c17a57c9b53ef1b478 Tue Apr 21 02:51:32 2026 -0700
ncbiCloneEndsCH1073: add NCBI CH1073 BAC library clone end placements track on danRer11, refs #35059
210,777 unique-concordant clone-insert placements from NCBI's CH1073
(RZPD-1073 / DanioKey) library clone report. Separate from the existing
bacEndPairsLift (danRer4 -> danRer11 UCSC-BLAT lift), which is left in place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/scripts/nmd/findShortTxLongUtrIntron.py
- lines changed 98, context: html, text, full: html, text
0151d00a4a1d73a78c35f6158c6c936ff338faeb Fri Apr 24 10:37:34 2026 -0700
NMD Escape: MANE subtrack, Rule 1 bug fix, transcript filter. refs #33737
- Add nmdEscMane subtrack (MANE Select Plus Clinical 1.5), built from
/gbdb/hg38/mane/mane.bb. Reuses nmdEscTranscripts.html.
- Fix Rule 1: measure 50 bp upstream of the transcript's last splice
junction (including 3'UTR introns) rather than stripping 3'UTR from
the exon list first. The old logic painted the entire last CDS exon
as NMD-escape whenever the transcript had only one CDS exon, even
when a 3'UTR intron sat far past the stop codon (e.g. NBDY: 207 bp
of CDS over-painted for a junction 2.6 kb past the stop).
- Add --rule1-mode {cds,mrna} (default cds): cds counts only CDS bp
on the walk-back (paints up to 50 bp of CDS matching the rule label
literally); mrna counts mRNA bp and clips to CDS (tracks the 55 bp
rule literature). Documented in makeDoc.
- Rule 4: when a 3'UTR intron exists, the last CDS-containing exon
has a downstream EJC and is now eligible for the long-exon rule.
- Mouseover lists contributing transcript accessions when 1-3 items
collapse into a region; falls back to a count above that.
- Add filterText/filterType/filterLabel on all three escape subtracks
so a user can narrow the display to one transcript.
- genePredNmdEsc: --gene-sym-field (default 17 for Gencode; pass 18
for MANE, whose HGNC symbol lives in bigGenePred geneName2).
- Add findShortTxLongUtrIntron.py helper for finding MANE transcripts
with long UTR introns (used to pick NMD edge-case test cases).
Post-fix collapsed-region counts (--rule1-mode=cds):
MANE 1.5: 67,752
Gencode V49: 233,375
RefSeq Curated: 112,356
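
One possible reading of the two --rule1-mode settings, sketched in mRNA coordinates (0-based, half-open, exons given as lengths in transcript order). This is an interpretation of the commit message, not the logic in genePredNmdEsc; the function name and the coordinate model are assumptions.

```python
# Hypothetical sketch of Rule 1 in mRNA coordinates.
def rule1_region(exon_lengths, cds_start, cds_end, mode="cds", window=50):
    """Return the (start, end) CDS span painted by the 50 bp rule, or None.

    The last splice junction is taken over all exons, including any in the
    3'UTR. In 'cds' mode only CDS bases count on the walk back from the
    junction; in 'mrna' mode all mRNA bases count and the result is clipped
    to the CDS.
    """
    if len(exon_lengths) < 2:
        return None  # no splice junction: Rule 1 does not apply
    junction = sum(exon_lengths[:-1])  # mRNA offset of the last junction
    if mode == "cds":
        anchor = min(junction, cds_end)  # skip UTR bases past the stop
    elif mode == "mrna":
        anchor = junction
    else:
        raise ValueError(f"unknown mode: {mode}")
    start = max(cds_start, anchor - window)
    return (start, cds_end) if start < cds_end else None
```

For a made-up transcript with exon lengths [3000, 400] and a 207 bp CDS at 100-307, the junction lies far past the stop codon: 'cds' mode paints the last 50 bp of CDS (257-307), while 'mrna' mode paints nothing. When the junction falls inside the CDS, both modes give the same span.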
- src/hg/makeDb/scripts/nmd/genePredNmdEsc
- lines changed 16, context: html, text, full: html, text
3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737
Script: added a fourth rule to genePredNmdEsc. Coding exons longer than
400 bp (excluding the last coding exon, which is already covered by the
50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and
NCBI RefSeq bigBed files.
trackDb:
- nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode
default visibility to dense so the track is visible in cart-reset
views, changed all four NMDetective subtracks from "visibility full"
to "visibility hide", updated pennantIcon to the Apr. 22, 2026
release date and anchor.
- nmd.html: mention long internal exons in the overview description,
update the rule count from three to four.
- nmdEscTranscripts.html: add the long-exon rule to the rule list and
color legend (gold, #FFD700), expand the Background section with
mechanisms for the intronless, start-proximal, and long-exon rules,
correct the 50 bp rule description to include the entire last coding
exon, fix Lindeboom 2016 author initials (RG -> RGH).
News:
- newsarch.html: add the 2026-04-22 NMD Escape news entry covering all
four rules, with acknowledgements to Guido Neidhardt and Andreas
Lahner for suggesting the track and the Decipher Genome Browser team
for inspiring the visualization.
- indexNews.html: add the front-page news link.
makedoc:
- nmd.txt: dated note for the Rule 4 rebuild.
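The long-exon rule described in this commit can be illustrated roughly as follows. This is a minimal sketch, not the code in genePredNmdEsc: the function name long_exon_regions, the min_len parameter, and the 0-based half-open, genomically sorted input are all assumptions made for illustration.

```python
def long_exon_regions(cds_exons, strand, min_len=400):
    """Return coding exons longer than min_len bp, skipping the last coding exon.

    cds_exons: list of (start, end) coding-exon pieces, 0-based half-open,
    sorted by genomic position (an assumption of this sketch).
    """
    # Put exons in transcript (5'->3') order so "last" means the 3'-most one.
    ordered = list(cds_exons) if strand == "+" else list(reversed(cds_exons))
    # The last coding exon is already covered by the 50 bp rule, so skip it.
    candidates = ordered[:-1]
    # "Longer than 400 bp" is read here as a strict inequality.
    return [(start, end) for start, end in candidates if end - start > min_len]
```

On the minus strand the 3'-most coding exon is the one with the lowest genomic coordinates, which is why the list is reversed before the last element is dropped.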
- lines changed 2, context: html, text, full: html, text
4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737
Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".
Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated
and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt
both tracks.
Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.
Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.
QA cleanups: non-ASCII prime char replaced with its HTML entity (′), mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.
Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 13, context: html, text, full: html, text
34d2eee845f5f45e571d1e153c632683b8a93f75 Tue Apr 21 16:17:53 2026 -0700
Refine NMD Escape Rule 2 gate to "single coding exon and no 3'UTR intron". refs #33737
Previously Rule 2 required exonCount==1 (truly intronless). This
overcorrected for single-CDS-exon transcripts whose only introns are in
the 5'UTR: biologically these have no EJC downstream of the stop codon
(5'UTR EJCs are cleared by the scanning 40S or sit upstream of the
terminating ribosome) and are NMD-immune, but the code pushed them to
Rules 1/3 under a less accurate "last coding exon" label.
New gate: len(cdsExons) == 1 AND no exon-exon junction strictly
downstream of the stop codon (strand-aware). Transcripts with a single
coding exon but a 3'UTR intron correctly stay in Rules 1/3 because that
intron deposits an EJC that can trigger NMD.
3,113 RefSeq Curated and 10,790 Gencode V49 transcripts move into Rule
2. 140 RefSeq and 1,135 Gencode single-CDS-exon transcripts with 3'UTR
introns correctly remain in Rules 1/3. Description page and makedoc
updated.
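The refined Rule 2 gate above (single coding exon and no junction downstream of the stop codon) might look something like the sketch below. It is a hypothetical reconstruction from the commit message only: the function name is_rule2, its argument layout, and the treatment of a junction sitting exactly at the CDS boundary as "downstream" are assumptions, and coordinates are taken to be 0-based half-open in genomic order.

```python
def is_rule2(cds_exons, exons, cds_start, cds_end, strand):
    """True when the transcript has one coding exon and no exon-exon
    junction downstream of the stop codon (strand-aware)."""
    if len(cds_exons) != 1:
        return False
    # Walk consecutive exon pairs; each pair brackets one intron.
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        if strand == "+":
            # Plus strand: the stop codon ends at cds_end, so an intron that
            # begins at or after cds_end lies in the 3'UTR.
            if left_end >= cds_end:
                return False
        else:
            # Minus strand: the stop codon sits at cds_start, so an intron
            # that ends at or before cds_start lies in the 3'UTR.
            if right_start <= cds_start:
                return False
    return True
```

With a single coding exon, every intron falls entirely on one side of the CDS, so checking one edge of each intron is enough to tell 5'UTR introns from 3'UTR introns.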
- lines changed 4, context: html, text, full: html, text
fe73446acf43f70e385dadbbb281634adf3cac9e Tue Apr 21 16:44:16 2026 -0700
NMD Escape QA tweaks: hide Gencode subtrack by default, bold rule numbers in mouseovers. refs #33737
- nmdEscGencode default visibility changed from on/dense to off/hide so
only the RefSeq Curated subtrack is on by default. Per Lou's request.
- RULE_DESCRIPTIONS mouseover strings wrap the rule number in <b>...</b>
so the rule shows bold in the tooltip. Both bigBeds rebuilt.
- lines changed 139, context: html, text, full: html, text
0151d00a4a1d73a78c35f6158c6c936ff338faeb Fri Apr 24 10:37:34 2026 -0700
NMD Escape: MANE subtrack, Rule 1 bug fix, transcript filter. refs #33737
- Add nmdEscMane subtrack (MANE Select Plus Clinical 1.5), built from
/gbdb/hg38/mane/mane.bb. Reuses nmdEscTranscripts.html.
- Fix Rule 1: measure 50 bp upstream of the transcript's last splice
junction (including 3'UTR introns) rather than stripping 3'UTR from
the exon list first. The old logic painted the entire last CDS exon
as NMD-escape whenever the transcript had only one CDS exon, even
when a 3'UTR intron sat far past the stop codon (e.g. NBDY: 207 bp
of CDS over-painted for a junction 2.6 kb past the stop).
- Add --rule1-mode {cds,mrna} (default cds): cds counts only CDS bp
on the walk-back (paints up to 50 bp of CDS matching the rule label
literally); mrna counts mRNA bp and clips to CDS (tracks the 55 bp
rule literature). Documented in makeDoc.
- Rule 4: when a 3'UTR intron exists, the last CDS-containing exon
has a downstream EJC and is now eligible for the long-exon rule.
- Mouseover lists contributing transcript accessions when 1-3 items
collapse into a region; falls back to a count above that.
- Add filterText/filterType/filterLabel on all three escape subtracks
so a user can narrow the display to one transcript.
- genePredNmdEsc: --gene-sym-field (default 17 for Gencode; pass 18
for MANE, whose HGNC symbol lives in bigGenePred geneName2).
- Add findShortTxLongUtrIntron.py helper for finding MANE transcripts
with long UTR introns (used to pick NMD edge-case test cases).
Post-fix collapsed-region counts (--rule1-mode=cds):
MANE 1.5: 67,752
Gencode V49: 233,375
RefSeq Curated: 112,356
- lines changed 29, context: html, text, full: html, text
3a62ea7e9a8cb3503586a0a78570331308c9bc58 Mon Apr 27 02:23:00 2026 -0700
NMD Escape MANE: expose NM_ accession via labelFields. refs #33737
Per QA, the MANE subtrack now shows the NCBI RefSeq accession by default
instead of the HGNC gene symbol, with the ENST and gene symbol still
selectable via labelFields.
- genePredNmdEsc: new --ncbi-id-field N option (default -1 = unused).
When set, the named bigGenePred column is captured per-transcript and
written into a new ncbiIds output column. For MANE pass 21.
- genePredNmdEsc: new --no-collapse option. By default, regions with
identical (chrom, start, end, rule) from multiple transcripts collapse
into one row with comma-separated lists. With --no-collapse the script
emits one row per (transcript, region). Used for MANE so each
label-field column holds a single value: the 74 MANE Plus Clinical
genes (e.g. LMNA) get two rows per region instead of one row with a
two-element list.
- nmdEscCollapsed.as: add lstring ncbiIds column. Schema is now bed9+3.
- nmd.ra (nmdEscMane only): labelFields ncbiIds,name,transcripts;
defaultLabelFields ncbiIds; labelSeparator " / ". Gencode and RefSeq
subtracks unchanged - they default to the gene symbol (name column)
and have an empty ncbiIds column.
- doc/hg38/nmd.txt: bump all three bedToBigBed invocations to bed9+3
and document the --ncbi-id-field 21 + --no-collapse invocation for
MANE.
Counts: MANE 68,028 (--no-collapse); Gencode 233,375; RefSeq 112,356.
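The difference between the default collapsing and --no-collapse can be sketched as below. This is an illustration under stated assumptions, not the script's implementation: emit_rows, the tuple layout, and the placeholder transcript names txA and txB are hypothetical.

```python
def emit_rows(regions, collapse=True):
    """regions: iterable of (chrom, start, end, rule, transcript) tuples."""
    regions = list(regions)
    if not collapse:
        # One output row per (transcript, region), as with --no-collapse.
        return regions
    # Default: group transcripts that share (chrom, start, end, rule).
    grouped = {}
    for chrom, start, end, rule, transcript in regions:
        grouped.setdefault((chrom, start, end, rule), []).append(transcript)
    # Join each group's transcripts into one comma-separated list.
    return [key + (",".join(names),) for key, names in grouped.items()]
```

For a gene with two transcripts covering the same region, the default yields one row with a two-element list, while collapse=False yields two rows that each hold a single value, which is the property the label fields rely on.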
- src/hg/makeDb/scripts/nmd/nmdEscCollapsed.as
- lines changed 2, context: html, text, full: html, text
4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737
Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".
Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated
and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt
both tracks.
Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.
Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.
QA cleanups: non-ASCII prime char replaced with its HTML entity (′), mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.
Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 1, context: html, text, full: html, text
3a62ea7e9a8cb3503586a0a78570331308c9bc58 Mon Apr 27 02:23:00 2026 -0700
NMD Escape MANE: expose NM_ accession via labelFields. refs #33737
Per QA, the MANE subtrack now shows the NCBI RefSeq accession by default
instead of the HGNC gene symbol, with the ENST and gene symbol still
selectable via labelFields.
- genePredNmdEsc: new --ncbi-id-field N option (default -1 = unused).
When set, the named bigGenePred column is captured per-transcript and
written into a new ncbiIds output column. For MANE pass 21.
- genePredNmdEsc: new --no-collapse option. By default, regions with
identical (chrom, start, end, rule) from multiple transcripts collapse
into one row with comma-separated lists. With --no-collapse the script
emits one row per (transcript, region). Used for MANE so each
label-field column holds a single value: the 74 MANE Plus Clinical
genes (e.g. LMNA) get two rows per region instead of one row with a
two-element list.
- nmdEscCollapsed.as: add lstring ncbiIds column. Schema is now bed9+3.
- nmd.ra (nmdEscMane only): labelFields ncbiIds,name,transcripts;
defaultLabelFields ncbiIds; labelSeparator " / ". Gencode and RefSeq
subtracks unchanged - they default to the gene symbol (name column)
and have an empty ncbiIds column.
- doc/hg38/nmd.txt: bump all three bedToBigBed invocations to bed9+3
and document the --ncbi-id-field 21 + --no-collapse invocation for
MANE.
Counts: MANE 68,028 (--no-collapse); Gencode 233,375; RefSeq 112,356.
- src/hg/makeDb/scripts/primateai/primateAi.as
- lines changed 2, context: html, text, full: html, text
de2ccf6d827865f11d3c8edd9ceeb1b6394a7380 Tue Apr 21 18:22:59 2026 -0700
PrimateAI-3D: label items by nucleotide change, add aaChange field and HTML mouseover.
Variant analysts typically work at the nucleotide level, and the current
item label (amino acid change) collapses distinguishable variants: ~17%
of items share their (chrom, pos, AA-change) tuple with another item
because of codon degeneracy (e.g. the three changes C>A, C>G and C>T at
the same position can all appear as "M>I"). Labeling by nucleotide change makes
every item uniquely distinguishable (0.0% collisions on hg38, 0.1% on
hg19 from overlapping transcripts).
- primateAi.as: field 4 (name) is now "Nucleotide change (e.g. T>C)";
new field aaChange (placed before ref/alt) holds the amino acid
change.
- primateAiToBigBed.py: write name = "{ref}>{alt}", new aaChange column,
and an HTML mouseover with terse labels (Var/AA/Score/Perc/Pred) and
a colored prediction string.
- primateAi.ra: add labelFields name,aaChange and defaultLabelFields
name so users can toggle the on-feature label between nt change
(default) and AA change.
- primateAi.html: expand Display Conventions with the label-convention
rationale and a legend for each mouseover field.
refs #37274
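The collision argument above can be checked with a small sketch like the following. The record layout (dicts with chrom, pos, ref, alt, aa keys) and the helper collision_fraction are assumptions made for illustration and are not taken from primateAiToBigBed.py; only the "{ref}>{alt}" label format comes from the commit message.

```python
from collections import Counter

def collision_fraction(items, key):
    """Fraction of items whose key is shared with at least one other item."""
    if not items:
        return 0.0
    counts = Counter(key(item) for item in items)
    shared = sum(1 for item in items if counts[key(item)] > 1)
    return shared / len(items)

def aa_key(item):
    # Old labeling: position plus amino acid change.
    return (item["chrom"], item["pos"], item["aa"])

def nt_key(item):
    # New labeling: position plus "{ref}>{alt}" nucleotide change.
    return (item["chrom"], item["pos"], "{ref}>{alt}".format(**item))
```

Applied to the three variants from the example (C>A, C>G, C>T, all "M>I"), aa_key gives a collision fraction of 1.0 and nt_key gives 0.0.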
- src/hg/makeDb/scripts/primateai/primateAiToBigBed.py
- lines changed 13, context: html, text, full: html, text
50466766840ded6cb8bd5cb868bdf2ff3f613bc0 Tue Apr 21 11:17:15 2026 -0700
QA fixes for PrimateAI-3D track.
Config (primateAi.ra):
- Fix broken Ensembl transcript linkout: urls $S expanded to chromosome
name; switch to the Ensembl transcript page with $$
- Add numeric filters on percentile and raw score (label notes the
paper's 0.821 clinical threshold)
- Add maxWindowToDraw 2000000
Data (primateAiToBigBed.py):
- Change hardcoded strand '+' to '.': the source file has no strand
column
- Accept input/output paths as CLI args (previously hardcoded the hg38
input path)
- Handle variable field count: ~2.4M rows in the hg19 source are
missing the refseq column
Description (primateAi.html):
- Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way
track
- Regenerate the first reference via getTrackReferences (wrong article
number and wrong PMC ID in the previous text)
- Fix the GitHub URL for the conversion script in Methods
- Move the Zoonomia 447-way mention out of Description; rephrase the
license note to describe precisely what is disabled
relatedTracks.ra:
- Add reciprocal cross-links for primateAi <-> alphaMissense (hg38),
primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi
(hg38). Also includes promoterAi <-> alphaMissense cross-links.
refs #37274 #37279
- lines changed 10, context: html, text, full: html, text
de2ccf6d827865f11d3c8edd9ceeb1b6394a7380 Tue Apr 21 18:22:59 2026 -0700
PrimateAI-3D: label items by nucleotide change, add aaChange field and HTML mouseover.
Variant analysts typically work at the nucleotide level, and the current
item label (amino acid change) collapses distinguishable variants: ~17%
of items share their (chrom, pos, AA-change) tuple with another item
because of codon degeneracy (e.g. the three changes C>A, C>G and C>T at
the same position can all appear as "M>I"). Labeling by nucleotide change makes
every item uniquely distinguishable (0.0% collisions on hg38, 0.1% on
hg19 from overlapping transcripts).
- primateAi.as: field 4 (name) is now "Nucleotide change (e.g. T>C)";
new field aaChange (placed before ref/alt) holds the amino acid
change.
- primateAiToBigBed.py: write name = "{ref}>{alt}", new aaChange column,
and an HTML mouseover with terse labels (Var/AA/Score/Perc/Pred) and
a colored prediction string.
- primateAi.ra: add labelFields name,aaChange and defaultLabelFields
name so users can toggle the on-feature label between nt change
(default) and AA change.
- primateAi.html: expand Display Conventions with the label-convention
rationale and a legend for each mouseover field.
refs #37274
- src/hg/makeDb/scripts/promoterAiOverlaps.as
- lines changed 6, context: html, text, full: html, text
f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
QA fixes for PromoterAI track. refs #37278
Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D
paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID
40440429), corrected the score-direction wording (negative = under-expression,
positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access
source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover
blurb to match mouseOverFunction noAverage behavior.
Converter and AS: the overlap bigBed now carries the real per-transcript strand
from the source TSV (was hardcoded '+'), with a new strands column in the AS, and
the name field concatenates unique gene symbols so bidirectional-promoter items
read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is
meaningful. Rewrote the converter to stream (sorted input), which drops peak
memory from ~40 GB to a few MB.
trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable
without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the
bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw
red (over-expression) above zero and blue (under-expression) below, matching
the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap
subtrack.
Makedoc updated to describe the streaming pipeline, the new strands column,
and the rebuild workflow.
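The BED score mapping mentioned above (|PromoterAI|*1000) could be computed along these lines. This is a sketch only: the function name bed_score is hypothetical, and the rounding to an integer and the clamp to the 0-1000 BED score range are assumptions that the commit message does not spell out.

```python
def bed_score(promoter_ai_score):
    """Map a signed PromoterAI score to an integer BED score in 0..1000."""
    # Magnitude only: the sign (under- vs over-expression) is carried by
    # the item color, not by the BED score.
    scaled = int(round(abs(promoter_ai_score) * 1000))
    # Clamp defensively so the value stays inside the BED score range.
    return max(0, min(1000, scaled))
```

Because the sign is dropped, a scoreFilter threshold selects strong effects in either direction.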
- src/hg/makeDb/scripts/promoterAiToBigWig.py
- lines changed 111, context: html, text, full: html, text
f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
QA fixes for PromoterAI track. refs #37278
Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D
paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID
40440429), corrected the score-direction wording (negative = under-expression,
positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access
source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover
blurb to match mouseOverFunction noAverage behavior.
Converter and AS: the overlap bigBed now carries the real per-transcript strand
from the source TSV (was hardcoded '+'), with a new strands column in the AS, and
the name field concatenates unique gene symbols so bidirectional-promoter items
read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is
meaningful. Rewrote the converter to stream (sorted input), which drops peak
memory from ~40 GB to a few MB.
trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable
without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the
bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw
red (over-expression) above zero and blue (under-expression) below, matching
the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap
subtrack.
Makedoc updated to describe the streaming pipeline, the new strands column,
and the rebuild workflow.
- src/hg/makeDb/scripts/srSv/abelSv.as
- lines changed 0, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- lines changed 4, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/srSv/abelSvBuild.sh
- lines changed 7, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- src/hg/makeDb/scripts/srSv/abelSvVcfToBed.py
- lines changed 0, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- lines changed 34, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/srSv/onekg3202Sr.as
- lines changed 0, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- lines changed 3, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/srSv/onekg3202SrVcfToBed.py
- lines changed 0, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- lines changed 23, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/scripts/srSv/tommoJpCnvVcfToBedGraph.py
- lines changed 0, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- src/hg/makeDb/scripts/varFreqs/databases.tsv
- lines changed 2, context: html, text, full: html, text
366afa4a74c46ec6fb2b667a2902a873feec40cf Mon Apr 20 23:00:05 2026 -0700
varFreqsAll: rebuild combined bigBed to include GA4K and CoLoRSdb
Regenerate the All Databases Combined track with the two long-read
PacBio subtracks (GA4K 552 samples and CoLoRSdb v1.2.0 1,027 samples)
that were added to varFreqs since the March build. Source count rises
from 21 to 23 databases; final bigBed is 37.7 GB with 1.17B records
and 113 fields. Updates varFreqs.ra filterValues.sources and per-
database AF/AC filters for the two new sources, and databases.tsv
+ varFreqs.txt (build notes).
refs #36642
- src/hg/makeDb/scripts/varFreqs/svatalogFreqToVcf.py
- lines changed 137, context: html, text, full: html, text
695f40f9d6139a4df393522c067f1702aff8d3bd Wed Apr 22 03:13:39 2026 -0700
varFreqs: add SVatalog 101 short-read SNV frequencies subtrack
SNV/indel allele frequencies from the 101-sample GWAS SVatalog cohort
(Chirmade et al. 2026, Heredity, PMID 41203876), called from 10X
Genomics linked short-read WGS with GATK HaplotypeCaller v4.0.0.0 and
phased with SHAPEIT v4.2.0. Sibling of the lrSv chirmade101Sv
structural-variant track, which is built from the same 101 samples.
8,814,835 autosomal + chrX sites. Source release ships only AF; AC and
AN are synthesized in the emitted VCF as AC=round(AF*202) and AN=202
(2*101 diploid), with the gnomAD v3.1 non-Finnish European AF and dbSNP
rsID passed through as GNOMAD_NFE_AF and RSID info fields. VCF is
bgzipped + tabix-indexed (172 MB + 1.6 MB .tbi).
Files:
- scripts/varFreqs/svatalogFreqToVcf.py (new): per-chrom allele-freq
TSV -> single VCF with hg38 ##contig header
- trackDb/human/varFreqs.ra: new svatalogSnv vcfTabix subtrack
- trackDb/human/svatalogSnv.html (new): doc page
- trackDb/human/varFreqs.html: new row in Available Datasets table
- doc/hg38/varFreqs.txt: wget-free build block (input files were
downloaded manually from Zenodo 13367574)
Note: the All Databases Combined varFreqs bigBed has NOT been rebuilt
to include this new source yet; a subsequent merge pass will add it.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
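The AC/AN synthesis described in this commit can be illustrated with a minimal sketch. The function name synthesize_ac_an and its parameters are hypothetical and are not taken from svatalogFreqToVcf.py; only the formulas AC=round(AF*202) and AN=202 come from the commit message. Python's built-in round() rounds halves to the nearest even integer, and whether the real script rounds the same way is an assumption.

```python
def synthesize_ac_an(af, n_samples=101, ploidy=2):
    """Reconstruct (AC, AN) from an allele frequency when only AF is shipped.

    Hypothetical helper: AN is fixed at n_samples * ploidy (202 for the
    101-sample cohort) and AC is AF * AN rounded to the nearest integer.
    """
    an = n_samples * ploidy
    ac = round(af * an)  # Python rounds exact halves to the even neighbor
    return ac, an


if __name__ == "__main__":
    for af in (0.0, 0.25, 0.5, 1.0):
        ac, an = synthesize_ac_an(af)
        print(f"AF={af} -> AC={ac};AN={an}")
```

Because AN is a constant, the synthesized AC carries no more information than AF; it only makes the record usable by tools that expect AC and AN info fields.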
- src/hg/makeDb/trackDb/human/abelSv.html
- lines changed 12, context: html, text, full: html, text
65091fe6f6487c23d650a144e947fc1c582d3f40 Tue Apr 21 02:16:16 2026 -0700
abelSv: move under lrSv supertrack as short-read comparison subtrack
Move the Abel et al. 2020 CCDG 17,795-genome SV callset from a
top-level hg38 track to a subtrack of the lrSv supertrack (parallel
to onekg3202Sr) and relabel shortLabel/longLabel to flag Illumina
short-read provenance. The same bigBed is now visible on hg38 in
the long-read SV browsing context. Also:
- Clarify abelSv.html variant counts: 738,624 upstream unique SVs
across both callsets, 737,998 after B37->hg38 liftOver (626
unmapped). B38=458,106, B37lift=279,892.
- lrSv.html: fix triple-slash https:/// in the Ebert et al. Science
reference URL.
- bigBed.html: add closing </li> on the extra-fields pipe-separator
bullet and tighten a comma in the same sentence.
refs #36258, refs #37376
- lines changed 28, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
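As a rough illustration of the "CODE|CODE (Long name)" syntax mentioned above, the sketch below assembles such a value list. The helper name filter_values_line is hypothetical, and the assumption that the setting is spelled filterValues.svType with comma-separated entries is inferred from the commit text rather than copied from lrSv.ra. Only the two labels quoted in the commit are included.

```python
# Labels quoted in the commit message; any further codes would be guesses.
SV_TYPE_NAMES = {"DEL": "Deletion", "INS": "Insertion"}


def filter_values_line(field, names):
    """Build a trackDb-style filterValues line with 'CODE|CODE (Long name)' items."""
    items = ",".join(f"{code}|{code} ({name})" for code, name in names.items())
    return f"filterValues.{field} {items}"


if __name__ == "__main__":
    print(filter_values_line("svType", SV_TYPE_NAMES))
```

The part before the pipe stays the raw code stored in the data, so only the label shown in the filter menu changes.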
- src/hg/makeDb/trackDb/human/aou1kSv.html
- lines changed 30, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/aprSv.html
- lines changed 36, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/chirmade101Sv.html
- lines changed 32, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/colorsDb.html
- lines changed 126, context: html, text, full: html, text
f4d6633d6a724d7b682f9f49ed983e22a5e0975d Mon Apr 20 14:41:07 2026 -0700
Updating a few lrSv subtracks and moving the colorsDbSnv track under
the varFreqs track. refs #36642
- src/hg/makeDb/trackDb/human/colorsDbSv.html
- lines changed 104, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 4, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/cpc1Sv.html
- lines changed 23, context: html, text, full: html, text
5e4ca58df1b5bfe554fe5cc3309a39736ca256ee Tue Apr 21 08:08:52 2026 -0700
cpc1Sv: restrict to the 58 CPC samples, drop HPRC-specific SVs
Rewrite lrSvCpc1VcfToBed.py to identify the 58 CPC sample columns by
name prefix (HIFI032* or RY*), recompute AC/AN/NS from those GT
columns only, and skip any snarl that no CPC sample carries. The
HPRC portion is already represented elsewhere in lrSv, so this keeps
the track population-consistent with its label.
Rebuild results: 46,092 snarl sites on hs1 (down from 97,205 when
combined with HPRC), 36,030 lifted to hg38 (down from 81,261;
10,062 unmapped). Updates cpc1Sv.html, lrSv.ra labels, and the
makeDoc.
refs #36258
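The sample restriction described above could look roughly like the following sketch. It is not the code of lrSvCpc1VcfToBed.py: the function names are hypothetical, and the GT parsing assumes plain VCF genotype strings such as 0|1 or ./. with the alternate allele of interest at index 1. Only the prefixes HIFI032 and RY and the idea of recomputing AC/AN/NS from the selected columns come from the commit message.

```python
CPC_PREFIXES = ("HIFI032", "RY")  # prefixes named in the commit message


def cpc_columns(sample_names):
    """Return the indexes of sample columns whose name starts with a CPC prefix."""
    return [i for i, name in enumerate(sample_names) if name.startswith(CPC_PREFIXES)]


def recompute_counts(genotypes, alt_index=1):
    """Recompute (AC, AN, NS) from GT strings of the selected samples only."""
    ac = an = ns = 0
    for gt in genotypes:
        # Treat phased and unphased separators alike and drop missing alleles.
        alleles = [a for a in gt.replace("|", "/").split("/") if a != "."]
        if alleles:
            ns += 1
        an += len(alleles)
        ac += sum(1 for a in alleles if a == str(alt_index))
    return ac, an, ns


if __name__ == "__main__":
    samples = ["HG002", "HIFI032A", "RY01", "NA12878"]
    row_gts = ["1|1", "0|1", "1|1", "./."]
    keep = cpc_columns(samples)
    ac, an, ns = recompute_counts([row_gts[i] for i in keep])
    # A site would be skipped when no selected sample carries the allele.
    print(keep, ac, an, ns, "skip" if ac == 0 else "keep")
```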
- lines changed 36, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/decodeSv.html
- lines changed 26, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/ga4kSv.html
- lines changed 29, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/gustafsonSv.html
- lines changed 96, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 30, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/han945Sv.html
- lines changed 10, context: html, text, full: html, text
f4d6633d6a724d7b682f9f49ed983e22a5e0975d Mon Apr 20 14:41:07 2026 -0700
Updating a few lrSv subtracks and moving the colorsDbSnv track under
the varFreqs track. refs #36642
- lines changed 1, context: html, text, full: html, text
b132fa0f0ba226c056f5408a2c02c3d945ccc9c1 Tue Apr 21 08:17:52 2026 -0700
han945Sv: fix OMIX download link
Replace the broken ngdc.cncb.ac.cn/omix/release/OMIX005649 URL in the
Data Access section with the correct biosino.org analysis page for
this dataset.
refs #36258
- lines changed 26, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/hg19/trackDb.gencode.ra
- lines changed 1, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
Import of GENCODE V50, MV39, and V50lift37; added a command that runs the whole import in a single step
- src/hg/makeDb/trackDb/human/hg19/wgEncodeGencodeV50lift37.html
- lines changed 32, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
Import of GENCODE V50, MV39, and V50lift37; added a command that runs the whole import in a single step
- src/hg/makeDb/trackDb/human/hg19/wgEncodeGencodeV50lift37.ra
- lines changed 210, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
Import of GENCODE V50, MV39, and V50lift37; added a command that runs the whole import in a single step
- src/hg/makeDb/trackDb/human/hg38/abelSv.ra
- lines changed 37, context: html, text, full: html, text
65091fe6f6487c23d650a144e947fc1c582d3f40 Tue Apr 21 02:16:16 2026 -0700
abelSv: move under lrSv supertrack as short-read comparison subtrack
Move the Abel et al. 2020 CCDG 17,795-genome SV callset from a
top-level hg38 track to a subtrack of the lrSv supertrack (parallel
to onekg3202Sr) and relabel shortLabel/longLabel to flag Illumina
short-read provenance. The same bigBed is now visible on hg38 in
the long-read SV browsing context. Also:
- Clarify abelSv.html variant counts: 738,624 upstream unique SVs
across both callsets, 737,998 after B37->hg38 liftOver (626
unmapped). B38=458,106, B37lift=279,892.
- lrSv.html: fix triple-slash https:/// in the Ebert et al. Science
reference URL.
- bigBed.html: add closing </li> on the extra-fields pipe-separator
bullet and tighten a comma in the same sentence.
refs #36258, refs #37376
- src/hg/makeDb/trackDb/human/hg38/gnomad.ra
- lines changed 2, context: html, text, full: html, text
90668959b76ef7d792bac3cbc4eb509a58c6c317 Thu Apr 23 11:25:38 2026 -0700
Adding mergeSpannedItems to gnomad CNV and structural variants tracks on hg38, refs Max email
- src/hg/makeDb/trackDb/human/hg38/gnomadMpc.html
- lines changed 175, context: html, text, full: html, text
f30798ae5d11e88e0ab7eb2bcab634e253fd0675 Thu Apr 23 10:36:40 2026 -0700
Add gnomAD MPC v4.1.1 track to hg38.
New composite track under the gnomAD container showing per-variant
MPC (Missense deleteriousness Prediction by Constraint) scores from
gnomAD v4.1.1. Four bigWigs provide per-base scores (one per ALT
nucleotide); a companion bigBed carries the ~250K multi-transcript
variants with a per-transcript breakdown. Included via 'alpha' for
QA review. refs #37434
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/human/hg38/gnomadMpc.ra
- lines changed 79, context: html, text, full: html, text
f30798ae5d11e88e0ab7eb2bcab634e253fd0675 Thu Apr 23 10:36:40 2026 -0700
Add gnomAD MPC v4.1.1 track to hg38.
New composite track under the gnomAD container showing per-variant
MPC (Missense deleteriousness Prediction by Constraint) scores from
gnomAD v4.1.1. Four bigWigs provide per-base scores (one per ALT
nucleotide); a companion bigBed carries the ~250K multi-transcript
variants with a per-transcript breakdown. Included via 'alpha' for
QA review. refs #37434
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/human/hg38/mb2.ra
- lines changed 1, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- src/hg/makeDb/trackDb/human/hg38/mpra.html
- lines changed 13, context: html, text, full: html, text
888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
QA fixes for MPRA superTrack. refs #37359
Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb
but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing
hgTrackDb -strict to silently drop the subtrack.
Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8
in user-visible string fields (curly quotes, primes, NBSP mojibake) that
the browser does not transcode, eliminating ~246k non-ASCII occurrences
across 42% of rows; and change safe_float / pval_to_score to write NaN
and return score 0 for NA / out-of-range p-values instead of 0.0 and
score 1000 (previously inflated untested variants to the top of
score-sorted views).
trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous
type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant
mouseOverField, align parent mpra on, add filterValues for
cell_line/assay/cellLine and filterByRange sliders for percentile_rank /
fdr / log2FC, add labelFields and maxWindowToDraw.
Description pages: add cross-species disclosure (mouse reporter cells
used to assay human sequences), update mpraVarDb header to post-liftOver
count 239,028 with Studies-table footnote, fix mpraVarDb.html
download-server paths, soften imprecise "51 MPRA experiments" claim in
mpra.html and mprabase.html.
relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.
Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
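The NaN / score-0 behavior described above might be sketched as follows. The names safe_float and pval_to_score appear in the commit message, but the bodies here are illustrative and are not the code in mpravardbToBed.py. In particular, the -log10 scaling, the scale parameter, and treating p = 0 as out of range are assumptions made for the example.

```python
import math


def safe_float(text):
    """Parse a float, returning NaN (rather than 0.0) for missing or bad values."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan


def pval_to_score(p, cap=1000, scale=100):
    """Map a p-value to a 0..cap BED score; untested or invalid values score 0.

    The -log10(p) * scale mapping is an assumed, illustrative formula.
    """
    if math.isnan(p) or not (0.0 < p <= 1.0):
        return 0
    return min(cap, round(-math.log10(p) * scale))


if __name__ == "__main__":
    for raw in ("NA", "0.001", "1e-20", "1.5"):
        p = safe_float(raw)
        print(raw, p, pval_to_score(p))
```

With a missing p-value parsed as 0.0, a log-based score would saturate at the cap, which is one plausible way untested variants ended up at the top of score-sorted views; returning NaN and score 0 keeps them at the bottom instead.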
- src/hg/makeDb/trackDb/human/hg38/mpra.ra
- lines changed 27, context: html, text, full: html, text
888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
QA fixes for MPRA superTrack. refs #37359
Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb
but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing
hgTrackDb -strict to silently drop the subtrack.
Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8
in user-visible string fields (curly quotes, primes, NBSP mojibake) that
the browser does not transcode, eliminating ~246k non-ASCII occurrences
across 42% of rows; and change safe_float / pval_to_score to write NaN
and return score 0 for NA / out-of-range p-values instead of 0.0 and
score 1000 (previously inflated untested variants to the top of
score-sorted views).
trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous
type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant
mouseOverField, align parent mpra on, add filterValues for
cell_line/assay/cellLine and filterByRange sliders for percentile_rank /
fdr / log2FC, add labelFields and maxWindowToDraw.
Description pages: add cross-species disclosure (mouse reporter cells
used to assay human sequences), update mpraVarDb header to post-liftOver
count 239,028 with Studies-table footnote, fix mpraVarDb.html
download-server paths, soften imprecise "51 MPRA experiments" claim in
mpra.html and mprabase.html.
relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.
Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
- src/hg/makeDb/trackDb/human/hg38/mpraVarDb.html
- lines changed 14, context: html, text, full: html, text
888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
QA fixes for MPRA superTrack. refs #37359
Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb
but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing
hgTrackDb -strict to silently drop the subtrack.
Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8
in user-visible string fields (curly quotes, primes, NBSP mojibake) that
the browser does not transcode, eliminating ~246k non-ASCII occurrences
across 42% of rows; and change safe_float / pval_to_score to write NaN
and return score 0 for NA / out-of-range p-values instead of 0.0 and
score 1000 (previously inflated untested variants to the top of
score-sorted views).
trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous
type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant
mouseOverField, align parent mpra on, add filterValues for
cell_line/assay/cellLine and filterByRange sliders for percentile_rank /
fdr / log2FC, add labelFields and maxWindowToDraw.
Description pages: add cross-species disclosure (mouse reporter cells
used to assay human sequences), update mpraVarDb header to post-liftOver
count 239,028 with Studies-table footnote, fix mpraVarDb.html
download-server paths, soften imprecise "51 MPRA experiments" claim in
mpra.html and mprabase.html.
relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.
Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
- src/hg/makeDb/trackDb/human/hg38/mprabase.html
- lines changed 9, context: html, text, full: html, text
888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
QA fixes for MPRA superTrack. refs #37359
Fix broken mpraVarDb bigDataUrl — pointed at /gbdb/hg38/mpra/mpravardb.bb
but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing
hgTrackDb -strict to silently drop the subtrack.
Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8
in user-visible string fields (curly quotes, primes, NBSP mojibake) that
the browser does not transcode, eliminating ~246k non-ASCII occurrences
across 42% of rows; and change safe_float / pval_to_score to write NaN
and return score 0 for NA / out-of-range p-values instead of 0.0 and
score 1000 (previously inflated untested variants to the top of
score-sorted views).
trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous
type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant
mouseOverField, align parent mpra on, add filterValues for
cell_line/assay/cellLine and filterByRange sliders for percentile_rank /
fdr / log2FC, add labelFields and maxWindowToDraw.
Description pages: add cross-species disclosure (mouse reporter cells
used to assay human sequences), update mpraVarDb header to post-liftOver
count 239,028 with Studies-table footnote, fix mpraVarDb.html
download-server paths, soften imprecise "51 MPRA experiments" claim in
mpra.html and mprabase.html.
relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.
Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
- src/hg/makeDb/trackDb/human/hg38/nmd.html
- lines changed 4, context: html, text, full: html, text
3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737
Script: added a fourth rule to genePredNmdEsc. Coding exons longer than
400 bp (excluding the last coding exon, which is already covered by the
50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and
NCBI RefSeq bigBed files.
trackDb:
- nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode
default visibility to dense so the track is visible in cart-reset
views, changed all four NMDetective subtracks from "visibility full"
to "visibility hide", updated pennantIcon to the Apr. 22, 2026
release date and anchor.
- nmd.html: mention long internal exons in the overview description,
update the rule count from three to four.
- nmdEscTranscripts.html: add the long-exon rule to the rule list and
color legend (gold, #FFD700), expand the Background section with
mechanisms for the intronless, start-proximal, and long-exon rules,
correct the 50 bp rule description to include the entire last coding
exon, fix Lindeboom 2016 author initials (RG -> RGH).
News:
- newsarch.html: add the 2026-04-22 NMD Escape news entry covering all
four rules, with acknowledgements to Guido Neidhardt and Andreas
Lahner for suggesting the track and the Decipher Genome Browser team
for inspiring the visualization.
- indexNews.html: add the front-page news link.
makedoc:
- nmd.txt: dated note for the Rule 4 rebuild.
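A minimal sketch of the long-exon rule as this commit describes it, assuming coding exons are given as (start, end) half-open blocks in 5' to 3' transcript order and that the whole qualifying exon is flagged. The function name long_exon_regions is hypothetical, and this is not the code of genePredNmdEsc.

```python
def long_exon_regions(cds_exons, min_len=400):
    """Return coding exons longer than min_len bp, ignoring the last coding exon.

    cds_exons: list of (start, end) half-open coding blocks, 5' to 3'.
    The last coding exon is left out because the 50 bp rule already covers it.
    """
    return [(start, end) for start, end in cds_exons[:-1] if end - start > min_len]


if __name__ == "__main__":
    exons = [(0, 500), (600, 700), (800, 1200), (1300, 1900)]
    # (0, 500) is longer than 400 bp; (800, 1200) is exactly 400 bp and is not
    # flagged; (1300, 1900) is the last coding exon and is skipped.
    print(long_exon_regions(exons))
```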
- lines changed 5, context: html, text, full: html, text
4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737
Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".
Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated
and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt
both tracks.
Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.
Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.
QA cleanups: non-ASCII prime char replaced with its HTML entity, mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.
Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
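The Rule 2 misclassification can be reproduced with a toy record. The key rec["exonCount"] and the name cdsExons come from the commit message; the two predicate functions and the example transcript are hypothetical and only illustrate why the two tests disagree.

```python
def is_intronless_old(rec, cds_exons):
    """Old test: one CDS block, which also matches transcripts with UTR introns."""
    return len(cds_exons) == 1


def is_intronless_new(rec, cds_exons):
    """Fixed test: the transcript itself must consist of a single exon."""
    return rec["exonCount"] == 1


if __name__ == "__main__":
    # Hypothetical three-exon transcript whose CDS sits entirely in exon 2,
    # so both introns fall in the UTRs.
    rec = {"exonCount": 3}
    cds_exons = [(1200, 1800)]
    print(is_intronless_old(rec, cds_exons))  # True: misclassified as intronless
    print(is_intronless_new(rec, cds_exons))  # False: left for the other rules
```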
- src/hg/makeDb/trackDb/human/hg38/nmd.ra
- lines changed 8, context: html, text, full: html, text
3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737
Script: added a fourth rule to genePredNmdEsc. Coding exons longer than
400 bp (excluding the last coding exon, which is already covered by the
50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and
NCBI RefSeq bigBed files.
trackDb:
- nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode
default visibility to dense so the track is visible in cart-reset
views, changed all four NMDetective subtracks from "visibility full"
to "visibility hide", updated pennantIcon to the Apr. 22, 2026
release date and anchor.
- nmd.html: mention long internal exons in the overview description,
update the rule count from three to four.
- nmdEscTranscripts.html: add the long-exon rule to the rule list and
color legend (gold, #FFD700), expand the Background section with
mechanisms for the intronless, start-proximal, and long-exon rules,
correct the 50 bp rule description to include the entire last coding
exon, fix Lindeboom 2016 author initials (RG -> RGH).
News:
- newsarch.html: add the 2026-04-22 NMD Escape news entry covering all
four rules, with acknowledgements to Guido Neidhardt and Andreas
Lahner for suggesting the track and the Decipher Genome Browser team
for inspiring the visualization.
- indexNews.html: add the front-page news link.
makedoc:
- nmd.txt: dated note for the Rule 4 rebuild.
- lines changed 13, context: html, text, full: html, text
4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737
Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".
Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated
and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt
both tracks.
Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.
Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.
QA cleanups: non-ASCII prime char replaced with its HTML entity, mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.
Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 2, context: html, text, full: html, text
fe73446acf43f70e385dadbbb281634adf3cac9e Tue Apr 21 16:44:16 2026 -0700
NMD Escape QA tweaks: hide Gencode subtrack by default, bold rule numbers in mouseovers. refs #33737
- nmdEscGencode default visibility changed from on/dense to off/hide so
only the RefSeq Curated subtrack is on by default. Per Lou's request.
- RULE_DESCRIPTIONS mouseover strings wrap the rule number in <b>...</b>
so the rule shows bold in the tooltip. Both bigBeds rebuilt.
- lines changed 2, context: html, text, full: html, text
a86b49667ad82b0f6c3745379f186f4d5753e368 Wed Apr 22 13:52:14 2026 -0700
Simplify NMD Escape subtrack longLabels. refs #33737
The '50bp/100bp/intronless/400nt' rule-list became inaccurate after the
Rule 2 refinement (Rule 2 now covers single coding exon + no 3'UTR
intron, not just intronless). Drop the enumerated rules from the
longLabel and defer to the track description page for rule detail.
- lines changed 21, context: html, text, full: html, text
0151d00a4a1d73a78c35f6158c6c936ff338faeb Fri Apr 24 10:37:34 2026 -0700
NMD Escape: MANE subtrack, Rule 1 bug fix, transcript filter. refs #33737
- Add nmdEscMane subtrack (MANE Select Plus Clinical 1.5), built from
/gbdb/hg38/mane/mane.bb. Reuses nmdEscTranscripts.html.
- Fix Rule 1: measure 50 bp upstream of the transcript's last splice
junction (including 3'UTR introns) rather than stripping 3'UTR from
the exon list first. The old logic painted the entire last CDS exon
as NMD-escape whenever the transcript had only one CDS exon, even
when a 3'UTR intron sat far past the stop codon (e.g. NBDY: 207 bp
of CDS over-painted for a junction 2.6 kb past the stop).
- Add --rule1-mode {cds,mrna} (default cds): cds counts only CDS bp
on the walk-back (paints up to 50 bp of CDS matching the rule label
literally); mrna counts mRNA bp and clips to CDS (tracks the 55 bp
rule literature). Documented in makeDoc.
- Rule 4: when a 3'UTR intron exists, the last CDS-containing exon
has a downstream EJC and is now eligible for the long-exon rule.
- Mouseover lists contributing transcript accessions when 1-3 items
collapse into a region; falls back to a count above that.
- Add filterText/filterType/filterLabel on all three escape subtracks
so a user can narrow the display to one transcript.
- genePredNmdEsc: --gene-sym-field (default 17 for Gencode; pass 18
for MANE, whose HGNC symbol lives in bigGenePred geneName2).
- Add findShortTxLongUtrIntron.py helper for finding MANE transcripts
with long UTR introns (used to pick NMD edge-case test cases).
Post-fix collapsed-region counts (--rule1-mode=cds):
MANE 1.5: 67,752
Gencode V49: 233,375
RefSeq Curated: 112,356
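The Rule 1 walk-back described in this commit can be sketched in transcript (mRNA) coordinates, which avoids strand handling. This is an illustration only: the function name, its arguments, and the exact clipping behavior are assumptions, not the code in genePredNmdEsc.

```python
WINDOW = 50  # bp measured upstream of the last exon-exon junction


def rule1_region(exon_lens, cds_start, cds_end, mode="cds"):
    """Return the (start, end) CDS span painted by Rule 1, or None.

    Hypothetical helper. All coordinates are 0-based, half-open mRNA
    positions; exon_lens lists every exon (UTR exons included) in
    transcript order, so 3'UTR introns count as junctions.
    """
    if len(exon_lens) < 2:
        return None  # no splice junction, so Rule 1 does not apply
    junction = sum(exon_lens[:-1])  # mRNA position of the last junction
    if mode == "mrna":
        # Count mRNA bases back from the junction, then clip to the CDS.
        start = max(cds_start, junction - WINDOW)
    elif mode == "cds":
        # Count only CDS bases on the walk back from the junction.
        start = max(cds_start, min(cds_end, junction) - WINDOW)
    else:
        raise ValueError("mode must be 'cds' or 'mrna'")
    return (start, cds_end) if start < cds_end else None
```

For an NBDY-like layout (207 bp of CDS, last junction 2.6 kb past the stop), the mrna mode paints nothing and the cds mode paints the final 50 bp of CDS, rather than the whole 207 bp exon.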
- lines changed 7, context: html, text, full: html, text
3a62ea7e9a8cb3503586a0a78570331308c9bc58 Mon Apr 27 02:23:00 2026 -0700
NMD Escape MANE: expose NM_ accession via labelFields. refs #33737
Per QA, the MANE subtrack now shows the NCBI RefSeq accession by default
instead of the HGNC gene symbol, with the ENST and gene symbol still
selectable via labelFields.
- genePredNmdEsc: new --ncbi-id-field N option (default -1 = unused).
When set, the named bigGenePred column is captured per-transcript and
written into a new ncbiIds output column. For MANE pass 21.
- genePredNmdEsc: new --no-collapse option. By default, regions with
identical (chrom, start, end, rule) from multiple transcripts collapse
into one row with comma-separated lists. With --no-collapse the script
emits one row per (transcript, region). Used for MANE so each
label-field column holds a single value: the 74 MANE Plus Clinical
genes (e.g. LMNA) get two rows per region instead of one row with a
two-element list.
- nmdEscCollapsed.as: add lstring ncbiIds column. Schema is now bed9+3.
- nmd.ra (nmdEscMane only): labelFields ncbiIds,name,transcripts;
defaultLabelFields ncbiIds; labelSeparator " / ". Gencode and RefSeq
subtracks unchanged - they default to the gene symbol (name column)
and have an empty ncbiIds column.
- doc/hg38/nmd.txt: bump all three bedToBigBed invocations to bed9+3
and document the --ncbi-id-field 21 + --no-collapse invocation for
MANE.
Counts: MANE 68,028 (--no-collapse); Gencode 233,375; RefSeq 112,356.
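The difference between the default collapse and --no-collapse can be sketched as follows. The helper names and the five-field tuple layout are hypothetical; the real script writes more columns.

```python
from collections import defaultdict


def emit_rows(hits, collapse=True):
    """Group per-transcript hits into output rows.

    hits: (chrom, start, end, rule, transcript) tuples (assumed layout).
    With collapse=True, hits sharing (chrom, start, end, rule) become one
    row whose last field is a comma-separated transcript list.
    """
    if not collapse:
        return [tuple(hit) for hit in hits]  # one row per (transcript, region)
    grouped = defaultdict(list)
    for chrom, start, end, rule, tx in hits:
        grouped[(chrom, start, end, rule)].append(tx)
    return [key + (",".join(txs),) for key, txs in grouped.items()]


def mouseover_transcripts(tx_list, max_listed=3):
    """List up to three accessions, otherwise fall back to a count."""
    names = tx_list.split(",")
    if len(names) <= max_listed:
        return ", ".join(names)
    return f"{len(names)} transcripts"
```

With two transcripts covering the same region, the collapsed output holds one row with a two-element list, while --no-collapse keeps two rows. That is consistent with the MANE count rising from 67,752 collapsed regions to 68,028 rows.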
- src/hg/makeDb/trackDb/human/hg38/nmdDetective.html
- lines changed 2, context: html, text, full: html, text
4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737
Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".
Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated
and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt
both tracks.
Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.
Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.
QA cleanups: non-ASCII prime char replaced with its HTML entity, mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.
Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- src/hg/makeDb/trackDb/human/hg38/nmdEscTranscripts.html
- lines changed 36, context: html, text, full: html, text
3972ba54c468ace338d4a5578de1d20bf6c1f9ec Mon Apr 20 15:39:26 2026 -0700
Adding Rule 4 (long-exon rule, Lindeboom 2016) to NMD Escape tracks and releasing on Apr. 22, 2026. refs #33737
Script: added a fourth rule to genePredNmdEsc. Coding exons longer than
400 bp (excluding the last coding exon, which is already covered by the
50 bp rule) are flagged as NMD-escape regions. Rebuilt the Gencode and
NCBI RefSeq bigBed files.
trackDb:
- nmd.ra: appended "/400nt" to the nmdEsc longLabels, set nmdEscGencode
default visibility to dense so the track is visible in cart-reset
views, changed all four NMDetective subtracks from "visibility full"
to "visibility hide", updated pennantIcon to the Apr. 22, 2026
release date and anchor.
- nmd.html: mention long internal exons in the overview description,
update the rule count from three to four.
- nmdEscTranscripts.html: add the long-exon rule to the rule list and
color legend (gold, #FFD700), expand the Background section with
mechanisms for the intronless, start-proximal, and long-exon rules,
correct the 50 bp rule description to include the entire last coding
exon, fix Lindeboom 2016 author initials (RG -> RGH).
News:
- newsarch.html: add the 2026-04-22 NMD Escape news entry covering all
four rules, with acknowledgements to Guido Neidhardt and Andreas
Lahner for suggesting the track and the Decipher Genome Browser team
for inspiring the visualization.
- indexNews.html: add the front-page news link.
makedoc:
- nmd.txt: dated note for the Rule 4 rebuild.
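The long-exon rule introduced by this commit can be sketched as a simple length test. The function name is hypothetical, and measuring the coding portion of each exon (rather than the full exon) is an assumption.

```python
LONG_EXON_BP = 400  # threshold from the commit message


def rule4_exons(cds_exon_lens):
    """Return indexes of coding exons flagged by the long-exon rule.

    cds_exon_lens: coding lengths in transcript order (assumed input).
    The last coding exon is skipped because the 50 bp rule covers it.
    """
    return [i for i, n in enumerate(cds_exon_lens[:-1]) if n > LONG_EXON_BP]
```

An exon of exactly 400 bp is not flagged, since the rule requires exons longer than 400 bp.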
- lines changed 8, context: html, text, full: html, text
4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737
Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".
Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated
and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt
both tracks.
Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.
Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.
QA cleanups: non-ASCII prime char replaced with its HTML entity, mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.
Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
- lines changed 16, context: html, text, full: html, text
34d2eee845f5f45e571d1e153c632683b8a93f75 Tue Apr 21 16:17:53 2026 -0700
Refine NMD Escape Rule 2 gate to "single coding exon and no 3'UTR intron". refs #33737
Previously Rule 2 required exonCount==1 (truly intronless). This
overcorrected for single-CDS-exon transcripts whose only introns are in
the 5'UTR: biologically these have no EJC downstream of the stop codon
(5'UTR EJCs are cleared by the scanning 40S or sit upstream of the
terminating ribosome) and are NMD-immune, but the code pushed them to
Rules 1/3 under a less accurate "last coding exon" label.
New gate: len(cdsExons) == 1 AND no exon-exon junction strictly
downstream of the stop codon (strand-aware). Transcripts with a single
coding exon but a 3'UTR intron correctly stay in Rules 1/3 because that
intron deposits an EJC that can trigger NMD.
3,113 RefSeq Curated and 10,790 Gencode V49 transcripts move into Rule
2. 140 RefSeq and 1,135 Gencode single-CDS-exon transcripts with 3'UTR
introns correctly remain in Rules 1/3. Description page and makedoc
updated.
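The refined Rule 2 gate can be sketched as below. This is a minimal illustration under stated assumptions: exons are 0-based, half-open genomic intervals sorted by ascending start, and the function name is hypothetical rather than taken from genePredNmdEsc.

```python
def is_rule2(exons, cds_start, cds_end, strand):
    """True when the transcript has a single coding exon and no
    exon-exon junction downstream of the stop codon (strand-aware)."""
    cds_exons = [(s, e) for s, e in exons if s < cds_end and e > cds_start]
    if len(cds_exons) != 1:
        return False
    if strand == "+":
        # Stop codon ends at cds_end; any non-final exon that ends at or
        # past it is followed by a junction in the 3'UTR.
        junction_after_stop = any(e >= cds_end for _, e in exons[:-1])
    else:
        # On the minus strand, downstream means lower coordinates, so a
        # non-first exon starting at or before cds_start has a 3'UTR
        # junction on its low side.
        junction_after_stop = any(s <= cds_start for s, _ in exons[1:])
    return not junction_after_stop
```

The reported counts agree with the earlier commit: 3,113 RefSeq Curated transcripts moving into Rule 2 plus 140 remaining in Rules 1/3 equals the 3,253 that were reassigned out of Rule 2 previously.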
- lines changed 18, context: html, text, full: html, text
0151d00a4a1d73a78c35f6158c6c936ff338faeb Fri Apr 24 10:37:34 2026 -0700
NMD Escape: MANE subtrack, Rule 1 bug fix, transcript filter. refs #33737
- Add nmdEscMane subtrack (MANE Select Plus Clinical 1.5), built from
/gbdb/hg38/mane/mane.bb. Reuses nmdEscTranscripts.html.
- Fix Rule 1: measure 50 bp upstream of the transcript's last splice
junction (including 3'UTR introns) rather than stripping 3'UTR from
the exon list first. The old logic painted the entire last CDS exon
as NMD-escape whenever the transcript had only one CDS exon, even
when a 3'UTR intron sat far past the stop codon (e.g. NBDY: 207 bp
of CDS over-painted for a junction 2.6 kb past the stop).
- Add --rule1-mode {cds,mrna} (default cds): cds counts only CDS bp
on the walk-back (paints up to 50 bp of CDS matching the rule label
literally); mrna counts mRNA bp and clips to CDS (tracks the 55 bp
rule literature). Documented in makeDoc.
- Rule 4: when a 3'UTR intron exists, the last CDS-containing exon
has a downstream EJC and is now eligible for the long-exon rule.
- Mouseover lists contributing transcript accessions when 1-3 items
collapse into a region; falls back to a count above that.
- Add filterText/filterType/filterLabel on all three escape subtracks
so a user can narrow the display to one transcript.
- genePredNmdEsc: --gene-sym-field (default 17 for Gencode; pass 18
for MANE, whose HGNC symbol lives in bigGenePred geneName2).
- Add findShortTxLongUtrIntron.py helper for finding MANE transcripts
with long UTR introns (used to pick NMD edge-case test cases).
Post-fix collapsed-region counts (--rule1-mode=cds):
MANE 1.5: 67,752
Gencode V49: 233,375
RefSeq Curated: 112,356
- src/hg/makeDb/trackDb/human/hg38/trackDb.gencode.ra
- lines changed 1, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
Import of GENCODE V50, MV39, and V50lift37; added a way to run the whole import with a single command
- src/hg/makeDb/trackDb/human/hg38/trackDb.ra
- lines changed 2, context: html, text, full: html, text
65091fe6f6487c23d650a144e947fc1c582d3f40 Tue Apr 21 02:16:16 2026 -0700
abelSv: move under lrSv supertrack as short-read comparison subtrack
Move the Abel et al. 2020 CCDG 17,795-genome SV callset from a
top-level hg38 track to a subtrack of the lrSv supertrack (parallel
to onekg3202Sr) and relabel shortLabel/longLabel to flag Illumina
short-read provenance. The same bigBed is now visible on hg38 in
the long-read SV browsing context. Also:
- Clarify abelSv.html variant counts: 738,624 upstream unique SVs
across both callsets, 737,998 after B37->hg38 liftOver (626
unmapped). B38=458,106, B37lift=279,892.
- lrSv.html: fix triple-slash https:/// in the Ebert et al. Science
reference URL.
- bigBed.html: add closing </li> on the extra-fields pipe-separator
bullet and tighten a comma in the same sentence.
refs #36258, refs #37376
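The variant counts quoted in this commit are mutually consistent, which a short check makes explicit. The variable names are illustrative only; the pre-liftOver B37 count is derived here, not quoted from the commit.

```python
upstream_unique = 738_624  # unique SVs across both callsets
unmapped = 626             # lost in the B37 -> hg38 liftOver
b38 = 458_106              # SVs from the B38 callset
b37_lifted = 279_892       # B37 SVs that lifted to hg38

after_liftover = upstream_unique - unmapped
b37_before_lift = b37_lifted + unmapped  # derived, not stated in the commit

assert after_liftover == 737_998
assert b38 + b37_lifted == after_liftover
```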
- lines changed 1, context: html, text, full: html, text
33e9019ef1b239ca1ab8114818f09ad65f58f2d0 Wed Apr 22 13:10:23 2026 -0700
Release NMD Escape supertrack to beta + public. refs #33737
Drop the 'alpha' gate on include nmd.ra in hg38 trackDb.ra so the
supertrack flows through the trackDb push pipeline to hgwbeta and the
RR. /gbdb/hg38/nmd/*.bb files are already on the RR.
- lines changed 1, context: html, text, full: html, text
f30798ae5d11e88e0ab7eb2bcab634e253fd0675 Thu Apr 23 10:36:40 2026 -0700
Add gnomAD MPC v4.1.1 track to hg38.
New composite track under the gnomAD container showing per-variant
MPC (Missense deleteriousness Prediction by Constraint) scores from
gnomAD v4.1.1. Four bigWigs provide per-base scores (one per ALT
nucleotide); a companion bigBed carries the ~250K multi-transcript
variants with a per-transcript breakdown. Included via 'alpha' for
QA review. refs #37434
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/human/hg38/wgEncodeGencodeV50.html
- lines changed 30, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
Import of GENCODE V50, MV39, and V50lift37; added a way to run the whole import with a single command
- src/hg/makeDb/trackDb/human/hg38/wgEncodeGencodeV50.ra
- lines changed 247, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
Import of GENCODE V50, MV39, and V50lift37; added a way to run the whole import with a single command
- src/hg/makeDb/trackDb/human/hg38/wgEncodeReg4Epigenetics.ra
- lines changed 1, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- lines changed 1, context: html, text, full: html, text
e97b93d010dad47ba66947bcd40501d99b1545d5 Thu Apr 23 09:06:43 2026 -0700
Using correct URL for ENCODE 4 experiment links, refs #36320
- src/hg/makeDb/trackDb/human/hg38/wgEncodeReg4RnaSeq.ra
- lines changed 1, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- src/hg/makeDb/trackDb/human/hg38/wgEncodeReg4TfChip.ra
- lines changed 1, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- lines changed 1, context: html, text, full: html, text
e75f51e55d0464a3e659a16a19514fdb9b33bdd9 Wed Apr 22 09:51:32 2026 -0700
Also adding Experiment linkout to the EncodeReg4TfChip faceted composite, refs #36320
- lines changed 1, context: html, text, full: html, text
e97b93d010dad47ba66947bcd40501d99b1545d5 Thu Apr 23 09:06:43 2026 -0700
Using correct URL for ENCODE 4 experiment links, refs #36320
- src/hg/makeDb/trackDb/human/hgsvc2Sv.html
- lines changed 122, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 29, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
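The "CODE|CODE (Long name)" pattern can be generated rather than typed by hand. The sketch below covers only the two codes named in the commit; the helper name, the comma separator between entries, and the `filterValues.svType` setting name are assumptions about how the stanza is written.

```python
# Only the two codes spelled out in the commit message are listed here.
SV_TYPE_NAMES = {"DEL": "Deletion", "INS": "Insertion"}


def filter_values(names):
    """Build a value|label list that keeps the short code in the label."""
    return ",".join(f"{code}|{code} ({long_name})" for code, long_name in names.items())


# Assumed trackDb line; the setting name is not quoted in the commit.
setting_line = "filterValues.svType " + filter_values(SV_TYPE_NAMES)
```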
- src/hg/makeDb/trackDb/human/hgsvc3Sv.html
- lines changed 27, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 38, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/hprc2Sv.html
- lines changed 96, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 37, context: html, text, full: html, text
40c7b6fb506ddde686cd56a976b8a07e46db775b Tue Apr 21 08:19:46 2026 -0700
hprc2Sv: highlight alignments_v2.0.csv as authoritative file list
Reframe the existing link to HPRC's alignments_v2.0.csv in the
Methods section so it is clear that this CSV points to both the
GRCh38 and CHM13 pangenome VCFs used for the track (not just the
list of underlying assemblies). Existing S3 download links preserved.
refs #36258
- lines changed 31, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/kwanhoSv.html
- lines changed 9, context: html, text, full: html, text
989d891c0c0a500e584d55f2f368b52f2abe5f1d Tue Apr 21 06:25:48 2026 -0700
kwanhoSv: flag as preliminary, ask users to contact authors
Mark the Kim et al. 2026 PD brain long-read SV subtrack as
preliminary in its shortLabel and longLabel, and add a prominent
warning banner at the top of the description page telling users to
contact the authors (ASAP / Kim lab) before using the data, since
the callset will be updated before publication.
refs #36258
- lines changed 36, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/longReadVariants.ra
- lines changed 1, context: html, text, full: html, text
96e0a50a6dc4cba147be2dfed06775ccabf25378 Fri Apr 24 15:26:48 2026 -0700
Fixing the bigDataUrl since the track disappeared on the RR, refs #36258
- src/hg/makeDb/trackDb/human/lrSv.html
- lines changed 143, context: html, text, full: html, text
f4d6633d6a724d7b682f9f49ed983e22a5e0975d Mon Apr 20 14:41:07 2026 -0700
updating a few lrSv subtracks, and moving the colorsDbSnv track under
the varFreqs track. refs #36642
- lines changed 1, context: html, text, full: html, text
65091fe6f6487c23d650a144e947fc1c582d3f40 Tue Apr 21 02:16:16 2026 -0700
abelSv: move under lrSv supertrack as short-read comparison subtrack
Move the Abel et al. 2020 CCDG 17,795-genome SV callset from a
top-level hg38 track to a subtrack of the lrSv supertrack (parallel
to onekg3202Sr) and relabel shortLabel/longLabel to flag Illumina
short-read provenance. The same bigBed is now visible on hg38 in
the long-read SV browsing context. Also:
- Clarify abelSv.html variant counts: 738,624 upstream unique SVs
across both callsets, 737,998 after B37->hg38 liftOver (626
unmapped). B38=458,106, B37lift=279,892.
- lrSv.html: fix triple-slash https:/// in the Ebert et al. Science
reference URL.
- bigBed.html: add closing </li> on the extra-fields pipe-separator
bullet and tighten a comma in the same sentence.
refs #36258, refs #37376
- lines changed 21, context: html, text, full: html, text
06a482a2120d4d85c7c34fb5038213e07f595554 Tue Apr 21 15:00:21 2026 -0700
lrSv: add tommoJpCnv short-read CNV comparator (multiWig)
ToMMo 48KJPN-CNV Frequency Panel: copy-number variation frequencies
from short-read whole-genome sequencing of 48,874 Japanese individuals
(jMorp 20230828 release, GATK CNV germline workflow at 1 kb
resolution). Published as a companion short-read comparator to the
long-read tommoJpSv track.
Rendered as a multiWig container with two bigWig subtracks (transparent
overlay): tommoJpCnvLoss.bw counts samples at CN<2 per bin (red) and
tommoJpCnvGain.bw counts samples at CN>2 per bin (green). Values are
absolute carrier counts out of 48,874. 2,006,905 bins with at least one
CNV carrier; bins that are wholly CN=2 are omitted.
Files:
- trackDb/human/lrSv.ra: new tommoJpCnv multiWig container
- trackDb/human/tommoJpCnv.html: new doc page
- trackDb/human/lrSv.html: summary-table row + per-track blurb
- scripts/lrSv/lrSvTommoJpCnvVcfToBedGraph.py: VCF -> two bedGraphs
- doc/hg38/lrSv.txt: wget, converter invocation, bigWig build steps
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
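The loss/gain counting behind the two bigWigs can be sketched as follows. This assumes per-sample copy numbers are available for each bin and a diploid baseline of 2; the actual converter reads a frequency-panel VCF whose fields are not described here, so the function names and input layout are hypothetical.

```python
def bin_counts(copy_numbers):
    """Return (loss, gain) carrier counts for one bin."""
    loss = sum(1 for cn in copy_numbers if cn < 2)
    gain = sum(1 for cn in copy_numbers if cn > 2)
    return loss, gain


def bedgraph_lines(bins):
    """bins: (chrom, start, end, copy_numbers) tuples (assumed layout).

    Returns (loss_lines, gain_lines); a bin is written to a file only
    when it has at least one carrier of that kind, so bins that are
    wholly CN=2 appear in neither output.
    """
    loss_lines, gain_lines = [], []
    for chrom, start, end, cns in bins:
        loss, gain = bin_counts(cns)
        if loss:
            loss_lines.append(f"{chrom}\t{start}\t{end}\t{loss}")
        if gain:
            gain_lines.append(f"{chrom}\t{start}\t{end}\t{gain}")
    return loss_lines, gain_lines
```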
- lines changed 33, context: html, text, full: html, text
151410cc48b9b1f8b1cb9bee89b7004eca871c61 Wed Apr 22 09:03:35 2026 -0700
lrSv: harmonize long-read shortLabels, add aprSv/cpc1Sv/abelSv to overview
Normalize the shortLabel text of every long-read subtrack to the pattern
"<Cohort> <N> SVs" (no commas in N): CoLoRSdb 1427, AoU 1027, ToMMo 333,
GA4K 502, deCODE 3622, HPRC v2 233, Kim PD 100 prelim. Short-read
comparators (abelSv, onekg3202Sr, tommoJpCnv) are left alone per user
instruction.
Also add three rows that were missing from lrSv.html's overview table:
aprSv (Arab APR 53), cpc1Sv (CPC 58, HPRC-specific SVs removed) and
abelSv (CCDG 17,795 Illumina short-read comparator). Updates the
comparator footnote to mention both short-read rows.
refs #36258
- lines changed 50, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- src/hg/makeDb/trackDb/human/lrSv.ra
- lines changed 39, context: html, text, full: html, text
65091fe6f6487c23d650a144e947fc1c582d3f40 Tue Apr 21 02:16:16 2026 -0700
abelSv: move under lrSv supertrack as short-read comparison subtrack
Move the Abel et al. 2020 CCDG 17,795-genome SV callset from a
top-level hg38 track to a subtrack of the lrSv supertrack (parallel
to onekg3202Sr) and relabel shortLabel/longLabel to flag Illumina
short-read provenance. The same bigBed is now visible on hg38 in
the long-read SV browsing context. Also:
- Clarify abelSv.html variant counts: 738,624 upstream unique SVs
across both callsets, 737,998 after B37->hg38 liftOver (626
unmapped). B38=458,106, B37lift=279,892.
- lrSv.html: fix triple-slash https:/// in the Ebert et al. Science
reference URL.
- bigBed.html: add closing </li> on the extra-fields pipe-separator
bullet and tighten a comma in the same sentence.
refs #36258, refs #37376
- lines changed 2, context: html, text, full: html, text
989d891c0c0a500e584d55f2f368b52f2abe5f1d Tue Apr 21 06:25:48 2026 -0700
kwanhoSv: flag as preliminary, ask users to contact authors
Mark the Kim et al. 2026 PD brain long-read SV subtrack as
preliminary in its shortLabel and longLabel, and add a prominent
warning banner at the top of the description page telling users to
contact the authors (ASAP / Kim lab) before using the data, since
the callset will be updated before publication.
refs #36258
- lines changed 3, context: html, text, full: html, text
5e4ca58df1b5bfe554fe5cc3309a39736ca256ee Tue Apr 21 08:08:52 2026 -0700
cpc1Sv: restrict to the 58 CPC samples, drop HPRC-specific SVs
Rewrite lrSvCpc1VcfToBed.py to identify the 58 CPC sample columns by
name prefix (HIFI032* or RY*), recompute AC/AN/NS from those GT
columns only, and skip any snarl that no CPC sample carries. The
HPRC portion is already represented elsewhere in lrSv, so this keeps
the track population-consistent with its label.
Rebuild results: 46,092 snarl sites on hs1 (down from 97,205 when
combined with HPRC), 36,030 lifted to hg38 (down from 81,261;
10,062 unmapped). Updates cpc1Sv.html, lrSv.ra labels, and the
makeDoc.
refs #36258
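The sample restriction described above can be sketched like this. It is a simplified illustration, not lrSvCpc1VcfToBed.py itself: the function names are hypothetical, and every non-reference allele is counted toward AC without per-allele bookkeeping.

```python
import re

CPC_PREFIXES = ("HIFI032", "RY")  # name prefixes given in the commit


def cpc_columns(sample_names):
    """Indexes of the sample columns that belong to CPC."""
    return [i for i, name in enumerate(sample_names) if name.startswith(CPC_PREFIXES)]


def recompute(sample_fields, cols):
    """Return (AC, AN, NS) from the CPC genotypes, or None when no CPC
    sample carries a non-reference allele (the site is then skipped)."""
    ac = an = ns = 0
    for i in cols:
        gt = sample_fields[i].split(":")[0]  # GT is the first subfield
        alleles = [a for a in re.split(r"[/|]", gt) if a != "."]
        if alleles:
            ns += 1  # sample has at least one called allele
        an += len(alleles)
        ac += sum(1 for a in alleles if a != "0")
    return (ac, an, ns) if ac > 0 else None
```

The rebuild counts are self-consistent: 46,092 sites on hs1 minus 36,030 lifted to hg38 leaves the 10,062 reported as unmapped.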
- lines changed 33, context: html, text, full: html, text
06a482a2120d4d85c7c34fb5038213e07f595554 Tue Apr 21 15:00:21 2026 -0700
lrSv: add tommoJpCnv short-read CNV comparator (multiWig)
ToMMo 48KJPN-CNV Frequency Panel: copy-number variation frequencies
from short-read whole-genome sequencing of 48,874 Japanese individuals
(jMorp 20230828 release, GATK CNV germline workflow at 1 kb
resolution). Published as a companion short-read comparator to the
long-read tommoJpSv track.
Rendered as a multiWig container with two bigWig subtracks (transparent
overlay): tommoJpCnvLoss.bw counts samples at CN<2 per bin (red) and
tommoJpCnvGain.bw counts samples at CN>2 per bin (green). Values are
absolute carrier counts out of 48,874. 2,006,905 bins with at least one
CNV carrier; bins that are wholly CN=2 are omitted.
Files:
- trackDb/human/lrSv.ra: new tommoJpCnv multiWig container
- trackDb/human/tommoJpCnv.html: new doc page
- trackDb/human/lrSv.html: summary-table row + per-track blurb
- scripts/lrSv/lrSvTommoJpCnvVcfToBedGraph.py: VCF -> two bedGraphs
- doc/hg38/lrSv.txt: wget, converter invocation, bigWig build steps
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 161, context: html, text, full: html, text
1732661494ece5e645a9522f15a0f5922b035d1a Wed Apr 22 08:57:11 2026 -0700
colorsDbSv: rebuild from pbsv+Jasmine source VCFs with richer AS
Rebuild the CoLoRSdb SV bigBeds for hg38 and hs1 from the upstream
pbsv+Jasmine VCFs that the CoLoRSdb project distributes directly.
The previous bigBed stored AF as a string (breaking the numeric
filter) and lacked insLen (causing a "filter on field insLen not in
AS file" error under the supertrack-level filter). The new build:
- stores AF as a float
- adds a derived insLen column (alt-ref length delta for INS, 0
otherwise) so the shared lrSv insLen filter applies
- keeps every INFO field from the source (SVTYPE, SVLEN, END, AC,
AN, NS, AC_Hom, AC_Het, AC_Hemi, AF, HWE, ExcHet, nhomalt) plus
REF/ALT
- uses the canonical svName(TYPE, featLen, AC) label via lrSvCommon
Record counts match the source VCFs: 426,239 on hg38 (59 MB) and
839,714 on hs1 (87 MB). /gbdb symlinks unchanged. The trackDb
colorsDbSv stanza is updated to reference the new AS field names
(acHom/acHet/acHemi, AF, AN) and to add the insLen filter. Also
fixes a nearby `version 1.1` -> `dataVersion 1.1` typo in
lrSv1kgOnt that was failing the tagTypes check.
refs #36258
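A minimal sketch of the two field fixes described above (AF stored as a float, insLen derived from REF/ALT). The function names are hypothetical and this is not the actual converter; it assumes ALT is an explicit sequence rather than a symbolic allele such as <INS>.

```python
def ins_len(sv_type, ref, alt):
    """Derived insLen: alt-ref length delta for INS, 0 for every other type."""
    if sv_type != "INS":
        return 0
    return len(alt) - len(ref)

def parse_af(info):
    """Return AF as a float so a numeric filter can compare it.

    info: dict of INFO key -> string value, as parsed from one VCF row.
    """
    return float(info["AF"])
```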
- lines changed 7, context: html, text, full: html, text
151410cc48b9b1f8b1cb9bee89b7004eca871c61 Wed Apr 22 09:03:35 2026 -0700
lrSv: harmonize long-read shortLabels, add aprSv/cpc1Sv/abelSv to overview
Normalize the shortLabel text of every long-read subtrack to the pattern
"<Cohort> <N> SVs" (no commas in N): CoLoRSdb 1427, AoU 1027, ToMMo 333,
GA4K 502, deCODE 3622, HPRC v2 233, Kim PD 100 prelim. Short-read
comparators (abelSv, onekg3202Sr, tommoJpCnv) are left alone per user
instruction.
Also add three rows that were missing from lrSv.html's overview table:
aprSv (Arab APR 53), cpc1Sv (CPC 58, HPRC-specific SVs removed) and
abelSv (CCDG 17,795 Illumina short-read comparator). Updates the
comparator footnote to mention both short-read rows.
refs #36258
- lines changed 114, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- lines changed 1, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
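For illustration, the "CODE|CODE (Long name)" pairs can be generated from a small mapping. The helper below is hypothetical; the comma separator and the filterValues.<field> setting name follow the usual trackDb convention and should be treated as assumptions here. Only the two codes named in the commit message are shown.

```python
def filter_values_line(field, long_names):
    """Build a trackDb filterValues line with human-readable labels.

    long_names: dict of short code -> long name, e.g. {"DEL": "Deletion"}.
    Each entry becomes "CODE|CODE (Long name)", keeping the short code up front.
    """
    pairs = ",".join(f"{code}|{code} ({name})" for code, name in long_names.items())
    return f"filterValues.{field} {pairs}"

if __name__ == "__main__":
    print(filter_values_line("svType", {"DEL": "Deletion", "INS": "Insertion"}))
```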
- src/hg/makeDb/trackDb/human/lrSv1kgOnt.html
- lines changed 38, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/nmdEscape.ra
- lines changed 22, context: html, text, full: html, text
875dc5de11d5e0eb51e94cb37d5b74da447a6c7b Fri Apr 24 06:17:18 2026 -0700
removing old nmdEscape track from tdb, no redmine
- src/hg/makeDb/trackDb/human/onekg3202Sr.html
- lines changed 112, context: html, text, full: html, text
9a11061ca6b40fe16bdfd09b1af53192f6c7c85b Tue Apr 21 08:13:02 2026 -0700
lrSv: add HTML doc pages and conversion scripts for recent subtracks, + hs1 HGSVC3
Subtrack stanzas for these SV callsets landed in earlier commits but
the conversion scripts and per-track HTML description pages were
never added; trackDb therefore had no doc to serve. This commit
catches up.
Docs (new):
- colorsDbSv.html CoLoRSdb 1,427-sample long-read SVs
- gustafsonSv.html 1KG ONT 100 (Gustafson 2024, PMID 39358015)
- hgsvc2Sv.html HGSVC2 (Ebert 2021, PMID 33632895)
- hprc2Sv.html HPRC release-2 pangenome SVs (no PMID yet;
see humanpangenome.org/hprc-data-release-2/)
- onekg3202Sr.html 1KG 3202 Illumina SHORT-READ GATK-SV
(Byrska-Bishop 2022, PMID 36055201)
Scripts (new):
- lrSvGustafson.as / lrSvGustafsonVcfToBed.py
- lrSvHgsvc2.as / lrSvHgsvc2TsvToBed.py (merges insdel + inv tables)
- lrSvHprc2.as / lrSvHprc2VcfToBed.py (streams wave-decomposed VCF,
explodes multi-allelic rows,
filters to SV-sized or INV)
- lrSv1kg3202Sr.as / lrSv1kg3202SrVcfToBed.py
HGSVC3 also on hs1:
- hgsvc3Sv.html: note that the hs1 build is native (not lifted):
HGSVC3 aligned all assemblies to both GRCh38 and T2T-CHM13 and
released separate annotation tables per reference. Added the
T2T-CHM13 source URL to the Methods section and the hs1 hgsvc3.bb
download link to Data Access.
- doc/hs1/lrSv.txt (new): hs1-specific wget + build steps; refers
back to doc/hg38/lrSv.txt for the full process.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 36, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/primateAi.html
- lines changed 16, context: html, text, full: html, text
50466766840ded6cb8bd5cb868bdf2ff3f613bc0 Tue Apr 21 11:17:15 2026 -0700
QA fixes for PrimateAI-3D track.
Config (primateAi.ra):
- Fix broken Ensembl transcript linkout: the $S in urls expanded to the
chromosome name; switch to the Ensembl transcript page with $$
- Add numeric filters on percentile and raw score (label notes the
paper's 0.821 clinical threshold)
- Add maxWindowToDraw 2000000
Data (primateAiToBigBed.py):
- Change hardcoded strand '+' to '.': the source file has no strand
column
- Accept input/output paths as CLI args (previously hardcoded the hg38
input path)
- Handle variable field count: ~2.4M rows in the hg19 source are
missing the refseq column
Description (primateAi.html):
- Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way
track
- Regenerate the first reference via getTrackReferences (wrong article
number and wrong PMC ID in the previous text)
- Fix the GitHub URL for the conversion script in Methods
- Move the Zoonomia 447-way mention out of Description; rephrase the
license note to describe precisely what is disabled
relatedTracks.ra:
- Add reciprocal cross-links for primateAi <-> alphaMissense (hg38),
primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi
(hg38). Also includes promoterAi <-> alphaMissense cross-links.
refs #37274 #37279
- lines changed 28, context: html, text, full: html, text
de2ccf6d827865f11d3c8edd9ceeb1b6394a7380 Tue Apr 21 18:22:59 2026 -0700
PrimateAI-3D: label items by nucleotide change, add aaChange field and HTML mouseover.
Variant analysts typically work at the nucleotide level, and the current
item label (amino acid change) collapses distinguishable variants: ~17%
of items share their (chrom, pos, AA-change) tuple with another item
because of codon degeneracy (e.g. the substitutions C>A, C>G and C>T at the
same position can all appear as "M>I"). Labeling by nucleotide change makes
every item uniquely distinguishable (0.0% collisions on hg38, 0.1% on
hg19 from overlapping transcripts).
- primateAi.as: field 4 (name) is now "Nucleotide change (e.g. T>C)";
new field aaChange (placed before ref/alt) holds the amino acid
change.
- primateAiToBigBed.py: write name = "{ref}>{alt}", new aaChange column,
and an HTML mouseover with terse labels (Var/AA/Score/Perc/Pred) and
a colored prediction string.
- primateAi.ra: add labelFields name,aaChange and defaultLabelFields
name so users can toggle the on-feature label between nt change
(default) and AA change.
- primateAi.html: expand Display Conventions with the label-convention
rationale and a legend for each mouseover field.
refs #37274
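The collision figures quoted above can be reproduced in spirit with a sketch like the following. The item layout (dicts with chrom, pos, ref, alt, aa keys) and the function name are hypothetical, not taken from primateAiToBigBed.py.

```python
from collections import Counter

def collision_rate(items, key):
    """Fraction of items whose label key is shared with at least one other item."""
    if not items:
        return 0.0
    counts = Counter(key(it) for it in items)
    shared = sum(1 for it in items if counts[key(it)] > 1)
    return shared / len(items)

def aa_key(it):
    """Old label: amino acid change at a position."""
    return (it["chrom"], it["pos"], it["aa"])

def nt_key(it):
    """New label: nucleotide change at a position."""
    return (it["chrom"], it["pos"], f'{it["ref"]}>{it["alt"]}')
```

With three substitutions at one third-codon position that all give M>I, the amino acid key collides for every item while the nucleotide key collides for none.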
- lines changed 11, context: html, text, full: html, text
30374e3fc3390902c35bb463510567f1b6f7a96e Wed Apr 22 13:44:44 2026 -0700
PrimateAI-3D: clarify origin of the 0.821 threshold per Max. refs #37274
Description previously juxtaposed the paper's 0.821 clinical threshold
with the 75/25 benign/pathogenic split in a way that implied the two
were related. Per Max on the ticket: the 0.821 threshold comes from
Gao et al. 2023 Fig. 5A (calibrated against de novo missense excess
in a clinical cohort, n=7,238 pathogenic calls), and the "prediction"
column values are Illumina's own calls — not a simple application of
the 0.821 threshold (some variants below it are labeled pathogenic and
vice versa).
- lines changed 7, context: html, text, full: html, text
6e61d3349b36cbcc01500c1483cc7bfbc141d9ea Wed Apr 22 13:47:33 2026 -0700
PrimateAI-3D: tighten 0.821 threshold wording per the paper. refs #37274
Confirmed against Gao 2023 (PMC10713091): the calibration cohort is the
Deciphering Developmental Disorders (DDD) neurodevelopmental cohort, not
ClinVar. The cutoff was chosen so that the count of pathogenic calls
(n=7,238) matched the excess of de novo missense mutations above the
trinucleotide background expectation in that cohort.
- src/hg/makeDb/trackDb/human/primateAi.ra
- lines changed 10, context: html, text, full: html, text
50466766840ded6cb8bd5cb868bdf2ff3f613bc0 Tue Apr 21 11:17:15 2026 -0700
QA fixes for PrimateAI-3D track.
Config (primateAi.ra):
- Fix broken Ensembl transcript linkout: the $S in urls expanded to the
chromosome name; switch to the Ensembl transcript page with $$
- Add numeric filters on percentile and raw score (label notes the
paper's 0.821 clinical threshold)
- Add maxWindowToDraw 2000000
Data (primateAiToBigBed.py):
- Change hardcoded strand '+' to '.': the source file has no strand
column
- Accept input/output paths as CLI args (previously hardcoded the hg38
input path)
- Handle variable field count: ~2.4M rows in the hg19 source are
missing the refseq column
Description (primateAi.html):
- Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way
track
- Regenerate the first reference via getTrackReferences (wrong article
number and wrong PMC ID in the previous text)
- Fix the GitHub URL for the conversion script in Methods
- Move the Zoonomia 447-way mention out of Description; rephrase the
license note to describe precisely what is disabled
relatedTracks.ra:
- Add reciprocal cross-links for primateAi <-> alphaMissense (hg38),
primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi
(hg38). Also includes promoterAi <-> alphaMissense cross-links.
refs #37274 #37279
- lines changed 2, context: html, text, full: html, text
de2ccf6d827865f11d3c8edd9ceeb1b6394a7380 Tue Apr 21 18:22:59 2026 -0700
PrimateAI-3D: label items by nucleotide change, add aaChange field and HTML mouseover.
Variant analysts typically work at the nucleotide level, and the current
item label (amino acid change) collapses distinguishable variants: ~17%
of items share their (chrom, pos, AA-change) tuple with another item
because of codon degeneracy (e.g. the substitutions C>A, C>G and C>T at the
same position can all appear as "M>I"). Labeling by nucleotide change makes
every item uniquely distinguishable (0.0% collisions on hg38, 0.1% on
hg19 from overlapping transcripts).
- primateAi.as: field 4 (name) is now "Nucleotide change (e.g. T>C)";
new field aaChange (placed before ref/alt) holds the amino acid
change.
- primateAiToBigBed.py: write name = "{ref}>{alt}", new aaChange column,
and an HTML mouseover with terse labels (Var/AA/Score/Perc/Pred) and
a colored prediction string.
- primateAi.ra: add labelFields name,aaChange and defaultLabelFields
name so users can toggle the on-feature label between nt change
(default) and AA change.
- primateAi.html: expand Display Conventions with the label-convention
rationale and a legend for each mouseover field.
refs #37274
- lines changed 1, context: html, text, full: html, text
d07e0de4fba2fc825dd1fdaa37a7cf1f66e4721d Fri Apr 24 17:36:42 2026 -0700
PrimateAI-3D: move /gbdb dir to _primateAi/ to match the underscore-prefix exclusion rule for hgdownload sync. refs #37274
- src/hg/makeDb/trackDb/human/promoterAi.html
- lines changed 51, context: html, text, full: html, text
f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
QA fixes for PromoterAI track. refs #37278
Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D
paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID
40440429), corrected the score-direction wording (negative = under-expression,
positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access
source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover
blurb to match mouseOverFunction noAverage behavior.
Converter and AS: the overlap bigBed now carries the real per-transcript strand
from the source TSV (was hardcoded '+'), with a new strands column in the AS, and
the name field concatenates unique gene symbols so bidirectional-promoter items
read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is
meaningful. Rewrote the converter to stream (sorted input), which drops peak
memory from ~40 GB to a few MB.
trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable
without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the
bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw
red (over-expression) above zero and blue (under-expression) below, matching
the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap
subtrack.
Makedoc updated to describe the streaming pipeline, the new strands column,
and the rebuild workflow.
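A rough sketch of the score and name derivations described above. The helper names are hypothetical; the rounding, the clamp to the BED score range of 0..1000, and first-seen ordering of gene symbols are assumptions, not details confirmed by the commit.

```python
def bed_score(promoter_ai):
    """BED score = |PromoterAI| * 1000, rounded and clamped to 0..1000."""
    return min(1000, int(round(abs(promoter_ai) * 1000)))

def joined_name(symbols):
    """Concatenate unique gene symbols, keeping first-seen order."""
    return ",".join(dict.fromkeys(symbols))
```

Taking the absolute value means scoreFilter selects on effect size regardless of direction; the signed value stays available through the scoreDiff filter.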
- lines changed 16, context: html, text, full: html, text
6c567fd9a03e87610681a43d2183ebb43547d1ad Fri Apr 24 17:58:57 2026 -0700
PromoterAI: review followups. refs #37278
Move /gbdb/hg38/promoterAi/ to /gbdb/hg38/_promoterAi/ to match the
underscore-prefix exclusion rule for hgdownload sync (same pattern as
PrimateAI-3D under refs #37274). bigDataUrls and the makedoc updated.
Bump bigWig maxHeightPixels from 128:20:8 to 128:40:8 -- the peer-track
default of 20 is too cramped for a signed -1..+1 score.
Description page: drop the wrong primateai3d.basespace.illumina.com link
in Data Access; PromoterAI is not on BaseSpace, it's distributed via the
license agreement on the GitHub page (a download link is emailed after
submission). Reword Data Access and Methods accordingly.
Description page: add Illumina's recommended interpretation thresholds
(|score| >= 0.1, >= 0.2, >= 0.5) from the PromoterAI GitHub README, with
a note that higher cutoffs select smaller, higher-confidence sets.
- src/hg/makeDb/trackDb/human/promoterAi.ra
- lines changed 24, context: html, text, full: html, text
f9a89b0e1ce3c937b4fbb879736c1619c35c271f Tue Apr 21 12:11:02 2026 -0700
QA fixes for PromoterAI track. refs #37278
Description page: replaced the wrong reference (Gao et al. 2023, the PrimateAI-3D
paper) with the actual PromoterAI citation (Jaganathan et al. Science 2025, PMID
40440429), corrected the score-direction wording (negative = under-expression,
positive = over-expression, not "tolerated vs disruptive"), fixed the Data Access
source link (Illumina BaseSpace, not the GitHub repo), and corrected the mouseover
blurb to match mouseOverFunction noAverage behavior.
Converter and AS: the overlap bigBed now carries the real per-transcript strand
from the source TSV (was hardcoded '+'), with a new strands column in the AS, and
the name field concatenates unique gene symbols so bidirectional-promoter items
read as "HES4,ISG15" etc. BED score is now |PromoterAI|*1000 so scoreFilter is
meaningful. Rewrote the converter to stream (sorted input), which drops peak
memory from ~40 GB to a few MB.
trackDb: added filterLabel/filterLimits on scoreDiff (the filter was unusable
without labels), scoreFilter + scoreLabel, alwaysZero and autoScale off on the
bigWig subtracks, color 200,0,0 / altColor 0,0,200 so signed bigWig bars draw
red (over-expression) above zero and blue (under-expression) below, matching
the overlap track itemRgb. Added maxWindowToDraw and maxItems on the overlap
subtrack.
Makedoc updated to describe the streaming pipeline, the new strands column,
and the rebuild workflow.
- lines changed 9, context: html, text, full: html, text
6c567fd9a03e87610681a43d2183ebb43547d1ad Fri Apr 24 17:58:57 2026 -0700
PromoterAI: review followups. refs #37278
Move /gbdb/hg38/promoterAi/ to /gbdb/hg38/_promoterAi/ to match the
underscore-prefix exclusion rule for hgdownload sync (same pattern as
PrimateAI-3D under refs #37274). bigDataUrls and the makedoc updated.
Bump bigWig maxHeightPixels from 128:20:8 to 128:40:8 -- the peer-track
default of 20 is too cramped for a signed -1..+1 score.
Description page: drop the wrong primateai3d.basespace.illumina.com link
in Data Access; PromoterAI is not on BaseSpace, it's distributed via the
license agreement on the GitHub page (a download link is emailed after
submission). Reword Data Access and Methods accordingly.
Description page: add Illumina's recommended interpretation thresholds
(|score| >= 0.1, >= 0.2, >= 0.5) from the PromoterAI GitHub README, with
a note that higher cutoffs select smaller, higher-confidence sets.
- src/hg/makeDb/trackDb/human/srSv.html
- lines changed 105, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- src/hg/makeDb/trackDb/human/srSv.ra
- lines changed 137, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- lines changed 1, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/svatalogSnv.html
- lines changed 95, context: html, text, full: html, text
695f40f9d6139a4df393522c067f1702aff8d3bd Wed Apr 22 03:13:39 2026 -0700
varFreqs: add SVatalog 101 short-read SNV frequencies subtrack
SNV/indel allele frequencies from the 101-sample GWAS SVatalog cohort
(Chirmade et al. 2026, Heredity, PMID 41203876), called from 10X
Genomics linked short-read WGS with GATK HaplotypeCaller v4.0.0.0 and
phased with SHAPEIT v4.2.0. Sibling of the lrSv chirmade101Sv
structural-variant track, which is built from the same 101 samples.
8,814,835 autosomal + chrX sites. Source release ships only AF; AC and
AN are synthesized in the emitted VCF as AC=round(AF*202) and AN=202
(2*101 diploid), with the gnomAD v3.1 non-Finnish European AF and dbSNP
rsID passed through as GNOMAD_NFE_AF and RSID info fields. VCF is
bgzipped + tabix-indexed (172 MB + 1.6 MB .tbi).
Files:
- scripts/varFreqs/svatalogFreqToVcf.py (new): per-chrom allele-freq
TSV -> single VCF with hg38 ##contig header
- trackDb/human/varFreqs.ra: new svatalogSnv vcfTabix subtrack
- trackDb/human/svatalogSnv.html (new): doc page
- trackDb/human/varFreqs.html: new row in Available Datasets table
- doc/hg38/varFreqs.txt: wget-free build block (input files were
downloaded manually from Zenodo 13367574)
Note: the All Databases Combined varFreqs bigBed has NOT been rebuilt
to include this new source yet; a subsequent merge pass will add it.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
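The AC/AN synthesis described above amounts to the arithmetic below. This is a sketch with hypothetical names, not the contents of svatalogFreqToVcf.py. Python's round() rounds exact halves to the nearest even integer, which may differ from the rounding used in the real script.

```python
N_SAMPLES = 101
AN = 2 * N_SAMPLES  # 202 chromosomes, treating every site as diploid

def synth_ac(af):
    """Synthesize an allele count from an allele frequency: AC = round(AF * AN)."""
    return round(af * AN)

def info_field(af):
    """Build a VCF INFO string carrying the source AF plus synthesized AC and AN."""
    return f"AF={af};AC={synth_ac(af)};AN={AN}"
```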
- src/hg/makeDb/trackDb/human/tommoJpCnv.html
- lines changed 110, context: html, text, full: html, text
06a482a2120d4d85c7c34fb5038213e07f595554 Tue Apr 21 15:00:21 2026 -0700
lrSv: add tommoJpCnv short-read CNV comparator (multiWig)
ToMMo 48KJPN-CNV Frequency Panel: copy-number variation frequencies
from short-read whole-genome sequencing of 48,874 Japanese individuals
(jMorp 20230828 release, GATK CNV germline workflow at 1 kb
resolution). Published as a companion short-read comparator to the
long-read tommoJpSv track.
Rendered as a multiWig container with two bigWig subtracks (transparent
overlay): tommoJpCnvLoss.bw counts samples at CN<2 per bin (red) and
tommoJpCnvGain.bw counts samples at CN>2 per bin (green). Values are
absolute carrier counts out of 48,874. 2,006,905 bins with at least one
CNV carrier; bins that are wholly CN=2 are omitted.
Files:
- trackDb/human/lrSv.ra: new tommoJpCnv multiWig container
- trackDb/human/tommoJpCnv.html: new doc page
- trackDb/human/lrSv.html: summary-table row + per-track blurb
- scripts/lrSv/lrSvTommoJpCnvVcfToBedGraph.py: VCF -> two bedGraphs
- doc/hg38/lrSv.txt: wget, converter invocation, bigWig build steps
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/human/tommoJpSv.html
- lines changed 2, context: html, text, full: html, text
aaa2d4e074608d3e07ecf7d7a7e35cfa96b0a06d Tue Apr 21 08:29:35 2026 -0700
tommoJpSv: fix broken jMorp download link
The previous dataset-specific URL (tommo-jsv1-20211208-af) returned an
error. Point to the general jMorp downloads page instead, where users
can find the current ToMMo SV callset.
refs #36258
- lines changed 29, context: html, text, full: html, text
06a482a2120d4d85c7c34fb5038213e07f595554 Tue Apr 21 15:00:21 2026 -0700
lrSv: add tommoJpCnv short-read CNV comparator (multiWig)
ToMMo 48KJPN-CNV Frequency Panel: copy-number variation frequencies
from short-read whole-genome sequencing of 48,874 Japanese individuals
(jMorp 20230828 release, GATK CNV germline workflow at 1 kb
resolution). Published as a companion short-read comparator to the
long-read tommoJpSv track.
Rendered as a multiWig container with two bigWig subtracks (transparent
overlay): tommoJpCnvLoss.bw counts samples at CN<2 per bin (red) and
tommoJpCnvGain.bw counts samples at CN>2 per bin (green). Values are
absolute carrier counts out of 48,874. 2,006,905 bins with at least one
CNV carrier; bins that are wholly CN=2 are omitted.
Files:
- trackDb/human/lrSv.ra: new tommoJpCnv multiWig container
- trackDb/human/tommoJpCnv.html: new doc page
- trackDb/human/lrSv.html: summary-table row + per-track blurb
- scripts/lrSv/lrSvTommoJpCnvVcfToBedGraph.py: VCF -> two bedGraphs
- doc/hg38/lrSv.txt: wget, converter invocation, bigWig build steps
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 33, context: html, text, full: html, text
bac95a147f49cd331052e597006e04b3deee40fc Wed Apr 22 10:43:20 2026 -0700
lrSv/srSv: human-readable SV type filter labels, script cleanups
Add human-readable labels to the supertrack-level svType filter on
both the lrSv and srSv supertracks using the "CODE|CODE (Long name)"
filterValues syntax: DEL -> "DEL (Deletion)", INS -> "INS (Insertion)",
etc. Labels keep the short code up front so users can match what
hgTracks shows next to each feature.
Also sweep in the in-progress converter/as-file cleanups under
scripts/lrSv/ and scripts/srSv/ (introduction of lrSvCommon.py
helpers, consistent insLen / svLen / AC column naming, tightened
field-description text) that had been piling up as an unstaged
working tree.
refs #36258
- src/hg/makeDb/trackDb/human/trackDb.ra
- lines changed 1, context: html, text, full: html, text
f058c8fe4601b223ff47468eb3525c05ccd03850 Wed Apr 22 09:17:17 2026 -0700
srSv: new short-read SV supertrack, split out of lrSv
Move the three short-read SV/CNV subtracks (abelSv, onekg3202Sr,
tommoJpCnv) out of the Long-read SV supertrack into a new sibling
supertrack srSv (Short-read SVs), so the lrSv collection contains
only long-read callsets. Filter fields (svType, svLen, insLen, AC)
are mirrored at the srSv supertrack level to keep the UX parallel
to lrSv.
- trackDb: new human/srSv.ra with the three subtrack stanzas and
updated /gbdb/$D/srSv/... bigDataUrls; corresponding stanzas
removed from human/lrSv.ra. human/trackDb.ra now includes
srSv.ra. Also a new human/srSv.html overview page; the SR rows
and SR-specific paragraphs removed from human/lrSv.html.
- Scripts: abelSv/{abelSv.as,vcfToBed.py,build.sh} and lrSv/
{lrSv1kg3202Sr*, lrSvTommoJpCnvVcfToBedGraph.py} moved to
scripts/srSv/ with git mv (history preserved) and renamed to
drop the "lrSv" prefix. Internal path references in abelSvBuild.sh
and abelSvVcfToBed.py updated.
- makeDoc: doc/hg38/abelSv.txt renamed to doc/hg38/srSv.txt and
extended with the onekg3202Sr and tommoJpCnv sections moved from
lrSv.txt. lrSv.txt leaves a pointer.
- Data: /hive/data/genomes/hg38/bed/{abelSv,lrSv/onekg3202sr,
lrSv/tommoJpCnv} moved to /hive/data/genomes/hg38/bed/srSv/*.
/gbdb/hg38/lrSv/{onekg3202sr.bb,tommoJpCnv{Loss,Gain}.bw} and
/gbdb/hg38/abelSv/ removed and re-linked under /gbdb/hg38/srSv/.
refs #36258
- lines changed 1, context: html, text, full: html, text
875dc5de11d5e0eb51e94cb37d5b74da447a6c7b Fri Apr 24 06:17:18 2026 -0700
removing old nmdEscape track from tdb, no redmine
- src/hg/makeDb/trackDb/human/varFreqs.html
- lines changed 9, context: html, text, full: html, text
695f40f9d6139a4df393522c067f1702aff8d3bd Wed Apr 22 03:13:39 2026 -0700
varFreqs: add SVatalog 101 short-read SNV frequencies subtrack
SNV/indel allele frequencies from the 101-sample GWAS SVatalog cohort
(Chirmade et al. 2026, Heredity, PMID 41203876), called from 10X
Genomics linked short-read WGS with GATK HaplotypeCaller v4.0.0.0 and
phased with SHAPEIT v4.2.0. Sibling of the lrSv chirmade101Sv
structural-variant track, which is built from the same 101 samples.
8,814,835 autosomal + chrX sites. Source release ships only AF; AC and
AN are synthesized in the emitted VCF as AC=round(AF*202) and AN=202
(2*101 diploid), with the gnomAD v3.1 non-Finnish European AF and dbSNP
rsID passed through as GNOMAD_NFE_AF and RSID info fields. VCF is
bgzipped + tabix-indexed (172 MB + 1.6 MB .tbi).
Files:
- scripts/varFreqs/svatalogFreqToVcf.py (new): per-chrom allele-freq
TSV -> single VCF with hg38 ##contig header
- trackDb/human/varFreqs.ra: new svatalogSnv vcfTabix subtrack
- trackDb/human/svatalogSnv.html (new): doc page
- trackDb/human/varFreqs.html: new row in Available Datasets table
- doc/hg38/varFreqs.txt: wget-free build block (input files were
downloaded manually from Zenodo 13367574)
Note: the All Databases Combined varFreqs bigBed has NOT been rebuilt
to include this new source yet; a subsequent merge pass will add it.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/human/varFreqs.ra
- lines changed 21, context: html, text, full: html, text
f4d6633d6a724d7b682f9f49ed983e22a5e0975d Mon Apr 20 14:41:07 2026 -0700
updating a few lrSv subtracks, and moving the colorsDbSnv track under
the varFreqs track. refs #36642
- lines changed 21, context: html, text, full: html, text
a1f009b6019876a1bc845c6f0892b7bc942e368e Mon Apr 20 14:43:29 2026 -0700
Hiding regeneron subtracks of varFreqs track again, no permission to distribute these.
- lines changed 9, context: html, text, full: html, text
366afa4a74c46ec6fb2b667a2902a873feec40cf Mon Apr 20 23:00:05 2026 -0700
varFreqsAll: rebuild combined bigBed to include GA4K and CoLoRSdb
Regenerate the All Databases Combined track with the two long-read
PacBio subtracks (GA4K 552 samples and CoLoRSdb v1.2.0 1,027 samples)
that were added to varFreqs since the March build. Source count rises
from 21 to 23 databases; final bigBed is 37.7 GB with 1.17B records
and 113 fields. Updates varFreqs.ra filterValues.sources and per-
database AF/AC filters for the two new sources, and databases.tsv
+ varFreqs.txt (build notes).
refs #36642
- lines changed 9, context: html, text, full: html, text
695f40f9d6139a4df393522c067f1702aff8d3bd Wed Apr 22 03:13:39 2026 -0700
varFreqs: add SVatalog 101 short-read SNV frequencies subtrack
SNV/indel allele frequencies from the 101-sample GWAS SVatalog cohort
(Chirmade et al. 2026, Heredity, PMID 41203876), called from 10X
Genomics linked short-read WGS with GATK HaplotypeCaller v4.0.0.0 and
phased with SHAPEIT v4.2.0. Sibling of the lrSv chirmade101Sv
structural-variant track, which is built from the same 101 samples.
8,814,835 autosomal + chrX sites. Source release ships only AF; AC and
AN are synthesized in the emitted VCF as AC=round(AF*202) and AN=202
(2*101 diploid), with the gnomAD v3.1 non-Finnish European AF and dbSNP
rsID passed through as GNOMAD_NFE_AF and RSID info fields. VCF is
bgzipped + tabix-indexed (172 MB + 1.6 MB .tbi).
Files:
- scripts/varFreqs/svatalogFreqToVcf.py (new): per-chrom allele-freq
TSV -> single VCF with hg38 ##contig header
- trackDb/human/varFreqs.ra: new svatalogSnv vcfTabix subtrack
- trackDb/human/svatalogSnv.html (new): doc page
- trackDb/human/varFreqs.html: new row in Available Datasets table
- doc/hg38/varFreqs.txt: wget-free build block (input files were
downloaded manually from Zenodo 13367574)
Note: the All Databases Combined varFreqs bigBed has NOT been rebuilt
to include this new source yet; a subsequent merge pass will add it.
refs #36258
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/mouse/mm10/cCREregistry.html
- lines changed 1, context: html, text, full: html, text
52ebb985f6e3658f463e971b7acd4c3959d36f92 Wed Apr 22 17:38:06 2026 -0700
Enlarged classification diagram (40% to 60%) on the cCRE registry
and Core Collection description pages. Updated the shortLabels on
all 90 Core Collection subtracks with biosample + age + assay
labels. refs #37131
- src/hg/makeDb/trackDb/mouse/mm10/coreCollection.html
- lines changed 1, context: html, text, full: html, text
52ebb985f6e3658f463e971b7acd4c3959d36f92 Wed Apr 22 17:38:06 2026 -0700
Enlarged classification diagram (40% to 60%) on the cCRE registry
and Core Collection description pages. Updated the shortLabels on
all 90 Core Collection subtracks with biosample + age + assay
labels. refs #37131
- lines changed 8, context: html, text, full: html, text
563389a6e1bce94aafe4ec991c064e3d204670ff Wed Apr 22 18:16:44 2026 -0700
Added signal track color descriptions (DNase green, H3K4me3 red,
H3K27ac yellow, CTCF blue) to the mm10 Core Collection description
page, with a note that the signal colors are independent from the
cCRE classification colors. refs #37131
- src/hg/makeDb/trackDb/mouse/mm10/developmentTimecourseM21mm10FPKM.html
- lines changed 66, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/mouse/mm10/developmentTimecourseM21mm10TPM.html
- lines changed 66, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/mouse/mm10/developmentTimecourseM4mm10FPKM.html
- lines changed 66, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/mouse/mm10/developmentTimecourseM4mm10TPM.html
- lines changed 66, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/mouse/mm10/encode4.ccres.ra
- lines changed 90, context: html, text, full: html, text
52ebb985f6e3658f463e971b7acd4c3959d36f92 Wed Apr 22 17:38:06 2026 -0700
Enlarged classification diagram (40% to 60%) on the cCRE registry
and Core Collection description pages. Updated the shortLabels on
all 90 Core Collection subtracks with biosample + age + assay
labels. refs #37131
- src/hg/makeDb/trackDb/mouse/mm10/encode4RegEpigenetics.ra
- lines changed 1, context: html, text, full: html, text
8b86391ff90bd5b416b1650c4fdcf8199a7e9b3f Fri Apr 24 09:02:39 2026 -0700
Fixing faceted subtrackUrls labels for mm10 ENCODE 4, refs #36320
- src/hg/makeDb/trackDb/mouse/mm10/encode4RegRnaSeq.ra
- lines changed 1, context: html, text, full: html, text
8b86391ff90bd5b416b1650c4fdcf8199a7e9b3f Fri Apr 24 09:02:39 2026 -0700
Fixing faceted subtrackUrls labels for mm10 ENCODE 4, refs #36320
- src/hg/makeDb/trackDb/mouse/mm10/encode4RegTfChip.ra
- lines changed 1, context: html, text, full: html, text
8b86391ff90bd5b416b1650c4fdcf8199a7e9b3f Fri Apr 24 09:02:39 2026 -0700
Fixing faceted subtrackUrls labels for mm10 ENCODE 4, refs #36320
- src/hg/makeDb/trackDb/mouse/mm10/mouseDevTimecourse.html
- lines changed 67, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/mouse/mm39/developmentTimecourseM21mm39FPKM.html
- lines changed 66, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/mouse/mm39/developmentTimecourseM21mm39TPM.html
- lines changed 66, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/mouse/mm39/mouseDevTimecourse.html
- lines changed 67, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/mouse/mm39/trackDb.gencode.ra
- lines changed 1, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, VM39, and V50lift37; added a single command that runs the whole import
- src/hg/makeDb/trackDb/mouse/mm39/wgEncodeGencodeVM39.html
- lines changed 30, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, VM39, and V50lift37; added a single command that runs the whole import
- src/hg/makeDb/trackDb/mouse/mm39/wgEncodeGencodeVM39.ra
- lines changed 247, context: html, text, full: html, text
b6aee4c6471cddebd638fec8dbb988c29a69bc22 Thu Apr 23 21:58:41 2026 -0700
import of GENCODE V50, VM39, and V50lift37; added a single command that runs the whole import
- src/hg/makeDb/trackDb/mouseDevTimecourseCreditsAndReferences.shared.html
- lines changed 34, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/mouseDevTimecourseDisplay.shared.html
- lines changed 32, context: html, text, full: html, text
26c4835e47a40f162ae82c0ea20e1997f42bc4ad Tue Apr 21 16:35:37 2026 -0700
Moving shared HTML content in mouseDevTimecourse track description pages into shared include files using Claude. refs #37414
- src/hg/makeDb/trackDb/relatedTracks.ra
- lines changed 15, context: html, text, full: html, text
50466766840ded6cb8bd5cb868bdf2ff3f613bc0 Tue Apr 21 11:17:15 2026 -0700
QA fixes for PrimateAI-3D track.
Config (primateAi.ra):
- Fix broken Ensembl transcript linkout: urls $S expanded to chromosome
name; switch to the Ensembl transcript page with $$
- Add numeric filters on percentile and raw score (label notes the
paper's 0.821 clinical threshold)
- Add maxWindowToDraw 2000000
Data (primateAiToBigBed.py):
- Change hardcoded strand '+' to '.': the source file has no strand
column
- Accept input/output paths as CLI args (previously hardcoded the hg38
input path)
- Handle variable field count: ~2.4M rows in the hg19 source are
missing the refseq column
Description (primateAi.html):
- Fix two broken hgTrackUi&... internal links to the Zoonomia 447-way
track
- Regenerate the first reference via getTrackReferences (wrong article
number and wrong PMC ID in the previous text)
- Fix the GitHub URL for the conversion script in Methods
- Move the Zoonomia 447-way mention out of Description; rephrase the
license note to describe precisely what is disabled
relatedTracks.ra:
- Add reciprocal cross-links for primateAi <-> alphaMissense (hg38),
primateAi <-> revel (hg38 + hg19), and primateAi <-> promoterAi
(hg38). Also includes promoterAi <-> alphaMissense cross-links.
refs #37274 #37279
- lines changed 6, context: html, text, full: html, text
4bd316f5f1ca47328bd3f9a181214b788055f0bc Tue Apr 21 13:29:26 2026 -0700
NMD Escape QA round 3: switch RefSeq to curated, fix Rule 2 misclassification. refs #33737
Switched the NMD Escape RefSeq subtrack input from hg38.ncbiRefSeq.txt.gz (all)
to hg38.ncbiRefSeqCurated.txt.gz (NM_/NR_ only, no XM_/XR_ predicted models)
per Max's feedback. longLabel updated to "NCBI RefSeq Curated transcripts".
Fixed Rule 2 in genePredNmdEsc to test rec["exonCount"]==1 instead of
len(cdsExons)==1. The old test misclassified multi-exon transcripts with a
single CDS exon (UTR introns) as "intronless" and silently suppressed their
Rule 1/3/4 assignments via the if/else short-circuit. 3,253 RefSeq curated
and ~2,000 Gencode transcripts reassigned from Rule 2 to Rules 1/3. Rebuilt
both tracks.
Added Rule 1 caveat to nmdEscTranscripts.html for transcripts with a
penultimate coding exon shorter than 50 bp.
Added reciprocal relatedTracks.ra entries for nmd <-> mane and nmd <-> ncbiRefSeq.
QA cleanups: non-ASCII prime char replaced with &prime;, mailing list links
given target="_blank" across all three HTML pages, dead commented nmdGencode
block removed from nmd.ra, AutoSQL field comments updated to cover Rule 4
color and the gene-symbol-to-transcript-ID fallback.
Makedoc updated with the full Gencode + RefSeq pipeline and /gbdb symlinks.
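The Rule 2 fix above comes down to which quantity is tested for "intronless". A minimal sketch, assuming a record dict with an `exonCount` key and a list of CDS exon intervals as quoted in the commit; the function names and the sample transcript are hypothetical and are not taken from genePredNmdEsc.

```python
# Illustrative only: contrasts the two tests quoted in the commit message.

def is_intronless_old(cds_exons):
    """Old test: a single CDS exon was treated as 'intronless'."""
    return len(cds_exons) == 1


def is_intronless_new(rec):
    """New test: only a transcript with exactly one exon is intronless."""
    return rec["exonCount"] == 1


# Hypothetical transcript: three exons, coding sequence confined to the
# middle exon, so both introns fall in the UTRs.
rec = {"exonCount": 3}
cds_exons = [(1200, 1800)]

print(is_intronless_old(cds_exons))  # True: misclassified as Rule 2
print(is_intronless_new(rec))        # False: falls through to Rules 1/3/4
```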
- lines changed 6, context: html, text, full: html, text
888e7470c14eeecdca310ed36bb45c3c00ae8052 Tue Apr 21 15:14:04 2026 -0700
QA fixes for MPRA superTrack. refs #37359
Fix broken mpraVarDb bigDataUrl: it pointed at /gbdb/hg38/mpra/mpravardb.bb
but the file is at /gbdb/hg38/mpra/mpravardb/mpravardb.bb, causing
hgTrackDb -strict to silently drop the subtrack.
Rebuild mpravardb.bb after two fixes in mpravardbToBed.py: sanitize UTF-8
in user-visible string fields (curly quotes, primes, NBSP mojibake) that
the browser does not transcode, eliminating ~246k non-ASCII occurrences
across 42% of rows; and change safe_float / pval_to_score to write NaN
and return score 0 for NA / out-of-range p-values instead of 0.0 and
score 1000 (previously inflated untested variants to the top of
score-sorted views).
trackDb stanza cleanup: shorten mpraVarDb longLabel, drop superfluous
type bed 4 from superTrack, make bigBed 9+13 explicit, remove redundant
mouseOverField, align parent mpra on, add filterValues for
cell_line/assay/cellLine and filterByRange sliders for percentile_rank /
fdr / log2FC, add labelFields and maxWindowToDraw.
Description pages: add cross-species disclosure (mouse reporter cells
used to assay human sequences), update mpraVarDb header to post-liftOver
count 239,028 with Studies-table footnote, fix mpraVarDb.html
download-server paths, soften imprecise "51 MPRA experiments" claim in
mpra.html and mprabase.html.
relatedTracks.ra: reciprocal mpra <-> wgEncodeReg4 and mpra <-> cCREs.
Expand mpra.txt makedoc with upstream provenance and QA-rebuild log.
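The safe_float / pval_to_score change described above can be sketched as below. Only the names `safe_float` and `pval_to_score` and the NaN / score 0 behavior come from the commit; the -log10 scaling and the 1000 cap are assumptions made for illustration, not the formula used in mpravardbToBed.py.

```python
import math


def safe_float(text):
    """Parse a numeric field; return NaN (not 0.0) for NA or unparsable input."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan


def pval_to_score(pval):
    """Map a p-value to a 0-1000 score; missing or out-of-range values score 0.

    The -log10(p) * 100 scaling is an assumed placeholder.
    """
    if math.isnan(pval) or not (0.0 < pval <= 1.0):
        return 0
    return min(1000, int(round(-math.log10(pval) * 100)))


print(pval_to_score(safe_float("NA")))
print(pval_to_score(safe_float("1e-5")))
```

Scoring missing values as 0 keeps untested variants at the bottom of score-sorted views rather than the top.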
- src/hg/makeDb/trackDb/tagTypes.tab
- lines changed 1, context: html, text, full: html, text
689349fca5a4865a1891db8cd39d392657b2b09b Wed Apr 22 02:54:09 2026 -0700
Replacing the subtrackUrl setting for faceted composites with subtrackUrls,
which supports outlinks in multiple fields. refs #36320
- src/hg/makeDb/trackDb/zebrafish/danRer11/choriCloneEnds.html
- lines changed 142, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/zebrafish/danRer11/choriCloneEnds.ra
- lines changed 87, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/zebrafish/danRer11/ncbiCloneEndsCH1073.html
- lines changed 123, context: html, text, full: html, text
8faeb3cba60c7cb842bc17c17a57c9b53ef1b478 Tue Apr 21 02:51:32 2026 -0700
ncbiCloneEndsCH1073: add NCBI CH1073 BAC library clone end placements track on danRer11, refs #35059
210,777 unique-concordant clone-insert placements from NCBI's CH1073
(RZPD-1073 / DanioKey) library clone report. Separate from the existing
bacEndPairsLift (danRer4 -> danRer11 UCSC-BLAT lift), which is left in place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 123, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/zebrafish/danRer11/ncbiCloneEndsCH1073.ra
- lines changed 20, context: html, text, full: html, text
8faeb3cba60c7cb842bc17c17a57c9b53ef1b478 Tue Apr 21 02:51:32 2026 -0700
ncbiCloneEndsCH1073: add NCBI CH1073 BAC library clone end placements track on danRer11, refs #35059
210,777 unique-concordant clone-insert placements from NCBI's CH1073
(RZPD-1073 / DanioKey) library clone report. Separate from the existing
bacEndPairsLift (danRer4 -> danRer11 UCSC-BLAT lift), which is left in place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 20, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/makeDb/trackDb/zebrafish/danRer11/trackDb.ra
- lines changed 2, context: html, text, full: html, text
8faeb3cba60c7cb842bc17c17a57c9b53ef1b478 Tue Apr 21 02:51:32 2026 -0700
ncbiCloneEndsCH1073: add NCBI CH1073 BAC library clone end placements track on danRer11, refs #35059
210,777 unique-concordant clone-insert placements from NCBI's CH1073
(RZPD-1073 / DanioKey) library clone report. Separate from the existing
bacEndPairsLift (danRer4 -> danRer11 UCSC-BLAT lift), which is left in place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 1, context: html, text, full: html, text
d93c426ef1ad5fbb32b754408599eaf380a199e5 Tue Apr 21 13:34:58 2026 -0700
choriCloneEnds: reorganize danRer11 CHORI BAC clone end placements as a superTrack, refs #35059
- Rename ncbiCloneEndsCH1073 to choriCloneEnds throughout (trackDb, HTML,
makeDoc, scripts dir, /hive and /gbdb layout). User-visible label is
now "CHORI Clones" since all three libraries (CH1073, CH73, CH211) are
CHORI/BACPAC BAC libraries; data source (NCBI Clone DB) is cited in
Methods.
- Wrap the existing CH1073 track in a choriCloneEnds superTrack and
add two new subtracks built from the parallel unique_concordant GFFs
at ftp.ncbi.nih.gov/repository/clone/reports/Danio_rerio/ :
CH73 (99,141 placements, 23 oversize)
CH211 (70,231 placements, 46 oversize)
CH1073 is rebuilt with the same pipeline (210,777 placements).
- Build all three bigBeds with -extraIndex=name and register
searchTable / searchType bigBed stanzas with searchIndex name on each
subtrack, so clone names (CH1073-100A1, CH73-1A1, CH211-1A1, ...)
resolve from the Genome Browser position box.
- Single shared HTML description page; Methods now links to the NCBI
FTP source and to the UCSC makeDoc and scripts dir on GitHub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/hg/utils/automation/kegAlign.json.ga
- lines changed 170, context: html, text, full: html, text
46668bb54c3c8a7a57bda515be6881edb2382563 Wed Apr 22 14:55:23 2026 -0700
correctly connecting scoringMatrix input to kegAlign scores refs #31811
- src/hg/utils/mafToBigMafSummary/mafToBigMafSummary.c
- lines changed 273, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
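As a rough illustration of the src parsing that the mafSplitSrcGetChrom fix concerns, the sketch below splits a plain 'db.chrom' string at its first dot. This is an assumption about the intended behavior, written in Python for brevity; it is not a port of the C code, and `split_src` is a hypothetical name.

```python
def split_src(src):
    """Split a MAF src field of the form 'db.chrom' at the first dot.

    Assumed behavior for illustration; the real C function may handle
    additional cases.
    """
    db, sep, chrom = src.partition(".")
    if not sep or not db or not chrom:
        raise ValueError(f"expected 'db.chrom', got {src!r}")
    return db, chrom


print(split_src("hg38.chrY"))
```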
- src/hg/utils/mafToBigMafSummary/makefile
- lines changed 3, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
- src/hg/utils/mafToBigMafSummary/tests/expected/testDot.bed
- lines changed 17, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
- src/hg/utils/mafToBigMafSummary/tests/expected/testPipe.bed
- lines changed 17, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
- src/hg/utils/mafToBigMafSummary/tests/input/testDot.maf
- lines changed 74, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
- src/hg/utils/mafToBigMafSummary/tests/input/testPipe.maf
- lines changed 74, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
- src/hg/utils/mafToBigMafSummary/tests/makefile
- lines changed 20, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
- src/hg/utils/makefile
- lines changed 1, context: html, text, full: html, text
cc610239716fe32f9c774d98a71f75e8c6b5fba3 Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) — that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
- src/hg/utils/otto/genCC/doGenCC.py
- lines changed 8, context: html, text, full: html, text
7bf5a5c3241a66ba15f9763638a04d072f4b9b76 Fri Apr 24 12:22:11 2026 -0700
Moving the 'if args.force' condition to the beginning of the function since the md5sum checks are not needed if we are going to run the update manually. Addressing feedback from code review, refs #37417
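The reordering described above amounts to an early return before any checksum work. A minimal sketch, assuming the function compares md5 digests of an old and a new download; `should_update` and `md5_of` are hypothetical names, not functions from doGenCC.py.

```python
import hashlib


def md5_of(data):
    """Return the hex md5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def should_update(force, old_data, new_data):
    """Decide whether to rebuild; a forced run skips the md5 comparison."""
    if force:
        return True  # manual run: no checksum work needed
    return md5_of(old_data) != md5_of(new_data)


print(should_update(True, b"same", b"same"))
print(should_update(False, b"same", b"same"))
```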
- src/hg/utils/otto/otto.crontab
- lines changed 4, context: html, text, full: html, text
1915cefe1c6dd5746ee2c3e8a9ba9cfd4ad4e3b5 Tue Apr 21 14:45:24 2026 -0700
turning on ottoRequest watch script refs #31811
- lines changed 2, context: html, text, full: html, text
f841658912e8c8441e745b2bfdbb86b7c270d811 Tue Apr 21 14:52:59 2026 -0700
run ottoRequest.py every minute refs #31811
- src/hg/utils/otto/ottoCompareGitVsHiveFiles.py
- lines changed 40, context: html, text, full: html, text
43cdea8bbdc9cc2a3f84eb908763a86e88872afc Tue Apr 21 15:19:17 2026 -0700
Updating the otto compare script to also print the timestamp for the file on git and hive. This way we can determine which source needs to be updated, no Redmine.
- src/hg/utils/otto/userRequests/README.txt
- lines changed 29, context: html, text, full: html, text
584b69eff038b10c2176d2b9299c356af21288cc Wed Apr 22 16:30:53 2026 -0700
initial scripts to run the galaxy workflow based off of the ottoRequest table entries refs #31811
- lines changed 68, context: html, text, full: html, text
d8ade6071b7e59f93d5b703b3c024b345871dce1 Fri Apr 24 08:55:58 2026 -0700
becoming complete process refs #31811
- lines changed 14, context: html, text, full: html, text
f5cece60b38bd8c2cfd57acf2abb972d66edb6d9 Fri Apr 24 11:43:36 2026 -0700
rename doneStatus to just status and correctly set status codes refs #31811
- src/hg/utils/otto/userRequests/dbDb.name.clade.tsv
- lines changed 239, context: html, text, full: html, text
584b69eff038b10c2176d2b9299c356af21288cc Wed Apr 22 16:30:53 2026 -0700
initial scripts to run the galaxy workflow based off of the ottoRequest table entries refs #31811
- src/hg/utils/otto/userRequests/ottoRequest.py
- lines changed 149, context: html, text, full: html, text
fcfb91470c294d3b991dbd0d24cba67de9ed65cb Tue Apr 21 14:20:57 2026 -0700
create ottoRequest cron watch script refs #31811
- lines changed 1, context: html, text, full: html, text
678618e8499dc5e281ed6a1fe18da71e43fa4b3a Wed Apr 22 14:39:31 2026 -0700
Commenting out a line in the script to match the version running on hive, no Redmine.
- lines changed 41, context: html, text, full: html, text
398e258ed1330421291f943298e7f801f7b05bf8 Thu Apr 23 15:05:02 2026 -0700
also send a notification email to the requesting user refs #31811
- lines changed 2, context: html, text, full: html, text
f5cece60b38bd8c2cfd57acf2abb972d66edb6d9 Fri Apr 24 11:43:36 2026 -0700
rename doneStatus to just status and correctly set status codes refs #31811
- src/hg/utils/otto/userRequests/ottoRequestAlign.sh
- lines changed 146, context: html, text, full: html, text
584b69eff038b10c2176d2b9299c356af21288cc Wed Apr 22 16:30:53 2026 -0700
initial scripts to run the galaxy workflow based off of the ottoRequest table entries refs #31811
- lines changed 99, context: html, text, full: html, text
f0b8830616972bad195079b7ce775c919ff1e25f Thu Apr 23 15:13:50 2026 -0700
correctly decide which is going to be target and query based on the N50 sizes of the two given assemblies refs #31811
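For readers unfamiliar with the metric, the N50 comparison named in this commit can be sketched as below. This is an illustration only: the commit message does not say which assembly wins, so the rule "larger N50 becomes the target" and the names n50 and pick_target_and_query are assumptions, not the logic in ottoRequestAlign.sh.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input: no sequences, so no N50


def pick_target_and_query(asm_a, asm_b, sizes):
    """Return (target, query).

    ASSUMPTION: the assembly with the larger N50 is used as the target.
    The commit message only says the choice is based on N50 sizes.
    """
    a, b = n50(sizes[asm_a]), n50(sizes[asm_b])
    return (asm_a, asm_b) if a >= b else (asm_b, asm_a)
```

With sizes given as a dict of assembly name to a list of sequence lengths, pick_target_and_query("asmA", "asmB", sizes) returns the two names in (target, query) order under that assumed rule.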
- lines changed 138, context: html, text, full: html, text
f5cece60b38bd8c2cfd57acf2abb972d66edb6d9 Fri Apr 24 11:43:36 2026 -0700
rename doneStatus to just status and correctly set status codes refs #31811
- src/hg/utils/otto/userRequests/ottoRequestWatch.sh
- lines changed 66, context: html, text, full: html, text
3bb21ba2c8ac7f60effdb9c311ff27e4016f8e57 Thu Apr 23 15:14:35 2026 -0700
script to use in hiram's crontab to watch for new requests and get the galaxy workflow running refs #31811
- lines changed 27, context: html, text, full: html, text
f5cece60b38bd8c2cfd57acf2abb972d66edb6d9 Fri Apr 24 11:43:36 2026 -0700
rename doneStatus to just status and correctly set status codes refs #31811
- src/hg/utils/otto/userRequests/workflowMonitor.sh
- lines changed 258, context: html, text, full: html, text
584b69eff038b10c2176d2b9299c356af21288cc Wed Apr 22 16:30:53 2026 -0700
initial scripts to run the galaxy workflow based off of the ottoRequest table entries refs #31811
- lines changed 36, context: html, text, full: html, text
6d8144be131a358037daea03610b22315527f26f Thu Apr 23 15:12:32 2026 -0700
better recognition of errors in the workflow refs #31811
- lines changed 24, context: html, text, full: html, text
f5cece60b38bd8c2cfd57acf2abb972d66edb6d9 Fri Apr 24 11:43:36 2026 -0700
rename doneStatus to just status and correctly set status codes refs #31811
- src/lib/cheapcgi.c
- lines changed 57, context: html, text, full: html, text
683eb89abf89e45cb554c489382803442e93bc0b Sun Apr 26 19:22:03 2026 -0700
Adding restrictions on the size of cart content we accept in one request, refs #37452
- src/utils/qa/hubPublicAutoUpdate
- lines changed 1, context: html, text, full: html, text
46ad8ee25abcc7c81cd3005972b9069f2f14b5c5 Thu Apr 23 12:09:16 2026 -0700
Updating hubPublicMail and hubPublicAutoUpdate crons to correctly detect
and contact authors of broken public hubs. hubPublicMail now uses curl
instead of python-requests (handles Cloudflare bot-blocks and 4xx/5xx
responses that requests silently treats as successful), falls back to
hubPublic.email as a secondary source so newly-added broken hubs still
accumulate failCount, and strips the mailto: prefix in parseEmail.
hubPublicAutoUpdate now escapes double quotes on email values. No RM
- src/utils/qa/hubPublicMail
- lines changed 56, context: html, text, full: html, text
46ad8ee25abcc7c81cd3005972b9069f2f14b5c5 Thu Apr 23 12:09:16 2026 -0700
Updating hubPublicMail and hubPublicAutoUpdate crons to correctly detect
and contact authors of broken public hubs. hubPublicMail now uses curl
instead of python-requests (handles Cloudflare bot-blocks and 4xx/5xx
responses that requests silently treats as successful), falls back to
hubPublic.email as a secondary source so newly-added broken hubs still
accumulate failCount, and strips the mailto: prefix in parseEmail.
hubPublicAutoUpdate now escapes double quotes on email values. No RM
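The two string fixes described in this commit can be sketched as follows. This is a guess at the shape of the change, not the code in hubPublicMail or hubPublicAutoUpdate: the function names, the case-insensitive prefix match, and the use of a backslash as the escape character are all assumptions.

```python
def parse_email(raw):
    """Strip surrounding whitespace and a leading 'mailto:' prefix, if any."""
    value = raw.strip()
    if value.lower().startswith("mailto:"):
        value = value[len("mailto:"):]
    return value.strip()


def escape_double_quotes(value):
    """Backslash-escape double quotes so the value can sit inside a
    double-quoted string (ASSUMPTION: backslash is the escape wanted)."""
    return value.replace('"', '\\"')
```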
- src/utils/qa/makefile
- lines changed 1, context: html, text, full: html, text
0f457bede2db45e9b4e91de2ee2d54502b91bcc4 Fri Apr 24 13:26:53 2026 -0700
Adding the updateHubSpaceQuota script to the makedoc, no Redmine
- src/utils/qa/quickLiftBench/README.md
- lines changed 147, context: html, text, full: html, text
ff2dc690270f4e155d6345af821bc8a2c197e667 Fri Apr 24 17:58:03 2026 -0700
quickLiftBench: hgTracks render-time benchmark for quickLift, refs #37445
Drives YAML-configured cases against hgTracks with measureTiming=1, parses
per-track loadTime/drawTime out of printTrackTiming() output, writes per-
iteration TSV plus a per-(case, position) summary with median/p90 and
lifted/native ratios. Supports same-hub source-vs-dest, track-pair across
assemblies, and lift-on/off side-by-side hub comparisons. Output backs the
benchmark numbers for the quickLift Bioinformatics paper (#36829).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/utils/qa/quickLiftBench/cases.yaml
- lines changed 106, context: html, text, full: html, text
ff2dc690270f4e155d6345af821bc8a2c197e667 Fri Apr 24 17:58:03 2026 -0700
quickLiftBench: hgTracks render-time benchmark for quickLift, refs #37445
Drives YAML-configured cases against hgTracks with measureTiming=1, parses
per-track loadTime/drawTime out of printTrackTiming() output, writes per-
iteration TSV plus a per-(case, position) summary with median/p90 and
lifted/native ratios. Supports same-hub source-vs-dest, track-pair across
assemblies, and lift-on/off side-by-side hub comparisons. Output backs the
benchmark numbers for the quickLift Bioinformatics paper (#36829).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/utils/qa/quickLiftBench/quickLiftBench.py
- lines changed 510, context: html, text, full: html, text
ff2dc690270f4e155d6345af821bc8a2c197e667 Fri Apr 24 17:58:03 2026 -0700
quickLiftBench: hgTracks render-time benchmark for quickLift, refs #37445
Drives YAML-configured cases against hgTracks with measureTiming=1, parses
per-track loadTime/drawTime out of printTrackTiming() output, writes per-
iteration TSV plus a per-(case, position) summary with median/p90 and
lifted/native ratios. Supports same-hub source-vs-dest, track-pair across
assemblies, and lift-on/off side-by-side hub comparisons. Output backs the
benchmark numbers for the quickLift Bioinformatics paper (#36829).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
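The per-(case, position) summary described above (median, p90, lifted/native ratio) could be computed along these lines. This is a minimal sketch, not quickLiftBench.py itself: the row layout, the "lifted"/"native" mode labels, the output keys, and the nearest-rank definition of p90 are assumptions made for illustration.

```python
import statistics
from collections import defaultdict


def p90(values):
    """Nearest-rank 90th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (9 * len(ordered) + 9) // 10  # integer ceil(0.9 * n)
    return ordered[rank - 1]


def summarize(rows):
    """Summarize timing rows of (case, position, mode, seconds).

    mode is "lifted" or "native" (hypothetical labels). Groups that lack
    either mode are skipped because no ratio can be formed.
    """
    groups = defaultdict(lambda: {"lifted": [], "native": []})
    for case, position, mode, seconds in rows:
        groups[(case, position)][mode].append(seconds)

    summary = {}
    for key, modes in groups.items():
        lifted, native = modes["lifted"], modes["native"]
        if not lifted or not native:
            continue
        lifted_median = statistics.median(lifted)
        native_median = statistics.median(native)
        summary[key] = {
            "lifted_median": lifted_median,
            "native_median": native_median,
            "lifted_p90": p90(lifted),
            "native_p90": p90(native),
            # Ratio of medians; None when the native median is zero.
            "ratio": lifted_median / native_median if native_median else None,
        }
    return summary
```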
- src/utils/qa/updateHubSpaceQuota.sh
- lines changed 79, context: html, text, full: html, text
1e5a175df5ea8bc2d6b20d83f02bb3194e7a8afe Fri Apr 24 13:14:28 2026 -0700
Adding a simple script to update the Hub Space quota, refs #37425
- src/utils/redmineCli
- lines changed 22, context: html, text, full: html, text
07c109ea4fe627879544dddf222650ab8c29eaf8 Tue Apr 21 02:55:53 2026 -0700
redmineCli: add --file-list-add PATH option to append paths idempotently, refs #35059
Existing --file-list overwrites the custom field. The new --file-list-add
reads the current value, appends each path that isn't already listed, and
writes the combined value back with CRLF separators (matching Redmine's
internal format).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
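The append behaviour described for --file-list-add can be sketched as below: read the current value, add only paths not already listed, and join with CRLF. The function name append_paths and the handling of blank lines are assumptions; the actual redmineCli code may differ.

```python
def append_paths(current_value, new_paths):
    """Return current_value with each not-yet-listed path appended.

    Lines are joined with CRLF, the separator the commit message says
    Redmine uses internally. Running this twice with the same paths
    leaves the value unchanged, which is what makes it idempotent.
    """
    # splitlines() treats "\r\n" as one line break; drop blank lines.
    existing = [line for line in current_value.splitlines() if line.strip()]
    seen = set(existing)
    for path in new_paths:
        if path not in seen:
            existing.append(path)
            seen.add(path)
    return "\r\n".join(existing)
```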
- lines changed 8, context: html, text, full: html, text
993da626132958795cab63a9b26d64ce2052f40d Tue Apr 21 16:51:13 2026 -0700
Make redmineCli prepend_attribution idempotent. refs #37339
Skip adding the '**From Claude:**' header if the body already begins
with a From Claude attribution line (any bold/italic asterisk variant,
case-insensitive). Fixes the periodic doubled header when Claude models
mimic prior journal entries that already carried the prefix.
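A minimal sketch of the idempotence check described in this commit follows. The header text '**From Claude:**' is taken from the commit message; the regular expression, the blank line placed after the header, and the function body are assumptions rather than the code in redmineCli.

```python
import re

ATTRIBUTION = "**From Claude:**"

# Matches a body that already starts with "From Claude", with any number of
# leading asterisks (bold or italic variants), case-insensitively.
_ALREADY_ATTRIBUTED = re.compile(r"^\s*\**\s*from claude\b", re.IGNORECASE)


def prepend_attribution(body):
    """Add the attribution header unless the body already begins with one."""
    if _ALREADY_ATTRIBUTED.match(body):
        return body
    return f"{ATTRIBUTION}\n\n{body}"
```

Calling prepend_attribution on its own output returns the text unchanged, so a journal entry that already carries the prefix does not get a second header.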
- lines changed 6, context: html, text, full: html, text
c5057c7b3735ac3688a7703f97ead937b3c7c0a6 Thu Apr 23 14:24:06 2026 -0700
redmineCli: let 'create' accept tracker and status by name
The existing resolve_tracker() / resolve_status() helpers already accepted
either a name ("To Do", "QA Ready") or a numeric ID, and the list / update
subcommands wired them up. The create subcommand declared --tracker and
--status as type=int, which rejected the name forms that the redmine skill
documents as working.
Drop the type=int, route args.tracker / args.status through the resolvers
in cmd_create, and update the help text. Numeric IDs continue to work
because the resolvers pass digit strings through.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- lines changed 2, context: html, text, full: html, text
73169097327bcbfca570a98a34076c25ac069270 Fri Apr 24 08:08:52 2026 -0700
redmineCli: fix resolve_status crash when --status defaults to int
The default for create's --status is the int STATUS_NEW=1. Without
type=int on the argparse option, resolve_status received an int and
crashed on name_or_id.isdigit(). Wrap in str() to match resolve_tracker.
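The interaction between the last two redmineCli commits can be illustrated with a small sketch: once type=int is dropped, argparse passes user input through as a string but passes the default through as an int, so the resolver has to normalize with str() before calling isdigit(). STATUS_NEW = 1 comes from the commit message; the name-to-ID table, its ID values, and the parser below are hypothetical stand-ins, not the real redmineCli definitions.

```python
import argparse

STATUS_NEW = 1  # default named in the commit message

# Hypothetical name-to-ID table; real IDs come from the Redmine instance.
STATUS_IDS = {"new": STATUS_NEW, "qa ready": 12}


def resolve_status(name_or_id):
    """Accept a status name, a digit string, or an int and return an int ID."""
    text = str(name_or_id).strip()  # str() guards against the int default
    if text.isdigit():
        return int(text)
    key = text.lower()
    if key not in STATUS_IDS:
        raise ValueError(f"unknown status: {name_or_id!r}")
    return STATUS_IDS[key]


def build_parser():
    """Parser with --status declared without type=int, as after the change."""
    parser = argparse.ArgumentParser(prog="create-sketch")
    parser.add_argument("--status", default=STATUS_NEW)
    return parser
```

With no --status given, args.status is the int 1; without the str() call, name_or_id.isdigit() would raise AttributeError on that value.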
- lines changed: 25819
- files changed: 398