cc610239716fe32f9c774d98a71f75e8c6b5fba3 braney Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf.
The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment): that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary.
Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic.
Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool.
refs #37404

diff --git src/hg/htdocs/goldenPath/help/bigMaf.html src/hg/htdocs/goldenPath/help/bigMaf.html
index e1c365a9a59..849b5d3d516 100755
--- src/hg/htdocs/goldenPath/help/bigMaf.html
+++ src/hg/htdocs/goldenPath/help/bigMaf.html
@@ -56,31 +56,31 @@
table mafSummary
"Positions and scores for alignment blocks"
(
string chrom; "Reference sequence chromosome or scaffold"
uint chromStart; "Start position in chromosome"
uint chromEnd; "End position in chromosome"
string src; "Sequence name or database of alignment"
float score; "Floating point score."
char[1] leftStatus; "Gap/break annotation for preceding block"
char[1] rightStatus; "Gap/break annotation for following block"
)
An example, bedToBigBed -type=bed3+4 -as=mafSummary.as
-tab bigMafSummary.bed hg38.chrom.sizes bigMafSummary.bb.
-Another tool, hgLoadMafSummary generates the input
+Another tool, mafToBigMafSummary generates the input
bigMafSummary.bed file.
The following autoSql definition is used to create the second file,
pointed to online with frames <url>. The file
mafFrames.as, is pulled in when
the bedToBigBed utility is run with the -as=mafFrames.as
option.
table mafFrames
"codon frame assignment for MAF components"
(
string chrom; "Reference sequence chromosome or scaffold"
uint chromStart; "Start range in chromosome"
uint chromEnd; "End range in chromosome"
string src; "Name of sequence source in MAF"
@@ -122,60 +122,59 @@
mafFrames.as files.
Here are wget commands to obtain the above files and the hg38.chrom.sizes file mentioned below:
wget https://genome.ucsc.edu/goldenPath/help/examples/chr22_KI270731v1_random.maf
wget https://genome.ucsc.edu/goldenPath/help/examples/chr22_KI270731v1_random.gp
wget https://genome.ucsc.edu/goldenPath/help/examples/bigMaf.as
wget https://genome.ucsc.edu/goldenPath/help/examples/mafSummary.as
wget https://genome.ucsc.edu/goldenPath/help/examples/mafFrames.as
wget http://hgdownload.gi.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes
Step 4.
Download the bedToBigBed and mafToBigMaf programs from the UCSC
binary utilities directory. If you have
opted to generate the optional frame and summary files for your multiple alignment, you must also
-download the hgLoadMafSummary, genePredSingleCover, and
+download the mafToBigMafSummary, genePredSingleCover, and
genePredToMafFrames programs from the same
directory.
Step 5.
Use the fetchChromSizes script from the
same directory to create a
chrom.sizes file for the UCSC database with which you are working (e.g., hg38).
Alternatively, you can download the
chrom.sizes file for any assembly hosted at UCSC from our
downloads page (click on "Full
data set" for any assembly). For example, the hg38.chrom.sizes file for the hg38
database is located at
http://hgdownload.gi.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes.
mafToBigMaf hg38 chr22_KI270731v1_random.maf stdout | sort -k1,1 -k2,2n > bigMaf.txt
bedToBigBed -type=bed3+1 -as=bigMaf.as -tab bigMaf.txt hg38.chrom.sizes bigMaf.bb
Note that the hg38 in the mafToBigMaf hg38 command indicates the referenceDb and matches the
expected prefix of the primary species' sequence name, for instance hg38 for the
hg38.chr22_KI270731v1_random found in the input example chr22_KI270731v1_random.maf file.
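The prefix requirement described above can be checked before running mafToBigMaf. The following is a minimal sketch, not part of the UCSC utilities: maf_master_prefixes is a hypothetical helper name, and it assumes the reference species is the first "s" line of every alignment block, with its source written as db.chrom.

```shell
# Hypothetical helper (not a UCSC tool): print the distinct database
# prefixes found on the first "s" line of each MAF alignment block.
maf_master_prefixes() {
    awk '
        $1 == "a" { want = 1; next }   # an "a" line opens a new alignment block
        $1 == "s" && want {            # first sequence line of that block
            src = $2                   # e.g. hg38.chr22_KI270731v1_random
            sub(/\..*/, "", src)       # keep only the text before the first dot
            print src
            want = 0
        }
    ' "$1" | sort -u
}

# Example call (commented out so the sketch runs without the sample file):
# maf_master_prefixes chr22_KI270731v1_random.maf
```

If the output is a single line that matches the referenceDb argument (hg38 in the example), the prefix and the MAF agree. More than one line suggests that the blocks do not all lead with the same species.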
Step 6.
Follow the below steps to create the binary indexed mafFrames and mafSummary files to accompany
your bigMaf file:
genePredSingleCover chr22_KI270731v1_random.gp single.gp
genePredToMafFrames hg38 chr22_KI270731v1_random.maf bigMafFrames.txt hg38 single.gp
bedToBigBed -type=bed3+8 -as=mafFrames.as -tab bigMafFrames.txt hg38.chrom.sizes bigMafFrames.bb
-hgLoadMafSummary -minSeqSize=1 -test hg38 bigMafSummary chr22_KI270731v1_random.maf
-cut -f2- bigMafSummary.tab | sort -k1,1 -k2,2n > bigMafSummary.bed
+mafToBigMafSummary hg38 chr22_KI270731v1_random.maf stdout | sort -k1,1 -k2,2n > bigMafSummary.bed
bedToBigBed -type=bed3+4 -as=mafSummary.as -tab bigMafSummary.bed hg38.chrom.sizes bigMafSummary.bb
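Before the final bedToBigBed call, it can help to confirm that bigMafSummary.bed has the seven tab-separated columns listed in mafSummary.as and is sorted by chromosome and start, as the sort -k1,1 -k2,2n step intends. The following is a minimal sketch under those assumptions; check_maf_summary_bed is a hypothetical helper name, not a UCSC utility.

```shell
# Hypothetical helper (not a UCSC tool): verify that a bed3+4 summary file
# has 7 tab-separated fields per line and is sorted by chrom, then start.
check_maf_summary_bed() {
    awk -F '\t' '
        NF != 7 {
            printf "line %d: expected 7 fields, got %d\n", NR, NF
            bad = 1
        }
        $1 == prevChrom && $2 + 0 < prevStart {
            printf "line %d: start %s precedes previous start %d\n", NR, $2, prevStart
            bad = 1
        }
        $1 != prevChrom && ($1 in seen) {
            printf "line %d: chrom %s is not contiguous\n", NR, $1
            bad = 1
        }
        { seen[$1] = 1; prevChrom = $1; prevStart = $2 + 0 }
        END { exit bad + 0 }           # non-zero exit status if any check failed
    ' "$1"
}

# Example call (commented out so the sketch runs without the generated file):
# check_maf_summary_bed bigMafSummary.bed && echo "bigMafSummary.bed looks OK"
```

The helper only inspects layout and ordering. It does not validate the score or status values, which bedToBigBed checks against the autoSql definition.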
Step 7.
Move the newly created bigMaf file (bigMaf.bb) to a web-accessible http, https or ftp
location. If you generated the bigMafSummary.bb and/or bigMafFrames.bb files,
move those to a web accessible location, likely same location as the bigMaf.bb file.
Step 8.
Construct a custom track using a single
track line. Note that any of the track attributes listed
here are applicable to tracks of type bigBed. The most basic
version of the track line will look something like this:
track type=bigMaf name="My Big MAF" description="A Multiple Alignment" bigDataUrl=http://myorg.edu/mylab/bigMaf.bb summary=http://myorg.edu/mylab/bigMafSummary.bb frames=http://myorg.edu/mylab/bigMafFrames.bb
Step 9.