cc610239716fe32f9c774d98a71f75e8c6b5fba3 braney Tue Apr 21 17:23:50 2026 -0700
mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment); that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404
diff --git src/hg/htdocs/goldenPath/help/bigMaf.html src/hg/htdocs/goldenPath/help/bigMaf.html
index e1c365a9a59..849b5d3d516 100755
--- src/hg/htdocs/goldenPath/help/bigMaf.html
+++ src/hg/htdocs/goldenPath/help/bigMaf.html
@@ -56,31 +56,31 @@
mafSummary.as
table mafSummary
 "Positions and scores for alignment blocks"
     (
     string chrom;      "Reference sequence chromosome or scaffold"
     uint   chromStart; "Start position in chromosome"
     uint   chromEnd;   "End position in chromosome"
     string src;        "Sequence name or database of alignment"
     float  score;      "Floating point score."
     char[1] leftStatus;  "Gap/break annotation for preceding block"
     char[1] rightStatus; "Gap/break annotation for following block"
     )

An example, bedToBigBed -type=bed3+4 -as=mafSummary.as -tab bigMafSummary.bed hg38.chrom.sizes bigMafSummary.bb.
-Another tool, hgLoadMafSummary generates the input
+Another tool, mafToBigMafSummary generates the input
 bigMafSummary.bed file.
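
As a rough illustration of the bed3+4 layout that mafSummary.as describes, the Python sketch below parses one tab-separated row into the seven declared fields. The helper name parse_maf_summary_row, the sample species name, and the sample values are hypothetical and are not part of the UCSC tools; the checks shown are only the ones implied by the field types in the autoSql definition.

```python
from typing import NamedTuple


class MafSummaryRow(NamedTuple):
    """One bed3+4 row, mirroring the field names in mafSummary.as."""
    chrom: str
    chromStart: int
    chromEnd: int
    src: str
    score: float
    leftStatus: str
    rightStatus: str


def parse_maf_summary_row(line: str) -> MafSummaryRow:
    """Parse one tab-separated line (hypothetical helper, not a UCSC API)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 tab-separated fields, got {len(fields)}")
    chrom, start, end, src, score, left, right = fields
    row = MafSummaryRow(chrom, int(start), int(end), src, float(score), left, right)
    # uint fields cannot be negative, and a range cannot end before it starts.
    if not 0 <= row.chromStart <= row.chromEnd:
        raise ValueError("need 0 <= chromStart <= chromEnd")
    # leftStatus and rightStatus are declared as char[1].
    if len(row.leftStatus) != 1 or len(row.rightStatus) != 1:
        raise ValueError("leftStatus and rightStatus must be single characters")
    return row


# Made-up sample row, for illustration only.
example = parse_maf_summary_row(
    "chr22_KI270731v1_random\t0\t100\tpanTro4\t0.95\tC\tC\n"
)
```

A row that fails these checks would most likely also be rejected by bedToBigBed, so a check of this kind is only a convenience for catching problems earlier.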

The following autoSql definition is used to create the second file, pointed to online with frames <url>. The file mafFrames.as, is pulled in when the bedToBigBed utility is run with the -as=mafFrames.as option.

mafFrames.as
table mafFrames
 "codon frame assignment for MAF components"
     (
     string chrom;      "Reference sequence chromosome or scaffold"
     uint   chromStart; "Start range in chromosome"
     uint   chromEnd;   "End range in chromosome"
     string src;        "Name of sequence source in MAF"
@@ -122,60 +122,59 @@
 mafFrames.as files.

Here are wget commands to obtain the above files and the hg38.chrom.sizes file mentioned below:

wget https://genome.ucsc.edu/goldenPath/help/examples/chr22_KI270731v1_random.maf
 wget https://genome.ucsc.edu/goldenPath/help/examples/chr22_KI270731v1_random.gp
 wget https://genome.ucsc.edu/goldenPath/help/examples/bigMaf.as
 wget https://genome.ucsc.edu/goldenPath/help/examples/mafSummary.as
 wget https://genome.ucsc.edu/goldenPath/help/examples/mafFrames.as
 wget http://hgdownload.gi.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes
 

Step 4. Download the bedToBigBed and mafToBigMaf programs from the UCSC binary utilities directory. If you have opted to generate the optional frame and summary files for your multiple alignment, you must also
-download the hgLoadMafSummary, genePredSingleCover, and
+download the mafToBigMafSummary, genePredSingleCover, and
 genePredToMafFrames programs from the same directory.

Step 5. Use the fetchChromSizes script from the same directory to create a chrom.sizes file for the UCSC database with which you are working (e.g., hg38). Alternatively, you can download the chrom.sizes file for any assembly hosted at UCSC from our downloads page (click on "Full data set" for any assembly). For example, the hg38.chrom.sizes file for the hg38 database is located at http://hgdownload.gi.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes.

mafToBigMaf hg38 chr22_KI270731v1_random.maf stdout | sort -k1,1 -k2,2n > bigMaf.txt
 bedToBigBed -type=bed3+1 -as=bigMaf.as -tab bigMaf.txt hg38.chrom.sizes bigMaf.bb 

Note that the hg38 in the mafToBigMaf hg38 command indicates the referenceDb and matches the expected prefix of the primary species' sequence name, for instance hg38 for the hg38.chr22_KI270731v1_random found in the input example chr22_KI270731v1_random.maf file.
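
To make the prefix requirement concrete, here is a minimal Python sketch that checks whether the first sequence line of every alignment block in a MAF file starts with the reference database name followed by a dot. It assumes the standard MAF layout, in which each block begins with an "a" line and each sequence line has the form "s src start size strand srcSize text". The function names and the sample alignment are hypothetical, and this is not how mafToBigMaf itself performs the check.

```python
from typing import List


def reference_sources(maf_text: str) -> List[str]:
    """Return the src name from the first 's' line of each alignment block."""
    sources = []
    expecting_first = False
    for line in maf_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "a":
            # A new alignment block starts; its first 's' line is the reference.
            expecting_first = True
        elif fields[0] == "s" and expecting_first:
            sources.append(fields[1])
            expecting_first = False
    return sources


def has_reference_prefix(maf_text: str, reference_db: str) -> bool:
    """True when every block's first component is named '<reference_db>.<chrom>'."""
    sources = reference_sources(maf_text)
    prefix = reference_db + "."
    return bool(sources) and all(src.startswith(prefix) for src in sources)


# Made-up two-block alignment, for illustration only.
SAMPLE_MAF = """##maf version=1
a score=100.0
s hg38.chr22_KI270731v1_random 0 4 + 150754 ACGT
s panTro4.chr22 10 4 + 1000 ACGT

a score=50.0
s hg38.chr22_KI270731v1_random 4 4 + 150754 TTGA
s panTro4.chr22 14 4 + 1000 TTGA
"""

print(has_reference_prefix(SAMPLE_MAF, "hg38"))
```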

Step 6. Follow the below steps to create the binary indexed mafFrames and mafSummary files to accompany your bigMaf file:

genePredSingleCover chr22_KI270731v1_random.gp single.gp
 genePredToMafFrames hg38 chr22_KI270731v1_random.maf bigMafFrames.txt hg38 single.gp
 bedToBigBed -type=bed3+8 -as=mafFrames.as -tab bigMafFrames.txt hg38.chrom.sizes bigMafFrames.bb
 
-hgLoadMafSummary -minSeqSize=1 -test hg38 bigMafSummary chr22_KI270731v1_random.maf
-cut -f2- bigMafSummary.tab | sort -k1,1 -k2,2n > bigMafSummary.bed
+mafToBigMafSummary hg38 chr22_KI270731v1_random.maf stdout | sort -k1,1 -k2,2n > bigMafSummary.bed
 bedToBigBed -type=bed3+4 -as=mafSummary.as -tab bigMafSummary.bed hg38.chrom.sizes bigMafSummary.bb 
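
bedToBigBed expects its input sorted by chromosome and then by start position, which is why the mafToBigMafSummary output is piped through sort -k1,1 -k2,2n. The Python sketch below shows one way to confirm that ordering before running bedToBigBed. The function name is hypothetical, and the chromosome comparison assumes plain byte (C-locale) ordering, which may differ from what sort produces under other locales.

```python
from typing import Iterable


def is_sorted_bed(lines: Iterable[str]) -> bool:
    """True when rows are ordered by chrom (code-point order), then numeric start."""
    previous = None
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        chrom, start = line.split("\t")[:2]
        key = (chrom, int(start))
        if previous is not None and key < previous:
            return False
        previous = key
    return True


# Made-up rows, for illustration only.
rows = [
    "chr1\t10\t20\tpanTro4\t0.9\tC\tC",
    "chr1\t30\t40\tpanTro4\t0.8\tC\tC",
    "chr2\t5\t9\tpanTro4\t0.7\tC\tC",
]
print(is_sorted_bed(rows))
```

With a real file, the same function could be called on an open file handle, for example is_sorted_bed(open("bigMafSummary.bed")).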

Step 7. Move the newly created bigMaf file (bigMaf.bb) to a web-accessible http, https or ftp location. If you generated the bigMafSummary.bb and/or bigMafFrames.bb files, move those to a web accessible location, likely same location as the bigMaf.bb file.

Step 8. Construct a custom track using a single track line. Note that any of the track attributes listed here are applicable to tracks of type bigBed. The most basic version of the track line will look something like this:

track type=bigMaf name="My Big MAF" description="A Multiple Alignment" bigDataUrl=http://myorg.edu/mylab/bigMaf.bb summary=http://myorg.edu/mylab/bigMafSummary.bb frames=http://myorg.edu/mylab/bigMafFrames.bb
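
The track line is a sequence of key=value settings following the word "track", with quotes around values that contain spaces. As a hedged sketch, the Python snippet below splits such a line into a dictionary so that the bigDataUrl, summary, and frames settings can be inspected before the line is pasted into the browser. The helper parse_track_line is hypothetical and is not part of the Genome Browser; the URLs are the placeholder ones from the example above.

```python
import shlex
from typing import Dict


def parse_track_line(line: str) -> Dict[str, str]:
    """Split a custom track line into its key=value settings (illustrative only)."""
    tokens = shlex.split(line)  # honours the double quotes around multi-word values
    if not tokens or tokens[0] != "track":
        raise ValueError("a track line must begin with the word 'track'")
    settings = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"setting without '=': {token!r}")
        settings[key] = value
    return settings


track = parse_track_line(
    'track type=bigMaf name="My Big MAF" description="A Multiple Alignment" '
    "bigDataUrl=http://myorg.edu/mylab/bigMaf.bb "
    "summary=http://myorg.edu/mylab/bigMafSummary.bb "
    "frames=http://myorg.edu/mylab/bigMafFrames.bb"
)
print(sorted(track))
```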

Step 9.