cc610239716fe32f9c774d98a71f75e8c6b5fba3 braney Tue Apr 21 17:23:50 2026 -0700

mafToBigMafSummary: new utility that emits bed3+4 input ready for bedToBigBed -as=mafSummary.as, replacing the hgLoadMafSummary -test / cut -f2- / sort hack documented in bigMaf.html. Companion to mafToBigMaf. The summary scoring/merging logic is intentionally duplicated from hgLoadMafSummary.c (see header comment) -- that code is stable and refactoring would force retesting all the makedocs that call hgLoadMafSummary. Also fixes a bug in the duplicated copy of mafSplitSrcGetChrom: the original errAborts on plain 'hg38.chrY' style master src because of an inverted differentString check; rewritten with cleaner correct logic. Updates bigMaf.html and trackDb/trackDbLibrary.shtml to reference the new tool. refs #37404

diff --git src/hg/htdocs/goldenPath/help/bigMaf.html src/hg/htdocs/goldenPath/help/bigMaf.html
index e1c365a9a59..849b5d3d516 100755
--- src/hg/htdocs/goldenPath/help/bigMaf.html
+++ src/hg/htdocs/goldenPath/help/bigMaf.html
@@ -56,31 +56,31 @@
 <h6>mafSummary.as</h6>
 <pre><code>table mafSummary
 "Positions and scores for alignment blocks"
 (
 string chrom; "Reference sequence chromosome or scaffold"
 uint chromStart; "Start position in chromosome"
 uint chromEnd; "End position in chromosome"
 string src; "Sequence name or database of alignment"
 float score; "Floating point score."
 char[1] leftStatus; "Gap/break annotation for preceding block"
 char[1] rightStatus; "Gap/break annotation for following block"
 )</code></pre>
 <p>
 An example, <code>bedToBigBed -type=bed3+4 -as=mafSummary.as -tab bigMafSummary.bed hg38.chrom.sizes bigMafSummary.bb</code>.
-Another tool, <code>hgLoadMafSummary</code> generates the input
+Another tool, <code>mafToBigMafSummary</code> generates the input
 <code>bigMafSummary.bed</code> file.</p>
 <p>
 The following autoSql definition is used to create the second file, pointed to online with
 <code>frames <url></code>.
 The file <a href="examples/mafFrames.as"><em>mafFrames.as</em></a>, is pulled in when the
 <code>bedToBigBed</code> utility is run with the <code>-as=mafFrames.as</code> option.</p>
 <h6>mafFrames.as</h6>
 <pre><code>table mafFrames
 "codon frame assignment for MAF components"
 (
 string chrom; "Reference sequence chromosome or scaffold"
 uint chromStart; "Start range in chromosome"
 uint chromEnd; "End range in chromosome"
 string src; "Name of sequence source in MAF"
@@ -122,60 +122,59 @@
 <a href="examples/mafFrames.as">mafFrames.as</a> files.</p>
 <p>
 Here are wget commands to obtain the above files and the hg38.chrom.sizes file mentioned below:
 <pre><code>wget https://genome.ucsc.edu/goldenPath/help/examples/chr22_KI270731v1_random.maf
 wget https://genome.ucsc.edu/goldenPath/help/examples/chr22_KI270731v1_random.gp
 wget https://genome.ucsc.edu/goldenPath/help/examples/bigMaf.as
 wget https://genome.ucsc.edu/goldenPath/help/examples/mafSummary.as
 wget https://genome.ucsc.edu/goldenPath/help/examples/mafFrames.as
 wget http://hgdownload.gi.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes
 </code></pre>
 <p>
 <strong>Step 4.</strong>
 Download the <code>bedToBigBed</code> and <code>mafToBigMaf</code> programs from the UCSC
 <a href="http://hgdownload.gi.ucsc.edu/admin/exe/">binary utilities directory</a>.
 If you have opted to generate the optional frame and summary files for your multiple alignment, you must also
-download the <code>hgLoadMafSummary</code>, <code>genePredSingleCover</code>, and
+download the <code>mafToBigMafSummary</code>, <code>genePredSingleCover</code>, and
 <code>genePredToMafFrames</code> programs from the same
 <a href="http://hgdownload.gi.ucsc.edu/admin/exe/">directory</a>.</p>
 <p>
 <strong>Step 5.</strong>
 Use the <code>fetchChromSizes</code> script from the
 <a href="http://hgdownload.gi.ucsc.edu/admin/exe/">same directory</a>
 to create a <em>chrom.sizes</em> file for the UCSC database with which you are working (e.g., hg38).
 Alternatively, you can download the <em>chrom.sizes</em> file for any assembly hosted at UCSC from
 our <a href="http://hgdownload.gi.ucsc.edu/downloads.html">downloads</a> page (click on "Full data set" for
 any assembly). For example, the <em>hg38.chrom.sizes</em> file for the hg38 database is located at
 <a href="http://hgdownload.gi.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes" target="_blank">http://hgdownload.gi.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes</a>.</p>
 <pre><code>mafToBigMaf hg38 chr22_KI270731v1_random.maf stdout | sort -k1,1 -k2,2n > bigMaf.txt
 bedToBigBed -type=bed3+1 -as=bigMaf.as -tab bigMaf.txt hg38.chrom.sizes bigMaf.bb
 </code></pre>
 <p>Note that the hg38 in the mafToBigMaf hg38 command indicates the referenceDb and matches the expected prefix of the primary species' sequence name, for instance hg38 for the hg38.chr22_KI270731v1_random found in the input example chr22_KI270731v1_random.maf file.</p>
 <p>
 <strong>Step 6.</strong>
 Follow the below steps to create the binary indexed mafFrames and mafSummary files to accompany your bigMaf file:</p>
 <pre><code>genePredSingleCover chr22_KI270731v1_random.gp single.gp
 genePredToMafFrames hg38 chr22_KI270731v1_random.maf bigMafFrames.txt hg38 single.gp
 bedToBigBed -type=bed3+8 -as=mafFrames.as -tab bigMafFrames.txt hg38.chrom.sizes bigMafFrames.bb
-hgLoadMafSummary -minSeqSize=1 -test hg38 bigMafSummary chr22_KI270731v1_random.maf
-cut -f2- bigMafSummary.tab | sort -k1,1 -k2,2n > bigMafSummary.bed
+mafToBigMafSummary hg38 chr22_KI270731v1_random.maf stdout | sort -k1,1 -k2,2n > bigMafSummary.bed
 bedToBigBed -type=bed3+4 -as=mafSummary.as -tab bigMafSummary.bed hg38.chrom.sizes bigMafSummary.bb
 </code></pre>
 <p>
 <strong>Step 7.</strong>
 Move the newly created bigMaf file (<em>bigMaf.bb</em>) to a web-accessible http, https or ftp location.
 If you generated the <em>bigMafSummary.bb</em> and/or <em>bigMafFrames.bb</em> files, move those to a web accessible location, likely same location as the <em>bigMaf.bb</em> file.</p>
 <p>
 <strong>Step 8.</strong>
 Construct a <a href="hgTracksHelp.html#CustomTracks">custom track</a> using a single
 <a href="hgTracksHelp.html#TRACK">track line</a>.
 Note that any of the track attributes listed
 <a href="customTrack.html#TRACK">here</a> are applicable to tracks of type bigBed.
 The most basic version of the track line will look something like this:</p>
 <pre><code>track type=bigMaf name="My Big MAF" description="A Multiple Alignment" bigDataUrl=http://myorg.edu/mylab/bigMaf.bb summary=http://myorg.edu/mylab/bigMafSummary.bb frames=http://myorg.edu/mylab/bigMafFrames.bb</code></pre>
 <p>
 <strong>Step 9.</strong>