7a173a092486bb744c3f42d694aab108ecae34d5 dschmelt Wed Apr 10 12:21:42 2019 -0700
Adding GTF to bigGenePred Example4 #21582

diff --git src/hg/htdocs/goldenPath/help/bigGenePred.html src/hg/htdocs/goldenPath/help/bigGenePred.html
index 8e1c758..5cf8820 100755
--- src/hg/htdocs/goldenPath/help/bigGenePred.html
+++ src/hg/htdocs/goldenPath/help/bigGenePred.html
@@ -198,83 +198,82 @@
 <li>
 Run the <code>bedToBigBed</code> utility to create the bigGenePred output file (<em>step 4</em>, above):
 <pre><code><B>bedToBigBed</B> -type=bed12+8 -tab -as=bigGenePred.as bigGenePred.txt hg38.chrom.sizes bigGenePred.bb</code></pre></li>
 <li>
 Place the newly created bigGenePred file (<em>bigGenePred.bb</em>) on a web-accessible server (<em>Step 5</em>, above).</li>
 <li>
 Construct a track line that points to the bigGenePred file (<em>Step 6</em>, above).</li>
 <li>
 Create the custom track on the human assembly hg38 (Dec. 2013), and view it in the Genome Browser (<em>step 7</em>, above).</li>
 </ol>
 <h3>Example #4</h3>
-<p>In this example, you will convert a genePred file to bigGenePred using command line utilities.
+<p>In this example, you will convert a GTF file to bigGenePred using command line utilities.
 You can download utilities from the
 <a href="http://hgdownload.soe.ucsc.edu/admin/exe/">utilities directory</a>.</p>
 <ol>
 <li>
- Obtain a genePred extended file. In this example, we are downloading the Comprehensive Gencode V28 gene data.
- <pre><code>wget http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/wgEncodeGencodeCompV28.txt.gz</code></pre></li>
+ Obtain a GTF file using the wget command. Skip this step if you already have a GTF file.
+ <pre><code>wget http://genome.ucsc.edu/goldenPath/help/examples/bigGenePredExample4.gtf</code></pre></li>
 <li>
- Uncompress the file.
- <pre><code>gunzip wgEncodeGencodeCompV28.txt.gz </code></pre></li>
+ Convert the GTF file to genePred extended format using the gtfToGenePred command.
+ <pre><code>gtfToGenePred -genePredExt bigGenePredExample4.gtf example4.genePred</code></pre></li>
 <li>
-Isolate columns 2 till the end, removing the bin column, and saving as <em>wgCompV28Cut.txt</em>.
- <pre><code>cut -f 2- wgEncodeGencodeCompV28.txt > wgCompV28Cut.txt </code></pre></li>
+ Convert the genePred extended file to a pre-bigGenePred text file.
+ <pre><code>genePredToBigGenePred example4.genePred ex4BigGenePred.txt</code></pre></li>
 <li>
- Convert the genePred extended file to a bigGenePred text file, reordering and adding columns.
- <pre><code>genePredToBigGenePred wgCompV28Cut.txt wgEncodeGencodeCompV28BigGP.txt</code></pre></li>
- <li>
- Obtain input files for the binary conversion.
+ Obtain helper files for the conversion from pre-bigGenePred to binary bigGenePred.
 <pre><code>fetchChromSizes hg38 > hg38.chrom.sizes
-wget https://hgwdev.gi.ucsc.edu/goldenPath/help/examples/bigGenePred.as</code></pre></li>
+wget http://genome.ucsc.edu/goldenPath/help/examples/bigGenePred.as</code></pre></li>
 <li>
 Convert your text bigGenePred to a binary indexed format.
- <pre><code>bedToBigBed -type=bed12+8 -tab -as=bigGenePred.as wgEncodeGencodeCompV28BigGP.txt hg38.chrom.sizes wgEncodeGencodeCompV28.bgp</code></pre></li>
+ <pre><code>bedToBigBed -type=bed12+8 -tab -as=bigGenePred.as ex4BigGenePred.txt hg38.chrom.sizes ex4BigGenePred.bb</code></pre></li>
 <li>
- Put your binary indexed file in a web-accessible location. See the <a href=”https://genome.ucsc.edu/goldenpath/help/hgTrackHubHelp.html#Hosting”>hosting section</a> for more information.</li>
+ Put your binary indexed file in a web-accessible location. See the
+ <a href=”https://genome.ucsc.edu/goldenpath/help/hgTrackHubHelp.html#Hosting”>hosting section</a> for more information.</li>
 <li>
- View your dataset in the Browser by entering your data URL in the bigDataUrl field of the URL.
- <pre><code>http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&hgct_customText=track%20type=bigGenePred%20bigDataUrl=https://hgwdev.gi.ucsc.edu/~dschmelt/wgEncodeGencodeCompV28.bgp</code></pre>
+ View your dataset in the Browser by entering your hosted data URL in the bigDataUrl field of the
+ URL. For example, you can paste this link into your web browser.
+ <pre><code>http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&hgct_customText=track%20type=bigGenePred%20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/ex4bigGenePred.bb</code></pre>
 You can also add your data in the <a href="../../cgi-bin/hgCustom?db=hg38">custom track management page</a>. This allows you to set position, configuration options, and write a more complete
-desciption. If you want to see codons, you will have to right click to configure codon view or
-set this option using the <code>baseColorDefault=genomicCodons</code> code as is done below.
+desciption. If you want to see codons, you can right click, then click configure codon view or
+set these options using the <code>baseColorDefault=genomicCodons</code> code as is done below.
 <pre><code>browser position chr10:67,884,600-67,884,900
-track type=bigGenePred baseColorDefault=genomicCodons name="bigGenePred Example Four" description="BGP Made from genePred" visibility=pack bigDataUrl=https://hgwdev.gi.ucsc.edu/~dschmelt/wgEncodeGencodeCompV28.bgp</code></pre></li>
+track type=bigGenePred baseColorDefault=genomicCodons name="bigGenePred Example Four" description="BGP Made from genePred" visibility=pack bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/ex4bigGenePred.bb</code></pre></li>
 </ol>
 <h2>Sharing your data with others</h2>
 <p>
 If you would like to share your bigGenePred data track with a colleague, learn how to create a URL link to your data by looking at <a href="customTrack.html#EXAMPLE6">Example #6</a>.</p>
 <h2>Extracting data from bigBed format</h2>
 <p>
 Because the bigGenePred files are an extension of bigBed files, which are indexed binary files, it can be difficult to extract data from them. UCSC has developed the following programs to assist in working with bigBed formats, available from the
-<a href="http://hgdownload.soe.ucsc.edu/admin/exe/">binary utilities directory</a>.</p>
-<ul>
- <li>
 <code>bigBedToBed</code> — converts a bigBed file to ASCII BED format.</li>
 <li>
 <code>bigBedSummary</code> — extracts summary information from a bigBed file.</li>
 <li>
 <code>bigBedInfo</code> — prints out information about a bigBed file.</li>
 </ul>
 <p>
 As with all UCSC Genome Browser programs, simply type the program name (with no parameters) at the command line to view the usage statement.</p>
 <h2>Troubleshooting</h2>
 <p>
 If you encounter an error when you run the <code>bedToBigBed</code> program, check your input file for data coordinates that extend past the end of the chromosome. If these are present, run the <code>bedClip</code> program (<a href="http://hgdownload.soe.ucsc.edu/admin/exe/">available here</a>) to remove the problematic row(s) before running the <code>bedToBigBed</code> program.
 </p>
 <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->
+<!DOCTYPE html>
+<!--#set var="TITLE" value="Genome Browser bigGenePred Track Format" -->
+<!--#set var="ROOT" value="../.." -->
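
Note on the last step of Example #4: the hgTracks link shown there is a custom track line with its spaces written as %20, appended to the hgct_customText parameter. The following is a minimal sketch of how such a link could be assembled from a hosted file URL, assuming a POSIX shell with sed available. The function name make_track_url is hypothetical and is not one of the UCSC utilities; it only reproduces the URL pattern that appears in the diff above.

```shell
#!/bin/sh
# Hypothetical helper (not a UCSC utility): build an hgTracks link that loads a
# hosted bigGenePred file as a custom track, following the URL pattern used in
# Example #4. The body runs in a subshell so its variables stay local.
make_track_url() (
    db=$1             # assembly name, e.g. hg38
    big_data_url=$2   # web-accessible location of the .bb file
    track_line="track type=bigGenePred bigDataUrl=$big_data_url"
    # Encode only the spaces, as the example link in the help page does.
    encoded=$(printf '%s' "$track_line" | sed 's/ /%20/g')
    printf 'http://genome.ucsc.edu/cgi-bin/hgTracks?db=%s&hgct_customText=%s\n' \
        "$db" "$encoded"
)

# Example call using the hosted file named in the new version of Example #4.
make_track_url hg38 http://genome.ucsc.edu/goldenPath/help/examples/ex4bigGenePred.bb
```

Called with the arguments shown, the sketch prints the same link that the added line in Example #4 tells readers to paste into a web browser. Track settings that contain spaces or quotes (for example a name or description) would need fuller URL encoding than this sketch provides.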