fe27dc07ef2406c8d42261c6931087f870363207
dschmelt
  Tue Apr 16 16:30:42 2019 -0700
Adding an image and proofreading bigGenePred.html

diff --git src/hg/htdocs/goldenPath/help/bigGenePred.html src/hg/htdocs/goldenPath/help/bigGenePred.html
index 276a79c..e372b0b 100755
--- src/hg/htdocs/goldenPath/help/bigGenePred.html
+++ src/hg/htdocs/goldenPath/help/bigGenePred.html
@@ -1,286 +1,286 @@
 <!DOCTYPE html>
 <!--#set var="TITLE" value="Genome Browser bigGenePred Track Format" -->
 <!--#set var="ROOT" value="../.." -->
 
 <!-- Relative paths to support mirror sites with non-standard GB docs install -->
 <!--#include virtual="$ROOT/inc/gbPageStart.html" -->
 
 <h1>bigGenePred Track Format</h1> 
 <p>
-The bigGenePred format stores positional annotation items for collections of exons in a compressed
-format, similar to how <a href="../../FAQ/FAQformat.html#format1">BED</a> files can be compressed
+The bigGenePred format stores positional annotations for collections of exons in a compressed
+format, similar to how <a href="../../FAQ/FAQformat.html#format1">BED</a> files are compressed
 into bigBeds. The bigGenePred format includes 8 additional fields that contain details about coding
 frames, annotation status, and other gene-specific information. This is commonly used in the Browser
 to display start codons, stop codons, and amino acid translations.</p>
 <p>
 Before compression, bigGenePred files can be described as bed12+8 files. bigGenePred
 files can be created using the program <code>bedToBigBed</code>, run with the <code>-as</code>
 option to pull in a special <a href="http://www.linuxjournal.com/article/5949" target="_blank">
 autoSql</a> (<em>.as</em>) file that defines the extra fields of the bigGenePred.</p> 
 <p>
 Much like bigBed, bigGenePred files are in an indexed binary format. The advantage of using a binary
 format is that only the portions of the file needed to display a particular region are read by the
 Genome Browser server. Because of this, indexed binary files have much faster display performance 
-than regular BED format files when working with large data sets. The bigGenePred file remains on 
-the user's web-accessible server (http, https or ftp) and only the portion 
-needed to display the current genome position is cached as a &quot;sparse file&quot;. If you want
-more information on finding a a web-accessible server or need hosting space for your
-bigGenePred files, please see the <a href="hgTrackHubHelp.html#Hosting">Hosting</a> section of the
-Track Hub Help documentation.</p>
+than regular BED format files when working with large data sets. As with all big files, bigGenePred
+files must be hosted on a web-accessible server (http, https, or ftp) to be displayed. For
+more information on finding a web-accessible location to host your bigGenePred files, please see
+the <a href="hgTrackHubHelp.html#Hosting">hosting section</a> of the Track Hub Help documentation.
+</p>
 
 <a name="bigGenePred"></a>
-<h2>bigGenePred file definition</h2>
+<h2>bigGenePred format description</h2>
 <p>
 The following autoSql definition specifies bigGenePred gene prediction files. This 
 definition, contained in the file <a href="examples/bigGenePred.as"><em>bigGenePred.as</em></a>, 
 is pulled in when the <code>bedToBigBed</code> utility is run with the 
 <code>-as=bigGenePred.as</code> option. 
 <pre><code>table bigGenePred
 "bigGenePred gene models"
     (
     string chrom;       	"Reference sequence chromosome or scaffold"
     uint   chromStart;  	"Start position in chromosome" 
     uint   chromEnd;    	"End position in chromosome"
     string name;        	"Name or ID of item, ideally both human-readable and unique"
     uint score;         	"Score (0-1000)"
     char[1] strand;     	"+ or - for strand"
     uint thickStart;    	"Start of where display should be thick (start codon)"
     uint thickEnd;      	"End of where display should be thick (stop codon)"
     uint reserved;       	"RGB value (use R,G,B string in input file)"
     int blockCount;     	"Number of blocks"
     int[blockCount] blockSizes; "Comma separated list of block sizes"
     int[blockCount] chromStarts;"Start positions relative to chromStart"
     string name2;       	"Alternative/human readable name"
     string cdsStartStat; 	"Status of CDS start annotation (none, unknown, incomplete, or complete)"
     string cdsEndStat;   	"Status of CDS end annotation (none, unknown, incomplete, or complete)"
     int[blockCount] exonFrames; "Exon frame {0,1,2}, or -1 if no frame for exon"
     string type;        	"Transcript type"
     string geneName;    	"Primary identifier for gene"
     string geneName2;   	"Alternative/human-readable gene name"
     string geneType;    	"Gene type"
     )  </code></pre>
 
 <p>
-The following bed12+8 is an example of a <a href="examples/bigGenePred.txt">bigGenePred input file
+The following bed12+8 is an example of a <a href="examples/bigGenePred.txt">pre-bigGenePred text file
 </a>.</p>
 
-<h2>Creating a bigGenePred track</h2>
-<p>
-To create a bigGenePred track, follow these steps:</p>
+<h2>Creating a bigGenePred track from a bed12+8 file</h2>
 <p>
 <strong>Step 1.</strong> 
-Format your bigGenePred file. The first 12 fields of the bigGenePred bed12+8 format are described by the
-basic <a href="../../FAQ/FAQformat.html#format1">BED file format</a>. You can also read
-the <a href="../../FAQ/FAQformat.html#format9">genePred format</a>. 
-Your bigGenePred file must also contain the 8 extra fields described in the autoSql file definition 
+Format your pre-bigGenePred file. The first 12 fields of pre-bigGenePred files are described by the
+<a href="../../FAQ/FAQformat.html#format1">BED file format</a>. Your file must
+also contain the 8 extra fields described in the autoSql file definition 
 shown above: <code>name2, cdsStartStat, cdsEndStat, exonFrames, type, geneName, geneName2, 
-geneType</code>. For reference, you can use this example bed12+8 input file, 
-<a href="examples/bigGenePred.txt">bigGenePred.txt</a>. Your bigGenePred file must be sorted 
+geneType</code>. For example, you can use this bed12+8 input file, 
+<a href="examples/bigGenePred.txt">bigGenePred.txt</a>. Your pre-bigGenePred file must be sorted 
 first on the <code>chrom</code> field, and secondarily on the <code>chromStart</code> field. You
 can use the UNIX <code>sort</code> command to do this:</p> 
 <pre><code>sort -k1,1 -k2,2n unsorted.bed &gt; input.bed</code></pre>
 <p>
 <strong>Step 2.</strong> 
 Download the <code>bedToBigBed</code> program from the 
 <a href="http://hgdownload.soe.ucsc.edu/admin/exe/">binary utilities directory</a>.</p>
 <p>
 <strong>Step 3.</strong> 
-Download the <em>chrom.sizes</em> file for your genome assembly using the <code>fetchChromSizes</code>
-script from the <a href="http://hgdownload.soe.ucsc.edu/admin/exe/">same directory</a>.
-Alternatively, you can download the <em>chrom.sizes</em> file for any assembly hosted at UCSC from 
+Download the <em>chrom.sizes</em> file for your assembly from
 our <a href="http://hgdownload.soe.ucsc.edu/downloads.html">downloads</a> page (click on &quot;Full
-data set&quot; for any assembly). For example, the <em>hg38.chrom.sizes</em> file for the hg38
+data set&quot; for your organism). For example, the <em>hg38.chrom.sizes</em> file for the hg38
 database is located at
 <a href="http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes"
-target="_blank">http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes</a>.</p>
+target="_blank">http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes</a>.
+Alternatively, you can use the <code>fetchChromSizes</code> script from the 
+<a href="http://hgdownload.soe.ucsc.edu/admin/exe/">utilities directory</a>.
 <p>
 <strong>Step 4.</strong> 
-Create the bigGenePred file from your sorted input file using the <code>bedToBigBed</code> 
+Create the bigGenePred file from your pre-bigGenePred file using the <code>bedToBigBed</code> 
 utility command:</p> 
 <pre><code><strong>bedToBigBed</strong> -as=bigGenePred.as -type=bed12+8 bigGenePred.txt chrom.sizes myBigGenePred.bb</code></pre>
 <p>
 <strong>Step 5.</strong> 
 Move the newly created bigGenePred file (<em>myBigGenePred.bb</em>) to a web-accessible http, https,
 or ftp location. See <a href="hgTrackHubHelp.html#Hosting">hosting section</a> if necessary.</p>
 <p>
 <strong>Step 6.</strong> 
 Construct a <a href="hgTracksHelp.html#CustomTracks">custom track</a> using a single 
-<a href="hgTracksHelp.html#TRACK">track line</a>. Note that any of the track attributes listed
-<a href="customTrack.html#TRACK">here</a> are applicable to tracks of type bigBed. The basic 
-version of the track line will look something like this:</p>
+<a href="hgTracksHelp.html#TRACK">track line</a>. Any of the track attributes will be
+available for use on bigBed tracks. The basic version of the track line will look something 
+like this:</p>
 <pre><code>track type=bigGenePred name="My Big GenePred" description="A Gene Set Built from Data from My Lab" bigDataUrl=http://myorg.edu/mylab/myBigGenePred.bb</code></pre>
 <p>
 <strong>Step 7.</strong> 
 Paste this custom track line into the text box on the <a href="../../cgi-bin/hgCustom">custom 
-track management page</a>.</p>
-<p>
-The <code>bedToBigBed</code> program can be run with several additional options. For a full list of 
-the available options, type <code>bedToBigBed</code> (with no arguments) on the command line to 
-display the usage message.</p>
+track management page</a>. Click <button>Go</button> and you should be taken to a </p>
 
 <h2>Examples</h2>
 <a name="Example1"></a>
-<h3>Example &num;1: Create Custom Track</h3>
+<h3>Example &num;1: Create a Custom Track</h3>
 <p>
-In this example, you will create a bigGenePred custom track using a bigGenePred file located 
+Create a bigGenePred custom track using the bigGenePred file located 
 on the UCSC Genome Browser http server, <em>bigGenePred.bb</em>. This file contains data for
 the hg38 assembly.</p>
-<p>
-To create a custom track using this bigGenePred file:
 <ol>
   <li>
   Construct a track line that references the hosted file:</p>
   <pre><code>track type=bigGenePred name=&quot;bigGenePred Example One&quot; description=&quot;A bigGenePred file&quot; bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePred.bb</code></pre>
   <li>
   Paste the track line into the <a href="../../cgi-bin/hgCustom?db=hg38">custom track management 
-  page</a> for the human assembly hg38 (Dec. 2013).</li> 
+  page</a> for the human assembly hg38.</li> 
   <li>
   Click the <button>Submit</button> button.</li> 
 </ol>
 <p>
 Custom tracks can also be loaded via one URL line. The link below loads the same bigGenePred track 
 and sets additional parameters in the URL:</p>
-<a href="http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&hgct_customText=track&type=bigGenePred&bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePred.bb"
-target="_blank"><pre><code>http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&hgct_customText=track&type=bigGenePred&bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePred.bb</code></pre></a>
+<a href="http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&hgct_customText=track%20type=bigGenePred%20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePred.bb"
+target="_blank"><pre><code>http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&amp;hgct_customText=track%20type=bigGenePred%20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePred.bb</code></pre></a>
 <p>
-After this example bigGenePred track is loaded in the Genome Browser, click on a gene in the
-browser's track display to view the details page for that gene. Note that the page offers links to
-several sequence types, including translated protein, predicted mRNA, and genomic sequence.</p>
+After this example bigGenePred track is loaded in the Genome Browser, click on the track to 
+change display from dense to pack, then click on a gene in the browser's track display to view 
+the gene details page. Note that the page offers links to translated protein, predicted mRNA,
+and genomic sequence.</p>
+
 <a name="Example2"></a>
 <h3>Example &num;2: Display Amino Acids</h3>
 <p>
 In this example, you will configure the bigGenePred track loaded in Example #1 to display codons and
 amino acid numbering:
 <ol>
   <li>
-  On the bottom of the gene details page, click the &quot;Go to ... track controls&quot; link.</li>
+  On the bottom of the gene details page, click the &quot;Go to User Track track controls&quot; link.</li>
   <li>
   Change the &quot;Color track by codons:&quot; option from &quot;OFF&quot; to &quot;genomic 
-  codons&quot; and check that the display mode is set to &quot;full&quot. Then click 
+  codons&quot; and change the display mode to &quot;full&quot;. Then click 
   <button>Submit</button>. 
   <li>
-  On the Genome Browser tracks display, zoom to a track region where amino acids display, such as 
-  <code>chr9:133,255,650-133,255,700</code>, and note that the track now displays codons. 
+  Zoom to a region with track data, such as 
+  <code>chr9:133,255,650-133,255,700</code>, and note that the track now displays amino acids. 
   <li>
   Return to the track controls page and click the box next to &quot;Show codon numbering&quot;, 
-  then click <button>Submit</button>.</li> 
-  <li>
-  The browser tracks display will now show amino acid numbering.</li>
+  then click <button>Submit</button>. The browser tracks display will now show amino acid
+  letters and numbering.</li>
 </ol>
 <p>
-You can also add a parameter in the custom track line, <code>baseColorDefault=genomicCodons</code>, 
-to set display codons by default:</p>
+Alternatively, you can also add a parameter in the custom track line, <code>baseColorDefault=genomicCodons</code>, 
+to set amino acids to display by default:</p>
 <pre><code>browser position chr10:67,884,600-67,884,900
 track type=bigGenePred baseColorDefault=genomicCodons name="bigGenePred Example Two" description="A bigGenePred file" visibility=pack bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePred.bb </code></pre>
 <p>
 Paste the above into the hg38 <a href="../../cgi-bin/hgCustom?db=hg38">custom track management 
 page</a> to view an example of bigGenePred amino acid display at the beginning of the SIRT1 gene on 
 chromosome 10.</p>
 <p class='text-center'>
   <img class='text-center' src="../../images/codonColoring.png" alt="An image of a track with codons colored" width="999"> 
 </p>
 <a name="Example3"></a>
-<h3>Example &num;3: Pre-bigGenePred to BigGenePred</h3>
+<h3>Example &num;3: Bed12+8 to BigGenePred</h3>
 <p>
 In this example, you will create your own bigGenePred file from an existing pre-bigGenePred input
-file.</p>
+file, a bed12+8 file.</p>
 <ol>
   <li> 
   Save the example bed12+8 input file to your computer,
   <em><a href="examples/bigGenePred.txt">bigGenePred.txt</a></em>.</li>
   <li> 
   Download the <a href="http://hgdownload.soe.ucsc.edu/admin/exe/"><code>bedToBigBed</code></a>
   utility (<em>Step 2</em>, in the <em>Creating a bigGenePred</em> section above).</li>
   <li> 
   Save the <a href="hg38.chrom.sizes"><em>hg38.chrom.sizes</em> text file</a> to your computer.
   This file  contains the chrom.sizes for the human hg38 assembly (<em>Step 3</em>, above).</li>
   <li> 
   Save the autoSql file <a href="examples/bigGenePred.as"><em>bigGenePred.as</em></a> to your 
   computer.</li>
   <li> 
   Run the <code>bedToBigBed</code> utility to create the bigGenePred output file (<em>step 4</em>, 
   above):
   <pre><code><B>bedToBigBed</B> -type=bed12+8 -tab -as=bigGenePred.as bigGenePred.txt hg38.chrom.sizes bigGenePred.bb</code></pre></li>
   <li> 
   Place the newly created bigGenePred file (<em>bigGenePred.bb</em>) on a web-accessible server 
   (<em>Step 5</em>, above).</li>
   <li> 
   Construct a track line that points to the bigGenePred file (<em>Step 6</em>, above).</li>
   <li> 
   Create the custom track on the human assembly hg38 (Dec. 2013), and view it in the Genome Browser 
   (<em>step 7</em>, above).</li>
 </ol>
 <a name="Example4"></a>
-<h3>Example &num;4: GTF to BigGenePred </h3>
+<h3>Example &num;4: GTF (or GFF) to BigGenePred </h3>
 <p>In this example, you will convert a GTF file to bigGenePred using command line utilities.
 You will need <em>gtfToGenePred</em>, <em>genePredTobigGenePred</em>,
-and <em>bedToBigBed</em>. You can download utilities from the
+and <em>bedToBigBed</em>. If you would like to convert a GFF file to bigGenePred, you will use
+<em>gff3ToGenePred</em> in place of the <em>gtfToGenePred</em>. You can download utilities from the
 <a href="http://hgdownload.soe.ucsc.edu/admin/exe/">utilities directory</a>.</p>
 <ol>
   <li>
-  Obtain a GTF file using the wget command. Skip this step if you already have a GTF file.
+  Obtain a GTF file using the wget command. Skip this step if you already have a GTF or GFF file.
   <pre><code>wget http://genome.ucsc.edu/goldenPath/help/examples/bigGenePredExample4.gtf</code></pre></li>
   <li>
   Convert the GTF file to genePred extended format using the gtfToGenePred command.
-  <pre><code>gtfToGenePred -genePredExt bigGenePredExample4.gtf example4.genePred</code></pre></li>
+  <pre><code>gtfToGenePred -genePredExt bigGenePredExample4.gtf example4.genePred</code></pre>
+  If you are converting a GFF file, use the gff3ToGenePred command.
+  <pre><code>gff3ToGenePred yourFile.gff example4.genePred </pre></code></li>
   <li>
   Convert the genePred extended file to a pre-bigGenePred text file.
   <pre><code>genePredToBigGenePred example4.genePred bigGenePredEx4.txt</code></pre></li>
   <li>
-  Obtain helper files for the conversion from pre-bigGenePred to binary bigGenePred.
+  Obtain helper files for the conversion to binary bigGenePred. If you are not using hg38, 
+  you can find the chrom.sizes file for your organism in the
+  <a href="http://hgdownload.soe.ucsc.edu">downloads directory</a> under "Full data set".
   <pre><code>wget http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes
 wget http://genome.ucsc.edu/goldenPath/help/examples/bigGenePred.as</code></pre></li>
   <li>
   Convert your text bigGenePred to a binary indexed format.
   <pre><code>bedToBigBed -type=bed12+8 -tab -as=bigGenePred.as bigGenePredEx4.txt hg38.chrom.sizes bigGenePredEx4.bb</code></pre></li>
   <li> 
   Put your binary indexed file in a web-accessible location. See the 
   <a href="https://genome.ucsc.edu/goldenpath/help/hgTrackHubHelp.html#Hosting">hosting section</a> for more information.</li>
   <li>
   To view this example, you can click this into this Browser link. To view your own data, paste the
   link into your web browser and replace the URL after "bigDataUrl=" with a link to your own 
   hosted data.
-<pre> <a href="http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&position=chr19:44905790-44909388&hgct_customText=track&type=bigGenePred&bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePredEx4.bb">http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&position=chr19:44905790-44909388&hgct_customText=track&type=bigGenePred&bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePredEx4.bb</a></pre>
+<pre> <a href="http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&position=chr19:44905790-44909388&hgct_customText=track%20type=bigGenePred%20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePredEx4.bb">http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&position=chr19:44905790-44909388&hgct_customText=track%20type=bigGenePred%20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePredEx4.bb</a></pre>
 You can also add your data in the <a href="../../cgi-bin/hgCustom?db=hg38">custom track management
 page</a>. This allows you to set position, configuration options, and write a more complete 
 desciption. If you want to see codons, you can right click, then click configure codon view or
 set this options using <code>baseColorDefault=genomicCodons</code> as is done below.
 <pre><code>browser position chr19:44905790-44909388 
 track type=bigGenePred baseColorDefault=genomicCodons name="bigGenePred Example Four" description="BGP Made from genePred" visibility=pack bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigGenePredEx4.bb</code></pre></li>
 </ol>
+<p class='text-center'>
+  <img class='text-center' src="../../images/bigGenePredEx4.png" alt="An image of a bigGenePred track on the browser" width="999">
+</p>
 <h2>Sharing your data with others</h2>
 <p>
 If you would like to share your bigGenePred data track with a colleague, learn how to create a URL
-link to your data by looking at <a href="customTrack.html#EXAMPLE6">Example #6</a>.</p>
+link to your data by looking at <a href="customTrack.html#EXAMPLE6">Example #6</a> on the 
+custom track help page.</p>
 
 <h2>Extracting data from bigBed format</h2>
 <p>
 Because the bigGenePred files are an extension of bigBed files, which are indexed binary files, it 
 can be difficult to extract data from them. UCSC has developed the following programs to
 assist in working with bigBed formats, available from the 
 <a href="http://hgdownload.soe.ucsc.edu/admin/exe/">binary utilities directory</a>.</p>
 <ul>
   <li>
   <code>bigBedToBed</code> &mdash; converts a bigBed file to ASCII BED format.</li>
   <li>
   <code>bigBedSummary</code> &mdash; extracts summary information from a bigBed 
   file.</li>
   <li>
   <code>bigBedInfo</code> &mdash; prints out information about a bigBed file.</li>
 </ul>
 <p>
 As with all UCSC Genome Browser programs, simply type the program name (with no parameters) at the 
 command line to view the usage statement.</p>
 
 <h2>Troubleshooting</h2>
 <p>
 If you encounter an error when you run the <code>bedToBigBed</code> program, check your input 
 file for data coordinates that extend past the end of the chromosome. If these are 
 present, run the <code>bedClip</code> program 
 (<a href="http://hgdownload.soe.ucsc.edu/admin/exe/">available here</a>) to remove the problematic 
 row(s) before running the <code>bedToBigBed</code> program. </p>
 
 <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->
 <!DOCTYPE html>
 <!--#set var="TITLE" value="Genome Browser bigGenePred Track Format" -->
 <!--#set var="ROOT" value="../.." -->