539de8508531fab6a834a300830e17b6b8e64afa
dschmelt
  Tue Sep 24 15:11:14 2019 -0700
Adding a first draft of the searchable Track hub documentation, still needing to correct paths #20881

diff --git src/hg/htdocs/goldenPath/help/hubQuickStartSearch.html src/hg/htdocs/goldenPath/help/hubQuickStartSearch.html
new file mode 100755
index 0000000..1287d5f
--- /dev/null
+++ src/hg/htdocs/goldenPath/help/hubQuickStartSearch.html
@@ -0,0 +1,173 @@
+<!DOCTYPE html>
+<!--#set var="TITLE" value="Track Hub Quick Start" -->
+<!--#set var="ROOT" value="../.." -->
+
+<!-- Relative paths to support mirror sites with non-standard GB docs install -->
+<!--#include virtual="$ROOT/inc/gbPageStart.html" -->
+
+<h1>Searchable Track Hub Quick Start Guide</h1> 
+<p>
+Track Hubs are a method of displaying remotely-hosted annotation data quickly and flexibly on any 
+UCSC assembly or, with Assembly Hubs, on remotely-hosted sequence. Making your annotation data searchable
+is an important improvement to the usability of your hub, especially if your annotations are not
+otherwise represented on the Browser. This Quick Start Guide walks through
+making a searchable track hub from a GFF3 file: converting it to genePred, bed, and 
+bigBed formats, then creating a trix search index file. The example uses the new 
+"useOneFile" feature to avoid any need for separate genomes.txt and trackDb.txt files.</p>
+<p>
+<strong>STEP 1: Downloads</strong> In a publicly-accessible directory (such as a university server, 
+CyVerse, or GitHub) copy the hub.txt file using the following command:
+<pre><code>wget http://genome.ucsc.edu/goldenPath/help/examples/ADD PATH HERE/</code></pre>
+<p>
+Alternatively, you can use curl or copy and paste the hub.txt file manually in a text editor:<br>
+<pre><code>curl -O http://genome.ucsc.edu/goldenPath/help/examples/hubDirectory/PATH</code></pre>
+Download some example gene data from Gencode:
+<pre><code>wget ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_31/gencode.v31.basic.annotation.gff3.gz</code></pre>
+<p>
+Finally, you will need to download four Genome Browser utilities to convert the GFF3 file to a 
+binary indexed bigBed format and run the search index command.</p>
+<table>
+  <tr>
+    <th>Utility Name</th>
+    <th>MacOS Download</th>
+    <th>Linux Download</th>
+  </tr>
+  <tr>
+    <td>gff3ToGenePred</td>
+    <td><a href="http://hgdownload.soe.ucsc.edu/admin/exe/macOSX.x86_64/gff3ToGenePred">MacOS Download</a></td>
+    <td><a href="http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/gff3ToGenePred">Linux Download</a></td>
+  </tr>
+  <tr>
+    <td>genePredToBed</td>
+    <td><a href="http://hgdownload.soe.ucsc.edu/admin/exe/macOSX.x86_64/genePredToBed">MacOS Download</a></td>
+    <td><a href="http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/genePredToBed">Linux Download</a></td>
+  </tr>
+  <tr>
+    <td>bedToBigBed</td>
+    <td><a href="http://hgdownload.soe.ucsc.edu/admin/exe/macOSX.x86_64/bedToBigBed">MacOS Download</a></td>
+    <td><a href="http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/bedToBigBed">Linux Download</a></td>
+  </tr>
+  <tr>
+    <td>ixIxx</td>
+    <td><a href="http://hgdownload.soe.ucsc.edu/admin/exe/macOSX.x86_64/ixIxx">MacOS Download</a></td>
+    <td><a href="http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/ixIxx">Linux Download</a></td>
+  </tr>
+</table>
+
+<p>
+<strong>STEP 2: Format Data</strong> 
+To format the data, first make the downloaded utilities executable:
+<pre><code>chmod +x gff3ToGenePred genePredToBed bedToBigBed ixIxx</code></pre>
+Next, convert the GFF3 file to genePred format, using the gene symbol instead of the ID number, and sorting by chromosome and position:
+<pre><code>gff3ToGenePred -geneNameAttr=gene_name gencode.v31.basic.annotation.gff3.gz stdout \
+| sort -k2,2 -k4n,4n > gencode.v31.basic.genePred </code></pre>
+
+Convert that genePred file to a bed file with the following command:
+<pre><code>genePredToBed gencode.v31.basic.genePred gencode.v31.basic.bed</code></pre>
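+<p>
+bedToBigBed expects its input to be sorted by chromosome and then by start position. As an 
+optional sanity check that is not part of the original workflow, the sketch below uses the 
+standard <code>sort -c</code> option to test the ordering without rewriting the file. The function 
+name <code>check_bed_sorted</code> is hypothetical, and forcing the C locale is an assumption 
+about the byte-wise chromosome ordering that bedToBigBed looks for.</p>
+<pre><code># Report whether a bed file is ordered by chromosome (column 1), then start (column 2).
+check_bed_sorted() {
+  # sort -c only checks the order; it exits non-zero at the first out-of-order line.
+  if LC_ALL=C sort -c -k1,1 -k2,2n "$1"; then
+    echo "$1 is sorted"
+  else
+    echo "$1 is NOT sorted; re-sort with: LC_ALL=C sort -k1,1 -k2,2n"
+  fi
+}
+check_bed_sorted gencode.v31.basic.bed</code></pre>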
+
+Compress and index that bed file into a bigBed format, adding the extraIndex to allow name 
+(gene symbol) searches:
+<pre><code>bedToBigBed -extraIndex=name gencode.v31.basic.bed https://genome.ucsc.edu/goldenPath/help/hg38.chrom.sizes gencode.v31.basic.bb</code></pre>
+
+<strong>STEP 3: Create Search Index</strong> 
+This step is only necessary if you want to link your annotation names to anything other than 
+what was specified with the extraIndex option, in this case name (gene symbol).
+We will make an index file which will link one identifier in the file with search terms
+composed of gene IDs and partial versions of the gene symbols. This is the input file for the
+search indexing command:
+<pre><code>cat gencode.v31.basic.genePred | awk '{print $1, substr($12, 1, 3), substr($12, 1, 4), substr($12, 1, 5), substr($12, 1, 6), substr($12, 1, 7), substr($12, 1, 8)}' > index.txt</code></pre>
+To examine this file or to skip this step, you can click the following link. Note that the first
+word on each line is the key referenced in the bed file, and the following words are the search 
+terms that will lead to the location of that key.
+<a href="PATH TO index.txt">index.txt</a>
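+<p>
+To make the key-then-terms layout concrete, here is a minimal sketch that builds one index line 
+from a made-up record. The transcript ID <code>ENST_EXAMPLE</code>, the gene symbol 
+<code>EXAMPLEGENE</code>, and the <code>toy</code> file names are all hypothetical, and only two 
+prefix lengths are generated to keep the output short. It assumes awk's 1-based 
+<code>substr</code> indexing.</p>
+<pre><code># Hypothetical 12-column genePred-style record; only columns 1 and 12 are used.
+printf 'ENST_EXAMPLE\tchr1\t+\t100\t200\t100\t200\t1\t100,\t200,\t0\tEXAMPLEGENE\n' > toy.genePred
+# Print the key, then the 3- and 4-letter prefixes of the gene symbol.
+awk '{print $1, substr($12, 1, 3), substr($12, 1, 4)}' toy.genePred > toy_index.txt
+cat toy_index.txt
+# ENST_EXAMPLE EXA EXAM</code></pre>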
+Finally, you will make the index file (.ix) and the index of that index (.ixx), which helps
+return search results quickly even in large files.
+<pre><code>ixIxx index.txt out.ix out.ixx</code></pre>
+
+<strong>STEP 4: View and Search</strong> Enter the URL to your hub on the My Hubs tab of the 
+<a href="../../cgi-bin/hgHubConnect#unlistedHubs">Track Data Hubs</a> page. Alternatively, you can
+enter your hub.txt URL in the following URL:
+LINK
+If you would like to look at an already-made example, click the following link:
+LINK
+
+IMAGE
+
+Once your hub displays, you should be able to type in a gene symbol or ENST ID and scroll down the results
+page until you see your search results.
+
+
+<p>
+If you are having problems, be sure all your files are publicly-accessible and that your server
+accepts byte-ranges. You can check using the following command to verify that &quot;Accept-Ranges: bytes&quot; displays:</p>
+<pre><code>curl -IL http://yourURL/hub.txt</code></pre>
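+<p>
+If you would rather not read the headers by eye, the check can be scripted. The sketch below is 
+one possible approach: the helper name <code>check_ranges</code> is hypothetical, and 
+<code>http://yourURL/hub.txt</code> remains a placeholder for your own hub address.</p>
+<pre><code># Read HTTP headers on stdin and report whether byte ranges are advertised.
+check_ranges() {
+  if grep -qi '^accept-ranges:[[:space:]]*bytes'; then
+    echo "byte ranges supported"
+  else
+    echo "byte ranges NOT advertised"
+  fi
+}
+# -s silences the progress meter so only the headers reach the helper.
+curl -sIL http://yourURL/hub.txt | check_ranges</code></pre>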
+
+<p>
+Note that the Browser waits 5 minutes before checking for any changes to these files. <strong>When 
+editing hub.txt, genomes.txt, and trackDb.txt, you can shorten this delay by adding 
+<code>udcTimeout=1</code> to your URL.</strong> For more information, see the 
+<a href="hgTrackHubHelp.html#Debug" target="_blank">Debugging and Updating Track Hubs</a> section of
+the <a href="hgTrackHubHelp.html" target="_blank">Track Hub User Guide</a>.</p>
+<p>
+<strong>For more detailed instructions on setting up a hub, refer to the 
+<a href="hgTrackHubHelp.html#Setup" target="_blank">Setting Up Your Own Track Hub</a> section of the
+Track Hub User Guide.</strong></p>
+
+
+<!-- ========== hub.txt ============================== -->
+<a name="hub.txt"></a>
+<h2>Understanding hub.txt with useOneFile</h2>
+<p>
+The hub.txt file is a configuration file with names, descriptions, and paths to other files.
+The example below uses the setting "useOneFile on" to indicate that all the settings and paths
+appear in only the hub.txt file as opposed to having two additional settings files (genomes.txt and
+trackDb.txt).</p>
+<br>
+<p>
+The most important settings to make the hub searchable appear in the third section, in what would
+formerly be the trackDb.txt file. The settings searchIndex and searchTrix indicate which fields
+are indexed in the bigBed file and where to find the .ix file, respectively.</p>
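+<p>
+As an illustration only, a hub.txt for the files built in this guide might look like the sketch 
+below. The hub name, track name, labels, and email address are placeholders. The 
+<code>bigBed 12</code> type assumes that genePredToBed produced a 12-column bed file, and the 
+relative paths assume that gencode.v31.basic.bb, out.ix, and out.ixx sit in the same directory 
+as hub.txt.</p>
+<pre><code>hub gencodeSearchExample
+shortLabel Gencode Search
+longLabel Example searchable hub built from the Gencode v31 basic annotation
+useOneFile on
+email myEmail@address
+
+genome hg38
+
+track gencodeV31Basic
+type bigBed 12
+bigDataUrl gencode.v31.basic.bb
+shortLabel Gencode v31
+longLabel Gencode v31 basic gene annotation
+visibility pack
+searchIndex name
+searchTrix out.ix</code></pre>
+<p>
+The general form of these settings is:</p>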
+
+<pre><code><strong>hub</strong> <em>MyHubsNameWithoutSpaces</em>
+<strong>shortLabel</strong> <em>My Hub's Name</em>
+<strong>longLabel</strong> <em>Name up to 80 characters versus shortLabel limited to 17 characters</em>
+<strong>genomesFile</strong> <em>genomes.txt</em>
+<strong>email</strong> <em>myEmail@address</em>
+<strong>descriptionUrl</strong> <em>aboutMyHub.html</em>
+<strong>useOneFile</strong> <em>on</em>
+<br>
+<strong>genome</strong> <em>assembly_database_2</em>
+<br>
+<strong>track</strong> <em>uniqueNameNoSpacesOrDots</em>
+<strong>type</strong> <em>track_type</em>
+<strong>bigDataUrl</strong> <em>track_data_url</em>
+<strong>shortLabel</strong> <em>label 17 chars</em>
+<strong>longLabel</strong> <em>long label up to 80 chars</em>
+<strong>visibility</strong> <em>hide/dense/squish/pack/full</em>
+<strong>searchIndex</strong> <em>field,field2</em>
+<strong>searchTrix</strong> <em>path to .ix file</em></code></pre>
+
+
+<h2>Additional Resources</h2>
+<ul>
+  <li>
+  <strong><a href="hgTrackHubHelp.html" target="_blank">Track Hub User
+Guide</a></strong></li> 
+  <li>
+  <strong><a href="trackDb/trackDbHub.html" target="_blank">Track Database (trackDb) Definition 
+  Document</a></strong></li> 
+  <li>
+  <strong><a href="http://genomewiki.ucsc.edu/index.php/Assembly_Hubs" target="_blank">Assembly Hubs
+   Wiki</a></strong></li>
+  <li>
+  <strong><a href="http://genomewiki.ucsc.edu/index.php/Public_Hub_Guidelines"
+  target="_blank">Public Hub Guidelines Wiki</a></strong></li>
+  <li>
+  <strong><a href="hubQuickStartGroups.html" target="_blank">Quick Start Guide to Organizing Track 
+  Hubs into Groupings</a></strong></li>
+  <li>
+  <strong><a href="hubQuickStartAssembly.html" target="_blank">Quick Start Guide to Assembly Track 
+  Hubs</a></strong></li>
+</ul>
+
+<!--#include virtual="$ROOT/inc/gbPageEnd.html" -->