a080d8c22fcd1ac8636dcd6f08b064214a416903
dschmelt
  Tue Apr 13 12:17:28 2021 -0700
Edits to new doc page #25532

diff --git src/hg/htdocs/goldenPath/help/trackDbIndexBb.html src/hg/htdocs/goldenPath/help/trackDbIndexBb.html
index 197f88e..9cdde9b 100755
--- src/hg/htdocs/goldenPath/help/trackDbIndexBb.html
+++ src/hg/htdocs/goldenPath/help/trackDbIndexBb.html
@@ -1,156 +1,156 @@
 <!DOCTYPE html>
-<!--#set var="TITLE" value="Genome Browser Trix Indices" -->
+<!--#set var="TITLE" value="trackDbIndexBb | UCSC Genome Browser" -->
 <!--#set var="ROOT" value="../.." -->
 
 <!-- Relative paths to support mirror sites with non-standard GB docs install -->
 <!--#include virtual="$ROOT/inc/gbPageStart.html" -->
 
-<h1>trackDbIndexBb</h1> 
+<h1>trackDbIndexBb - A utility to support hideEmptySubtracks on Composite Tracks</h1> 
 <p>
 When there are many subtracks in a composite view, it may be useful to limit the 
 display to only those with data in the current viewing window. The trackDb setting, 
 <a target="_blank" 
 href="/goldenPath/help/trackDb/trackDbHub.html#hideEmptySubtracks">hideEmptySubtracks</a>,
 enables this behavior. This track setting produces a checkbox on the track configuration 
 page allowing the user to enable or disable this feature. If it is configured to 'on', 
 then the feature will be on by default (the checkbox is checked). To take 
 full advantage of this setting it is helpful, though not always required, to index the 
 underlying bigBed files, using the <code>trackDbIndexBb</code> utility. This utility
 creates multibed/index files containing the coordinates where the tracks intersect,
 expediting data lookup. There are two instances in which these files are helpful or required:</p>
 <ul>
 <li><b>(helpful)</b> In large composites (dozens or hundreds of tracks), especially when subtrack
 data are sparse, the index files will provide a substantial performance improvement.</li>
 <li><b>(required)</b> For building track associations. An example of this is when it is desired
 to display peak and signal tracks together. Because <code>hideEmptySubtracks</code> works
 only on bigBed tracks (peaks), associated tracks (such as bigWig signals) can be designated
 to be displayed alongside the bigBeds with data.</li>
 </ul>
 
 <p>
 To build the index files, first download the <code>trackDbIndexBb</code> utility. 
 For more information on downloading our command line utilities, see these 
-<a href="http://hgdownload.soe.ucsc.edu/downloads.html#source_downloads">instructions</a>.</p> 
+<a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads">instructions</a>.</p>
 <p>
 There are three other programs needed to run <code>trackDbIndexBb</code>. Two of them,
 <code>bedToBigBed</code> and <code>bigBedToBed</code>, can be found
-in the same <a href="http://hgdownload.soe.ucsc.edu/downloads.html#source_downloads">
+in the same <a href="http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads">
 downloads directory</a>.
 The final dependency, <b>bedtools</b>, can be found on the <a target="_blank" 
 href="https://bedtools.readthedocs.io/en/latest/">bedtools site</a>.</p>
 
 <h2>Parameters</h2>
 <p>
 Kent utilities can be run with no parameters to display a usage message. 
 Additionally, <code>trackDbIndexBb</code> can be passed the
 <code>-h</code> flag to display a more verbose help message.</p>
 <pre>
 ./trackDbIndexBb
 ./trackDbIndexBb -h
 </pre>
 <p> 
 Below is a short description of the parameters:</p>
 <ul>
 <li><b>trackName</b> | The name of the composite containing the bigBed tracks
 to be indexed. Higher-level composite names may be used to build track associations.
-This means that all tracks in the heirarchy <em>below</em> trackName will be searched 
+This means that all tracks in the hierarchy <em>below</em> trackName will be searched 
 for indexing and associations.</li>
 <li><b>raFile</b> | The location of the trackDb.ra file containing the bigDataUrls
 of the bigBeds to be indexed.</li>
 <li><b>chromSizes</b> | Location of the chrom.sizes file. This file contains the names
 and sizes of the chromosomes for the working assembly. This can be generated from the 2bit genome, 
 downloaded from the respective assembly on our <a target="_blank" 
 href="http://hgdownload.soe.ucsc.edu/downloads.html">download server</a>, or fetched using the 
-<code>fetchChromSizes</code> utility found in the same directory as <code>trackDbIndexBb</code>.</li>
+<code>fetchChromSizes</code> utility found in the same directory as <code>trackDbIndexBb</code>.
+</li>
 <li><b>-o --out</b> | (Optional) Path to the output directory where the resulting files will 
 be placed. Defaults to current directory.</li>
 <li><b>-p --pathTools</b> | (Optional) Location where the dependent programs can be found. 
 Will automatically check current directory and user's path. <code>trackDbIndexBb</code> 
 has three dependencies listed above. Note that bedtools must be downloaded from an 
 external group.</li>
 <li><b>-n --noDelete</b> | (Optional) Keep intermediary multiBed file.</li>
 <li><b>-m --metaDataVar</b> | (Optional) Used only when building track associations. The variable
 designated here will be a trackDb variable which can be used to associate tracks. Default value is
 <em>subGroups</em>, though <em>metaData</em> is also commonly used. See example below for more 
 information.</li>
 <li><b>-s --subGroupRemove</b> | (Optional) Used only when building track associations. In 
 conjunction with <b>--metaDataVar</b>, this variable is used to build track associations. The
 value designated will be excluded as a matching requirement from the trackDb parameter. By 
 default, <em>view</em> is used, though this can change depending on data and organization. 
 See example below for more information.</li>
 </ul>
 
-<h2>Example 1</h2>
+<h2>Example 1 - Building Index Files</h2>
 <p>
-In this first example, we are looking to build index files for a composite track
-containing 12 bigBed files. Index files are not required for performance
-reasons for a track with so few files, however, the steps would be the same
+In this first example, we will build index files for a composite track
+containing 12 bigBed files. Index files hardly improve performance
+on a track with so few files; however, the steps would be the same
 on a larger track.</p>
 <p>
 First, we can take a look at the header stanza for the composite. The complete
 trackDb.ra file is available <a target="_blank" 
 href="/goldenPath/help/examples/trackDbIndexBb/smallExampleTrackDb.ra">here</a>.</p>
 <pre>
 track problematic
 shortLabel Problematic Regions
 longLabel Problematic Regions for NGS or Sanger sequencing or very variable regions
 compositeTrack on
 hideEmptySubtracks off
 group map
 visibility hide
 type bigBed 3 +
 </pre>
 <p>
 We can see that the <code>hideEmptySubtracks</code> setting is already enabled,
-set off by default. The index files we are building are not required, but instead
-improve the performance of the feature. The other key information here is
+but set to off by default. The index files we are building are not required, but they
+improve the performance of the feature. The stanza also gives us
 the composite track name, <em>problematic</em>. We will want to pass this
 as our <b>trackName</b> variable.</p>
 <p>
 The other two required parameters are the path to the trackDb.ra file, 
 and the chrom.sizes file. If we assume both of those are in the current directory,
 and that the required dependencies are present in the path, we can run
 <code>trackDbIndexBb</code> as such:</p>
 <pre>
 ./trackDbIndexBb problematic smallExampleTrackDb.ra hg19.chrom.sizes
 </pre>
 <p>
-This will result in two files being generated in the current directory:</p>
+This will generate two files in the current directory:</p>
 <pre>
 problematic.multiBed.bb
 problematic.multiBedSources.tab
 </pre>
 <p>
 We can then enable the use of these index files for <code>hideEmptySubtracks</code> by
-adding the following two lines to our trackDb.ra file, adjusting the path to the file
-if needed:</p>
+adding the following two lines to our trackDb.ra file, adjusting the path if needed:</p>
 <pre>
 hideEmptySubtracks off
 hideEmptySubtracksMultiBedUrl problematic.multiBed.bb
 hideEmptySubtracksSourcesUrl problematic.multiBedSources.tab
 </pre>
 
-<h2>Example 2</h2>
+<h2>Example 2 - Creating Track Associations</h2>
 <p>
 In this longer example, we are looking to build index files with track associations between 
-DNase-seq peak and signal tracks. There are 2 bigBed peak tracks, and 4 bigWig signal tracks.
+DNase-seq peak and signal tracks. There are 2 bigBed peak tracks and 4 bigWig signal tracks.
 The complete trackDb for the example can be found <a target="_blank" 
 href="/goldenPath/help/examples/trackDbIndexBb/exampleTrackDb.ra">here</a>.</p> 
 <p>
 Looking at the
-top level stanza, we see that the track is a composite track with two views, one for
+top level stanza, we see that this composite track has two views, one for
 peaks and one for signals. The data are associated with a few different subGroups:</p>
 <pre>
 track uniformDnase
 subGroup4 lab Lab Duke=Duke UW=UW UWDuke=UW-Duke
 subGroup3 view View Peaks=Peaks Signal=Signal
 subGroup2 cellType Cell_Line GM12878=GM12878 H1-hESC=H1-hESC
 </pre>
 <p>
 To help us decide how best to make these associations, let us look at the relevant parts of
 the peak and signal stanzas we would like to associate:</p>
 <pre>
                 track wgEncodeUWDukeDnaseGM12878FdrPeaks
                 type bigBed 6 +
                 parent uniformDnasePeaks on
                 bigDataUrl wgEncodeUWDukeDnaseGM12878.fdr01peaks.hg19.bb
@@ -160,47 +160,48 @@
                 track wgEncodeDukeDnaseGM12878FdrSignal
                 type bigWig
                 parent uniformDnaseSignal on
                 bigDataUrl wgEncodeOpenChromDnaseGm12878Aln_5Reps.norm5.rawsignal.bw
                 subGroups view=Signal tier=t1 cellType=GM12878 lab=Duke
                 metadata cell=GM12878 lab=Duke
 
                 track wgEncodeUWDnaseGM12878FdrSignal
                 type bigWig
                 parent uniformDnaseSignal on
                 bigDataUrl wgEncodeUwDnaseGm12878Aln_2Reps.norm5.rawsignal.bw
                 subGroups view=Signal tier=t1 cellType=GM12878 lab=UW
                 metadata cell=GM12878 lab=UW
 </pre>
 <p>
-The first track is the bigBed peaks track, part of the peaks view, and the second
-and third are bigWig signal tracks, part of the signal view. <code>hideEmptySubtracks</code>
+The first track is the bigBed peaks track (peaks view) and the second
+and third are bigWig signal tracks (signal view). <code>trackDbIndexBb</code>
 allows for two optional variables to build track associations. The first, <b>-m --metaDataVar</b>,
 designates which trackDb variable will be used to build the association. In this example, the peaks
-are called on a combination of the signal tracks for each cell, because of this, we would like to
+are determined using a combination of the signal tracks; therefore, we would like to
 display both of the signal tracks whenever the peak track has data.</p>
 <p>
-At this point it is important to explain how <code>trackDbIndexBb</code> makes track associations.
+At this point, it is important to explain how <code>trackDbIndexBb</code> makes track associations.
 It will look at the stanza variable line designated by <b>-m --metaDataVar</b>, then look for
-identical matching lines in other stanzas. Since at least one parameter within the line will usually 
+identical matching lines in other stanzas. Since at least one parameter within the line will usually
 differ, such as the designation between peak and signal, <b>-s --subGroupRemove</b> can be used 
 to strip out one of the parameters in the line.</p>
 <p>
-The <b>subGroups</b> parameter could be used. However, we see that there are two variables that 
-differ between the peak and signal stanzas, <em>view</em> and <em>lab</em>. We would have to strip 
-both of those to have matching parameter variables and build an association. On the 
-other hand, we could use the <b>metaData</b> parameter. This parameter associates the tracks 
+The <b>subGroups</b> parameter could be used. However, we see that the two variables that 
+differ between the peak and signal stanzas are <em>view</em> and <em>lab</em>. 
+We would have to strip 
+both of those to have matching parameter variables and build an association. Alternatively, 
+we could use the <b>metaData</b> parameter. This parameter associates the tracks 
 by the <em>cell</em>, with only the <em>lab</em> variable differing. This would be the best 
 choice as only a single parameter would have to be stripped, <em>lab</em>, as opposed to two,
 <em>lab</em> and <em>view</em>, to have matching peak and signal parameters for related tracks.</p>
 <p>
 Now that we know which parameter we would like to use to build associations, we need to use the
 second optional parameter, <b>-s --subGroupRemove</b>, to tell <code>trackDbIndexBb</code>
 which variables to strip out in making the association. In this case, we would like to
 keep the <em>cell</em> variable, but strip the <em>lab</em>. This means that <em>lab</em> 
 will be the parameter passed. In this way, associations will be made between any tracks that 
 match the contents of their <b>metaData</b> parameter once the <em>lab</em> variable has been 
 stripped out.</p>
 <p>
 Now that we have chosen our parameters, we will run the utility -- assuming our chrom.sizes file,
 our trackDb.ra file, and all the supporting programs (bedToBigBed, bigBedToBed, bedtools) are
 present in the current directory. We will also choose the output to be the current directory:</p>