fbeaf51d42a4e9306db9a5761730af8189300a7e max Fri Nov 24 06:59:47 2023 -0800 chinafying all other video links, refs #32535 diff --git src/hg/htdocs/goldenPath/help/gbib.html src/hg/htdocs/goldenPath/help/gbib.html index 596d9d2..57671d2 100755 --- src/hg/htdocs/goldenPath/help/gbib.html +++ src/hg/htdocs/goldenPath/help/gbib.html @@ -1,949 +1,953 @@ <!DOCTYPE html> <!--#set var="TITLE" value="GBiB" --> <!--#set var="ROOT" value="../.." --> <!-- Relative paths to support mirror sites with non-standard GB docs install --> <!--#include virtual="$ROOT/inc/gbPageStart.html" --> <h1> Genome Browser in a Box User's Guide</h1> <h2>Contents</h2> <h6><a href="#What">What is Genome Browser in a Box (GBiB)?</a></h6> <h6><a href="#GetStarted">Getting Started: Setting up Genome Browser in a Box</a></h6> <h6><a href="#UsingGbib">Using Genome Browser in a Box</a></h6> <h6><a href="#GbibMirroring">Improving speed and performance</a></h6> <h6><a href="#UpdatingGbib">Updating Genome Browser in a Box</a></h6> <h6><a href="#YourTracks">Viewing your own data</a></h6> <h6><a href="#CustomizingGbib">Customizing your GBiB</a></h6> <h6><a href="#SharingGBIB">Sharing Genome Browser in a Box with others</a></h6> <h6><a href="#UsrAcct">User accounts and sessions</a></h6> <h6><a href="#Trouble">Troubleshooting common problems</a></h6> <h6><a href="#Commands">Genome Browser in a Box commands</a></h6> <h6><a href="#License">Licensing information</a></h6> <hr> <p> <strong>Other resources:</strong> <ul><strong> <li><a class="toc" href="https://genome-blog.gi.ucsc.edu/blog/genome-browser-in-a-box-gbib-origins/" target="_blank">UCSC blog post about GBiB</a></li> +<!--#if expr="${SERVER_NAME} = /-china/" --> + <li><a class="toc" href="../../../videos/1DzZZgB1gvQ.mp4" target="_blank">GBiB introductory video</a></li> +<!--#else --> <li><a class="toc" href="https://www.youtube.com/watch?v=1DzZZgB1gvQ" target="_blank">GBiB introductory video</a></li> +<!--#endif --> </strong></ul> <a name="What"></a> 
<h2>What is Genome Browser in a Box?</h2> <p> Genome Browser in a Box (GBiB) is a "virtual machine" of the entire UCSC Genome Browser website that is designed to run on most PCs (Windows, Mac OSX or Linux). GBiB allows you to access much of the UCSC Genome Browser's functionality from the comfort of your own computer. It is particularly directed at individuals who want to use the Genome Browser toolset to view protected data. If your data are not human sequencing reads, they usually do not fall under medical data privacy rules, and you should most likely use <a href="https://genome.ucsc.edu/goldenpath/help/hubQuickStartAssembly.html">assembly hubs</a> or <a href="https://genome.ucsc.edu/goldenpath/help/hgTrackHubHelp.html">track hubs</a> rather than GBiB. See our <a href="https://genome.ucsc.edu/goldenPath/help/mirror.html#considerations-before-installing-a-genome-browser">mirror page</a> for a discussion of the advantages and disadvantages of the different methods to customize your genome data display.</p> <h3>Differences between GBiB and the Genome Browser</h3> <p> While GBiB and the Genome Browser are similar in many ways, there are key differences. In particular, GBiB makes it much easier to visualize sensitive or protected data. Prior to the introduction of GBiB, it was necessary to upload your data to the UCSC Genome Browser website or place the data files on a publicly accessible web server and supply the URL to UCSC in order to view your own data with the Genome Browser. GBiB removes these requirements: none of your data needs to be uploaded to the UCSC servers, allowing you to use the Genome Browser on personal datasets in situations where it's infeasible to load the data onto a public web server.</p> <p> Rather than installing the entire UCSC genome annotation database (several terabytes of data), GBiB instead depends upon remote connections to various UCSC servers for much of its functionality and data. 
It connects to the UCSC <a href="http://hgdownload.soe.ucsc.edu/downloads.html" target="_blank">download server</a> to obtain genomic sequences, liftOver files, and many of the other large data files, and connects to one of UCSC's <a href="mysql.html">public MariaDB servers</a> to download data displayed by the various annotation tracks. A few Genome Browser tracks (DECIPHER and LOVD Variants) are unavailable on the UCSC public MariaDB servers due to agreements with the data distributors, and thus are unavailable for use with GBiB.</p> <p> The majority of protected data use in the research community currently focuses on the human genome, primarily the hg19 (GRCh37) assembly, with a growing body of annotation on the newer hg38 (GRCh38) assembly. As a result, GBiB is currently optimized for use with the hg19 assembly. Many other recent genome assemblies can also be viewed, but access may be slower than for optimized assemblies. Access speed may also be impacted by your connection distance from the UCSC server. To improve performance in these situations, GBiB includes a simple tool that allows you to download ("mirror") selected genome annotation tracks to your machine. You can find more information about this tool in the <a href="#GbibMirroring">Improving Speed and Performance</a> section.</p> <p> For more background on GBiB, see:</p> <p> Haeussler M, Raney BJ, Hinrichs AS, Clawson H, Zweig AS, Karolchik D, Casper J, Speir ML, Haussler D, Kent WJ. <a href="https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btu712" target="_blank"> Navigating protected genomics data with UCSC Genome Browser in a Box</a>. <em>Bioinformatics</em>. 2015 Mar 1;31(5):764-6. 
PMID: <a href="https://www.ncbi.nlm.nih.gov/pubmed/25348212" target="_blank">25348212</a>; PMC: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4341066/" target="_blank">PMC4341066</a> </p> <a name="GetStarted"></a> <h2>Getting Started: Setting up Genome Browser in a Box</h2> <h3>System requirements</h3> <p> GBiB will run on most modern PCs and major operating systems that meet these basic requirements: <ul> <li> The computer must support virtualization (common for most PCs sold after 2010).</li> <li> A compatible version of the <a href="https://www.virtualbox.org/" target="_blank">VirtualBox</a> software (version 4.3.6 or higher) must be installed. This software is free to use in many situations. See the VirtualBox wiki for <a href="https://www.virtualbox.org/wiki/Licensing_FAQ" target="_blank">licensing terms and conditions</a> and <a href="https://www.virtualbox.org/wiki/Downloads" target="_blank">installation</a> instructions. You must have administrator privileges to install VirtualBox on your computer.</li> <li> The computer hard disk must have at least 20 GB of free space (more if you plan to mirror many tracks).</li> <li> Your network firewall must allow outgoing connections to the following servers and ports: <ul> <li>MariaDB connections, used to load tracks not local to your computer: <ul> <li>US server: Port 3306 on genome-mysql.soe.ucsc.edu (128.114.119.174)</li> <li>European server: Port 3306 on genome-euro-mysql.soe.ucsc.edu (129.70.40.120)</li> </ul> </li> <li>Rsync, used to download track data: <ul> <li>US server: TCP port 873 on hgdownload.soe.ucsc.edu (128.114.119.163)</li> <li>European server: TCP port 873 on hgdownload-euro.soe.ucsc.edu (129.70.40.99)</li> </ul></li> <li>Download HTML descriptions on the fly: <ul> <li>US server: TCP port 80 on hgdownload.soe.ucsc.edu (128.114.119.163)</li> <li>European server: TCP port 80 on hgdownload-euro.soe.ucsc.edu (129.70.40.99)</li> </ul> </li> </ul> </ul> <h3>Installation</h3> <ol> <li> Confirm 
that your system meets the above requirements.</li> <li> Download GBiB from the <a href="https://genome-store.ucsc.edu/" target="_blank">Genome Browser store</a> (see <a href="#License">Licensing Information</a>). Due to the large size of the <em>gbib.zip</em> product file, the download time may range from 30 minutes to a few hours depending on your Internet connection speed and distance from UCSC.</li> <li> Extract the contents of the <em>gbib.zip</em> file (three files). If desired, the extracted files can be moved to a different directory on your computer, as long as all three files reside in the same directory and are not renamed. <strong>Extraction notes:</strong> On OSX, do not use the command line tool "unzip" to extract the files; instead double-click on the <em>gbib.zip</em> file in the Finder window. On Windows, you must use a third-party tool such as <a href="http://www.7-zip.org/" target="_blank">7zip</a> or <a href="http://www.rarlab.com/" target="_blank">WinRAR</a> to extract the <em>gbib.zip</em> contents, as the file size exceeds the capabilities of the standard Windows extraction tool.</li> <li> Add GBiB to VirtualBox: <ul> <li> Double-click on the <em>browserbox.vbox</em> file extracted from <em>gbib.zip</em> <strong>OR</strong></li> <li> Start VirtualBox, select <em>Machine >> Add</em>, and open the file <em>browserbox.vbox</em>.<br> <img src="../../images/gbib-add.png" alt="Adding GBiB to VirtualBox" width="570" height="162"><br></li> </ul></li> </ol> <p> If VirtualBox or the Genome Browser displays an error message during the installation process, consult the <a href="#Trouble">Troubleshooting</a> section.</p> <a name="UsingGbib"></a> <h2>Starting and using Genome Browser in a Box</h2> <h3>Starting GBiB</h3> <p> To start using GBiB with VirtualBox: <ol> <li> Start VirtualBox.</li> <li> Select "browserbox" on the VirtualBox left-hand menu (see "<a href="#GetStarted">Getting Started</a>" to add GBiB to VirtualBox).</li> <li> Click the "Start" 
button in the VirtualBox toolbar.<br> <img src="../../images/gbib-start.png" alt="VirtualBox manager start" width="687" height="364"><br><br> This will open a black GBiB terminal window with a Linux command line interface:<br> <img src="../../images/gbib-terminal.png" alt="GBiB terminal window" width="627" height="478"><br> <br> In most cases, GBiB will auto-update itself after starting for the first time. This update will often include database tables for RefSeq Genes and other tables that are frequently updated at UCSC. The auto-update may take several minutes to complete and the progress will be shown in the terminal window.</li> <li> Open an Internet browser window to <a href="http://127.0.0.1:1234" target="_blank">127.0.0.1:1234</a>. We recommend using this URL instead of http://localhost:1234, because most Internet browsers do not send cookies to "http://localhost", and cookies are required to save your browser configuration between sessions. You may want to bookmark 127.0.0.1:1234 for quick future access.</li> </ol> <p> If you have correctly set up GBiB, your Internet browser should display the Genome Browser home page. From this page, you can start using GBiB as you would the public Genome Browser website. Consult the UCSC Genome Browser <a href="hgTracksHelp.html">User's Guide</a> for introductory information on using the Genome Browser tools.</p> <a name="GbibStop"></a> <h3>Stopping GBiB</h3> <p> To shut down the GBiB machine (for example, to change configuration options): <ol> <li> Close the GBiB terminal window.</li> <li> Select "Send the shutdown signal".</li> <li> Confirm by clicking "OK".<br> <img src="../../images/gbib-powerOff.png" alt="GBiB power off" width="526" height="244"></li> </ol> <a name="GbibMirroring"></a> <h2>Improving speed and performance</h2> <p> Under certain circumstances the speed and response time of GBiB may be less than optimal. 
This section offers suggestions for improving the performance of your GBiB installation.</p> <h3>Increasing GBiB RAM</h3> <p> As a first measure for performance improvement, try increasing the amount of RAM that GBiB is allowed to use. By default, this limit is set to 1 GB (1024 MB). If your machine has enough RAM installed, you may want to increase the GBiB RAM to 2, 4, or even 8 GB. This will usually improve the system responsiveness. To increase the RAM limit: <ol> <li> Shut down the GBiB machine (<a href="#GbibStop">Stopping GBiB</a>).</li> <li> On the VirtualBox Manager window, click <em>Settings >> System</em>.</li> <li> Move the memory slider to the value of your choice, then click "OK".<br> <img src="../../images/gbib-RAM.png" alt="GBiB RAM" width="579" height="490"></li> </ol> <h3>Using the GBiB mirror tool</h3> <p> If your GBiB performance is still slow after increasing the RAM, you may be located too far from UCSC. The load time of default tracks ranges from a few seconds on the west coast of the United States to as much as 7 seconds from Europe. In this situation, you may want to use the GBiB mirror tool to download ("mirror") tracks to your machine, which will greatly increase the access speed for those tracks.</p> <p> To use the mirror tool: <ol> <li> Click <em>Tools >> Mirror Tracks</em> in the Genome Browser menu. The first time you open the mirror tool, this page may take a while to load.</li> <li> Select the tracks that you typically use by checking the boxes next to the track names.</li> <li> Click <em>Download</em>.</li> </ol> <p> In addition to downloading entire track sets, you can also download individual subtracks. The file size of each track is listed next to the track name. If you are unsure of which tracks to select, we recommend the option <em>Default tracks with conservation tables, but no alignments</em>. 
When downloading large tracks, keep in mind that you cannot delete these tracks and the related data from GBiB once you have downloaded them. If you find that you've started downloading the wrong track or a track that is too large for your machine, you can cancel the download at any point by clicking <em>Cancel Download Now</em>.</p> <p> Depending on your network bandwidth, the download can take from several minutes to a few hours over a DSL line. <strong>During the download</strong>, the file <em>gbib-data.vdi</em> will grow in size, and <strong>you will not be able to use GBiB</strong>. Once the download is complete, the default tracks should load in less than three seconds for a typical genomic position.</p> <p> From the GBiB command line, you can also run rsync directly to download files of interest. For example, to mirror all of the GENCODE tracks on hg19, run one of the following two commands, depending on whether you want the North American or the European hgdownload server: <pre> <code>sudo rsync hgdownload.soe.ucsc.edu::mysql/hg19/wgEncodeGencode* /data/mysql/hg19/.</code> </pre> <pre> <code>sudo rsync hgdownload-euro.soe.ucsc.edu::mysql/hg19/wgEncodeGencode* /data/mysql/hg19/.</code> </pre> <p> Each of the above commands will rsync all of the hg19 files on the UCSC hgdownload server that start with wgEncodeGencode into the hg19 directory on your GBiB. There are some supporting files in a hgFixed directory, such as for the publication tracks, that could be mirrored with such commands.</p> <p> Here is another example for hg38 where the following commands would download all the supporting encRegTfbs and factorbook tables for the Transcription Factor ChIP-seq Clusters track: <pre> <code>sudo rsync hgdownload.soe.ucsc.edu::mysql/hg38/encRegTfbs* /data/mysql/hg38/.</code> </pre> <pre> <code>sudo rsync hgdownload.soe.ucsc.edu::mysql/hg38/factor* /data/mysql/hg38/.</code> </pre> <p> You can also download gbdb files in this manner. 
<pre> <code>sudo rsync hgdownload.soe.ucsc.edu::gbdb/hg19/multiz100way/phyloP100way.wib /data/gbdb/hg19/multiz100way/.</code> </pre> <pre> <code>sudo rsync hgdownload-euro.soe.ucsc.edu::gbdb/hg19/multiz100way/phyloP100way.wib /data/gbdb/hg19/multiz100way/.</code> </pre> <p>The above command would copy the phyloP100way track to display in the GBiB from a local file.</p> <h3>Offline mode</h3> <p> GBiB has an offline mode that is particularly useful when you want to ensure that GBiB no longer connects to the Internet once the initial download and setup are complete (for instance, to comply with corporate IT policy). Before going offline, first mirror all the tracks that you will want to access. Then, in the GBiB terminal window type the command: <code>gbibOffline</code>. This command will remove GBiB's network access to the UCSC MariaDB server and download servers.</p> <p> Once GBiB is in offline mode, the Genome Browser will display an error message if you attempt to access a data file not located on your local disk; therefore, we do not recommend this option for general use. To reactivate Internet access, click on the GBiB terminal window and type the command: <code>gbibOnline</code>.</p> <a name="UpdatingGbib"></a> <h2>Updating Genome Browser in a Box</h2> <p> The software that supports the UCSC Genome Browser is updated every three weeks. These updates include new features and bug fixes for existing features. The track data, on the other hand, are not updated on a regular basis. New tracks and updates to existing tracks are released as they pass UCSC's quality assurance process. The only exceptions to this are GenBank-based tracks, including RefSeq Genes, GenBank mRNAs, and others, which are updated weekly through an automatic process.</p> <h3>Automatic updates</h3> <p> By default, GBiB is configured to automatically update its files, including software and tracks that you have mirrored. The updates will not affect your custom tracks, user accounts or sessions. 
We recommend that you leave the auto-update process turned on if Internet connection speed is not an issue, to ensure that you receive all the latest software features and bug fixes, as well as updates to your mirrored tracks.</p> <p> If you are using a DSL line, we recommend turning off automatic updates. Over a slow internet connection, the GBiB update may take several hours to complete, during which time the software will be unusable. To turn off the auto-update process, start GBiB, then type the command <code>gbibAutoUpdateOff</code> in the GBiB terminal window.</p> <p> The auto-update process can be reactivated by typing the command <code>gbibAutoUpdateOn</code>.</p> <a name="ManualUpdates"></a> <h3>Manual updates</h3> <p> As an alternative to the auto-update process, you can manually update GBiB. To do so, start GBiB and type the command <code>updateBrowser</code> in the GBiB terminal window. This will run the script that updates the GBiB software and any annotation tracks that you have mirrored. A manual update is sometimes an effective solution when GBiB is functioning incorrectly or stops working (see the <a href="#Trouble">Troubleshooting</a> section).</p> <a name="YourTracks"></a> <h2>Viewing your own data</h2> <p> In addition to providing access to the standard set of Genome Browser annotation tracks generated by UCSC, GBiB allows you to upload your own data in the form of custom annotation tracks. These tracks can be viewed in the Genome Browser alongside the native UCSC tracks.</p> <p> Uploading your custom annotation tracks to GBiB is similar in many ways to the process used for uploading custom tracks on the public UCSC Genome Browser website. Custom tracks containing smaller data sets can be uploaded to GBiB through the <a href="../../cgi-bin/hgCustom">Add Custom Tracks</a> page. For more information on generating and uploading custom tracks, see the <a href="customTrack.html">Custom Track help page</a>. 
One big difference, however, is that GBiB has a built-in web server that can communicate directly with your computer. This eliminates the requirement that local big data files be hosted on a separate publicly accessible web server. As a result, your data remains private to your own computer, and will not be available to others unless you grant them access (see <a href="#LocalTracks">Loading local big data tracks and track hubs</a>).</p> <a name="LocalTracks"></a> <h3>Loading local big data tracks and track hubs</h3> <p> For improved display performance in the Genome Browser, big data sets are typically stored in a compressed, indexed binary file format such as <a href="bigBed.html">bigBed</a>, <a href="bigWig.html">bigWig</a>, <a href="bam.html">BAM</a>, or <a href="vcf.html">VCF</a> that contains the data at several resolutions. Unlike custom tracks, in which the entire data set is loaded at once, big data files transmit only the data for the region currently displayed. In order to load and display data in one of these formats, the public Genome Browser website requires big data files to be placed on a publicly accessible web server. However, because GBiB acts as its own web server, your computer can share local big data files directly with GBiB for easy uploading as a custom track.</p> <p> In addition to custom tracks, you can use <a href="hgTrackHubHelp.html">track hubs</a> and <a href="hgTrackHubHelp.html#Assembly">assembly hubs</a> to easily view your data in GBiB. Track hubs are web-accessible directories of genomic data that offer a broader set of configuration and integration options than custom annotation tracks. Assembly hubs are an extension of track hubs and allow you to specify a file containing your novel genomic sequence, in addition to custom annotation data. 
One strong advantage of track and assembly hubs is that they persist until you delete them, in contrast to custom tracks outside of saved sessions, which automatically expire and are removed from the server after a few days. As with big data tracks, the GBiB built-in web server circumvents the requirement that track and assembly hubs be uploaded to a publicly accessible server prior to viewing. For more information, see <a href="hubQuickStartAssembly.html#blatGbib" target="_blank">Starting a Blat enabled Assembly Hub on GBiB</a>.</p> <p> Loading local files, such as hubs or big data files, requires that GBiB have access to the local folder containing them. To allow GBiB to access one or more of your local folders, follow these steps:</p> <div class="row"> <!-- Left column --> <div class="col-md-4"> <p> <strong>Step 1.</strong> Shut down the GBiB virtual machine (<a href="#GbibStop">Stopping GBiB</a>)</p> <p> <strong>Step 2.</strong> Allow VirtualBox access to one or more directories on your hard disk</p> <ol> <li> Click on the "browserbox" entry in the VirtualBox Manager window, then click <em>Settings</em>.</li> <li> Click on <em>Shared Folders</em>.</li> <li> Click on the small "+" icon.</li> <li> Select a directory on your disk under <em>Folder Path / Other</em>.</li> <li> Select the checkbox to give "Read-only" access and make sure the checkbox for "Auto-mount" is selected.</li> <li> Confirm by clicking "OK".</li> <li> Repeat these steps with other folders, as needed.</li> <li> When you are finished, restart GBiB by clicking the "Start" button again.</li> </ol> </div> <!-- Right column --> <div class="col-md-8"> <img src="../../images/gbib-shared.png" alt="GBiB shared" width="683" height="465"> </div> </div> <p> To check if your folders are shared, type this address into your web browser: <a href="http://127.0.0.1:1234/folders" target="_blank">http://127.0.0.1:1234/folders</a>. It should show all shared folders. 
To obtain the bigDataUrl of any of the files in your shared folders, right-click on any file and select "Copy link address". You can now paste this URL into the <a href="../../cgi-bin/hgCustom" target="_blank">Add Custom Tracks</a> page. Or, if you are uploading a track or assembly hub, right-click on your "hub.txt" file and select "Copy link address". You can now paste this URL into the box on the "Connected Hubs" tab of the <a href="../../cgi-bin/hgHubConnect" target="_blank">Track Data Hubs</a> page.</p> <a name="loadBam"></a> <h3><em>Example: Loading a local BAM custom track</em></h3> <p> Here is an example of a custom track in a shared test/ folder that loads a locally hosted BAM file when pasted on the custom tracks page (select <em>My Data >> Custom Tracks</em> in the Genome Browser menu): <pre><code>track type=bam name=BamExample bigDataUrl=http://127.0.0.1:1234/folders/test/bamExample.bam </code></pre> <p> To customize this URL for your own use, replace the URL with a pasted URL to files from your own machine discoverable under the <em>My Data >> GBiB Shared Data Folder</em> in the Genome Browser menu. Since the GBiB is configured so that the VirtualBox shared folder path can be placed directly into the custom track page (via a line <code>udc.localDir=/folders</code> in the hg.conf file) and there is software that knows files ending in <code>.bam</code> should be loaded as <code>type=bam</code>, the link can be replaced with just a path to the file. For example, pasting the following on the custom tracks page would also work:</p> <pre><code>/folders/test/bamExample.bam </code></pre> <a name="Connectwithssh"></a> <h3>Connecting to GBiB with ssh</h3> <p> The GBiB terminal is a normal Linux command line interface. For easier use of the command line, you can connect to the GBiB machine from your computer with ssh. 
This may offer better speed and more functionality, such as support for copy/paste.</p> <p> To connect to GBiB with ssh, open a terminal on your computer and type: <code>ssh browser@localhost -p 1235</code>. You will be prompted for a password when attempting to access GBiB from your computer's command line. The password is "browser". Alternatively, you can use <code>sudo</code> for root access, which does not require a password. Because stock Windows computers do not have ssh installed, Windows users will have to use the GBiB terminal or install a third-party ssh client for Windows, such as the free software <a href="http://putty.org" target="_blank">Putty</a>.</p> <h3>Data and track conversion tools</h3> <p> By default, GBiB includes a few of the commonly used UCSC file manipulation tools, such as bedToBigBed, wigToBigWig, samtools and tabix. These tools can be used to convert and manipulate your basic files into formats that can be uploaded to GBiB as custom tracks. If you need additional Genome Browser tools, type the following command into the GBiB terminal window: <code>gbibAddTools</code>. This command downloads and installs the full suite of command line tools provided by UCSC. Many of these extra tools can be used to extract data and other useful information from your files, or to convert them between various file types. A complete listing and description for all of these tools can be found on UCSC's <a href="http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/FOOTER" target="_blank">download server</a>.</p> <p> You can use these tools to convert and extract data from your shared files with the standard "Read-only" settings. However, if you would like to modify files you've shared with your GBiB, you will have to ensure that the "Read-only" access for VirtualBox is turned off. 
To do so, follow the directions in Step 2 of the <a href="#LocalTracks">Loading local big data tracks</a> section, but deselect the checkbox next to "Read-only".</p> <p> Please note that the gbibAddTools command requires sudo permissions.</p> <h3><em>Example: Indexing a local BAM file</em></h3> <p> A BAM file must be indexed before it can be loaded into the browser. For example, to index a BAM file in a shared folder "Documents" on your hard disk, type:</p> <pre><code>cd /folders/Documents
samtools index my.sorted.bam
</code></pre> <h3><em>Example: Using a format conversion tool</em></h3> <p> BED files must be converted to bigBed format to be loaded into the browser. For example, to convert a .bed file in a shared folder "Documents" on your hard disk to a .bigBed format file, type:</p> <pre><code>cd /folders/Documents
fetchChromSizes hg19 > hg19.sizes
bedToBigBed bedExample.txt hg19.sizes myBigBed.bb
</code></pre> <h3><em>Example: Loading a GEO File</em></h3> <p> Some files at external locations, like GEO, are already in binary indexed formats that can be loaded in the browser over the Internet. However, if the server providing these files does not accept byte-range requests, they cannot be transmitted over the Internet to view in the browser. For GEO files, try the ftp location first. If you find the files of interest are giving byte-range request errors, then one option is to download the files and locally load them from your own laptop. With GBiB you can download these files to a local shared folder and then browse them. 
In the following example, the URL of the bigWig (.bw) file passed to wget was obtained by right-clicking the http link on the <a href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1186795">GSM1186795</a> page; this example file only displays data on chrM and chrX.</p> <pre><code>ssh browser@localhost -p 1235 (to enter GBiB from your computer terminal, password: browser)
cd /folders
sudo wget "https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM1186795&format=file&file=GSM1186795%5FDZ012%5FTRA1%5FN2L3%5Ftr27%5Fce10%5FBTm1x200n%2Ebw" -O GSM1186795.bw
</code></pre> <p> The file can then be found in the GBiB's <a href="http://127.0.0.1:1234/folders/"><code>http://127.0.0.1:1234/folders/</code></a> location, or loaded with a direct link to the file's location:</p> <pre><code><a href="http://127.0.0.1:1234/cgi-bin/hgTracks?db=hg19&hgt.customText=http://127.0.0.1:1234/folders/GSM1186795.bw&position=chrM" target="_blank">http://127.0.0.1:1234/cgi-bin/hgTracks?db=hg19&hgt.customText=<strong>http://127.0.0.1:1234/folders/GSM1186795.bw</strong>&position=chrM</a></code></pre> <p> (NOTE: In the above example, the <strong>ftp location</strong> now has byte-range requests supported so the <strong>GEO ftp link</strong> can be loaded on the browser).</p> <a name="CustomizingGbib"></a> <h2>Customizing your GBiB</h2> <h3>Allowing local-only assemblies</h3> <p>By modifying the MySQL table hgcentral.dbDb, one can add a genome directly to the UCSC Genome Browser, as documented in <a href="mirrorManual.html#adding-a-new-custom-non-ucsc-genome-to-the-browser"> our manual mirror instructions</a>. We discourage this, as assembly hubs, especially in combination with a "default cart" (see below), are much easier to set up. 
However, if you have an existing local genome browser installation with genomes that exist only there, in order to combine its "remote access" mode with a local-only genome, the local genome has to be declared in hg.conf or, for GBiB, in /usr/local/apache/cgi-bin/hg.conf.local. The statement is <pre>slow-db.excludeDbs=assemblyName1,assemblyName2</pre> For example, if you have two local-only assemblies defined in dbDb with the names homSap1 and homSap2, the statement <pre>slow-db.excludeDbs=homSap1,homSap2</pre> will instruct the browser to never try to connect to the UCSC Public MySQL server for these two assemblies, avoiding error messages and significantly speeding up the display. </p> <!-- <a name="Fonts"></a> <h3>Enabling additional font styles</h3> <p> To enable the use of additional fonts in the tracks display, the following lines need to be need to be added to the <code>hg.conf</code> file:</p> <pre> freeType=on freeTypeDir=/usr/share/fonts/type1/gsfonts </pre> <p> The file can be found in the following location:</p> <pre> /usr/local/apache/cgi-bin/hg.conf </pre> <p> The fonts can then be change by pressing the <strong>configuration</strong> located below the image on the tracks display.</p>. --> <a name="Defaults"></a> <h3>Changing the default Genome Browser options on a GBiB</h3> <p> GBiB supports changes to some default settings such as assembly, attached hubs, fonts, text size, etc. Changing these defaults means that any time a new user goes to the mirror, or whenever all user settings are reset, the custom options will be enabled. This can be useful in cases when it is desirable to have hubs attached by default, to change the default assembly or visibility display, etc.</p> <p> The first step is to create a <a href="#UsrAcct">Session</a> containing all desired display options, hubs, etc. Then, from within the GBiB, create a file that contains the command to create a new MySQL table: <strong>defaultCart</strong>. 
See the <a href="#Connectwithssh">Connect with ssh</a> section of this page for help logging into your GBiB. Contents of the new file <strong>defaultCart.sql</strong>:</p>
<pre>
#The default cart
CREATE TABLE defaultCart (
    contents longblob not null # cart contents
);
</pre>
<p> The table can then be loaded:</p>
<pre> mysql hgcentral < defaultCart.sql </pre>
<p> Finally, the contents of the desired session can be entered into the new table:</p>
<pre> mysql hgcentral -Ne "insert into defaultCart select contents from namedSessionDb where sessionName='nameOfSession' and userName='nameOfUser'" </pre>
<p> Here, <strong>nameOfSession</strong> is the name of the created session and <strong>nameOfUser</strong> is the name of the user the session was created under. Keep in mind that only the top entry in that table will be used to pull default variables, so to change the new defaults (or to remove them) you will first want to delete the table contents:</p>
<pre> mysql hgcentral -Ne "delete from defaultCart" </pre>
<a name="SharingGBIB"></a>
<h2>Sharing Genome Browser in a Box with others</h2>
<p> By default, GBiB can be accessed only from the machine on which it is installed. This is done to prevent others from accessing your data. You can, however, make your GBiB instance available for use by others.
To open up external access to GBiB:</p>
<ol>
<li> Shut down the GBiB machine (<a href="#GbibStop">Stopping GBiB</a>).</li>
<li> Select the browserbox machine in the VirtualBox left-hand menubar.</li>
<li> Select <em>Machine >> Settings</em> in the VirtualBox menu to display the Settings window.</li>
<li> Go to <em>Network >> Adapter 1 >> Advanced >> Port Forwarding</em>.</li>
<li> Remove the address "127.0.0.1" from "Rule 1" by deleting it with the backspace key.</li>
<li> Click "OK".</li>
</ol>
<p> <strong>Note:</strong> In addition to enabling port forwarding for VirtualBox, you may need to enable the port forwarding functionality on your PC's firewall to allow others to access your GBiB. You will have to search online for instructions on how to enable this functionality for your PC's firewall.</p>
<p> Once you have opened external access to GBiB, your colleagues can access and use your GBiB instance by typing your IP address into their own Internet browser, followed by the :1234 port. <strong>Keep in mind that once you have opened up GBiB for remote access, anyone who knows your IP address will be able to access your instance of GBiB and the files that you have shared with it.</strong></p>
<p> To control whether others can configure track mirroring in your shared GBiB, you can use the commands <code>gbibMirrorTracksOff</code> and <code>gbibMirrorTracksOn</code> to disable or enable the "Mirror Tracks" function in the menu.</p>
<a name="UsrAcct"></a>
<h2>User accounts and sessions</h2>
<p> The <a href="../../cgi-bin/hgSession">Session</a> tool allows you to take a snapshot of your browser configured with specific track combinations, including <a href="hgTracksHelp.html#CustomTracks">custom tracks</a>. A browser session can be saved for future use or shared with others who use your GBiB instance.</p>
<p> To use the Session tool, you must first create a user account.
User accounts and sessions on GBiB are separate from those maintained on the UCSC Genome Browser public website. Because of this, user names and sessions that you create on GBiB cannot be used with the main UCSC Genome Browser website, and vice versa. More information on creating a user account and creating, saving, and sharing sessions can be found in the <a href="hgSessionHelp.html">Sessions User's Guide</a>.</p>
<p> Username recovery on accounts is not supported at this time; however, you can recover a lost password. The system for recovering lost passwords on GBiB is quite different from that on the Genome Browser and requires access to the command line. To recover a lost password:</p>
<ol>
<li> Navigate to the <a href="../../cgi-bin/hgLogin?hgLogin.do.displayLoginPage=1" target="blank">account login page</a>.</li>
<li> Click on the "Can't access your account?" link on the login page.</li>
<li> Select the "I forgot my password. Send me a new one." option and enter your username.</li>
<li> An email message will be sent to the Alpine email client included with GBiB.</li>
<li> To access the email client, click on the GBiB terminal window.</li>
<li> In this window, type <code>mail</code> and press "enter", which will bring up the Alpine email client.</li>
<li> Select <code>MESSAGE INDEX</code> from the menu and press enter.</li>
<li> Select the message with "New temporary password..."
in the subject line.</li> <li> Log in using your username and this temporary password.</li> <li> After logging in, you will be prompted to create a new password.</li> <li> Once you are finished, exit the Alpine email client by pressing "Q" and then "Y".</li> </ol> <p> <strong>Please be aware that anyone with access to your username and the command line interface of your GBiB can change your password.</strong></p> <a name="Trouble"></a> <h2>Troubleshooting common problems</h2> <p> This section addresses some common errors and problems that you may encounter while setting up and installing GBiB. It is not intended as a comprehensive list. If you experience a problem not listed below, please email the UCSC Genome Browser public support mailing list at <a href="mailto:genome@soe.ucsc.edu"> genome@soe.ucsc.edu</a>. Note that messages sent to this address are publicly accessible.</p> <p> <strong>VirtualBox Error:</strong> "<em>VT-x/AMD-V hardware acceleration has been enabled, but is not operational. Your 64-bit guest will fail to detect a 64-bit CPU and will not be able to boot.</em>"</p> <p> <em>Solution 1: </em>Some older entry-level laptops from around 2009-2011 (e.g. Toshiba Satellite U500) were sold with CPUs that do not support virtualization. These laptops cannot run GBiB. The same applies to low-cost laptops ("netbooks") with Intel Atom processors.</p> <p> <em>Solution 2: </em>On some hardware, virtualization is supported but deactivated in the BIOS. Here is one example of how virtualization support is activated on some hardware. Note that your BIOS virtualization options may differ from those described here.</p> <ol> <li> Reboot the computer and press F12 during boot to show the BIOS menu.</li> <li> Go to <em>BIOS Setup >> Virtualization Support >> Virtualization</em> and check "Enable Intel Virtualization Technology". On some Dell systems, you may need to enable additional virtualization options under the BIOS Setup. 
Go to <em>Virtualization Support >> Virtualization for Direct I/O</em> and check "Enable Virtualization for Direct I/O".</li>
<li> Exit and save, then restart the computer.</li>
</ol>
<p> <strong>VirtualBox Error:</strong> "<em>Failed to open virtual machine located in... Trying to open a VM config ... which has the same UUID as an existing virtual machine.</em>"</p>
<p> <em>Solution: </em> This error occurs if GBiB has been previously downloaded and installed. To resolve this problem:</p>
<ol>
<li> Start VirtualBox.</li>
<li> Select the currently installed version of browserbox from the left-hand column.</li>
<li> Select <em>Machine >> Remove</em> (Ctrl+R or ⌘+R) in the VirtualBox menu. When asked, choose "Remove only" to retain the old browserbox version on your disk.</li>
<li> Double-click the newly downloaded <em>browserbox.vbox</em> file or add it with the <em>Machine >> Add</em> menu option.</li>
</ol>
<p> <strong>VirtualBox Error:</strong> "<em>Failed to open virtual machine located in... Cannot register the hard disk ... because a hard disk ... already exists.</em>"</p>
<p> <em>Solution:</em> This error occurs if GBiB has been previously downloaded and installed. To resolve this problem:</p>
<ol>
<li> Start VirtualBox.</li>
<li> Select <em>File >> Virtual Media Manager</em> (Ctrl+D or ⌘+D) in the VirtualBox menubar.</li>
<li> Select <em>gbib-data.vdi</em> and click "Remove".</li>
<li> Double-click the newly downloaded <em>browserbox.vbox</em> file or add it with the <em>Machine >> Add</em> menu option.</li>
</ol>
<p> <strong>Genome Browser Error:</strong> "<em>Couldn't connect to database hg19 on genome-mysql.soe.ucsc.edu as genomep.</em>"</p>
<p> <em>Solution 1:</em> This indicates that the virtual machine could not connect to the UCSC MariaDB server. This error can be caused by a change of the IP address (e.g. on a wifi connection) that has not yet been picked up by the virtual machine.
In this situation, you can restart the box or run the command <code>sudo ifup --force eth0</code> to reset the network connection.</p> <p> <em>Solution 2:</em> Alternatively, this error may be generated when the firewall does not allow outgoing TCP data on port 3306/MySQL. In this case, contact your institution's IT support staff to inquire about ways to open this port.</p> <p> <strong>Genome Browser Error:</strong> "<em>Couldn't connect to database hgcentral on localhost as root. Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (13)</em>"</p> <p> <em>Solution:</em> This error can be caused when the virtual machine is downloading data such as when using the <a href="#GbibMirroring">mirror tool</a> to install local copies of tracks to increase speed and performance. Once the download finishes, the message will no longer appear. If one cancels a download in progress, the error can persist, in which case restart your virtual machine and the message should be removed.</p> <p> <strong>Problem:</strong> Error when using BLAT.</p> <p> When you use BLAT on your GBiB machine, it attempts to open outgoing TCP connections to the BLAT servers running at UCSC. Each assembly has two ports that need to be open from your GBiB machine to the UCSC BLAT servers. You will likely need to contact the system administrators at your institution and ask them to open outgoing TCP connections to a list of UCSC hostnames and ports. 
The commands below will allow you to find out the names of these hosts and ports.</p>
<p> <em>Solution:</em> To find the hostname-port combination to open from your GBiB machine, use this SQL query:</p>
<pre><code>mysql hgcentral -e 'select * from blatServers where db="YOURDB"'</code></pre>
<p> For example, if you want to enable BLAT for hg19 and hg38, you can issue this command:</p>
<pre><code>mysql hgcentral -e 'select * from blatServers where db="hg38" or db="hg19"'</code></pre>
<pre><code>+------+---------------------+-------+---------+--------+
| db   | host                | port  | isTrans | canPcr |
+------+---------------------+-------+---------+--------+
| hg19 | blat4a.soe.ucsc.edu | 17779 |       0 |      1 |
| hg19 | blat4a.soe.ucsc.edu | 17778 |       1 |      0 |
| hg38 | blat4c.soe.ucsc.edu | 17781 |       0 |      1 |
| hg38 | blat4c.soe.ucsc.edu | 17780 |       1 |      0 |
+------+---------------------+-------+---------+--------+</code></pre>
<p> <strong>Problem:</strong> GBiB is functioning incorrectly or stops working.</p>
<p> <em>Solution 1:</em> If you have previously turned off automatic updates, your software or data may need updating. Try <a href="#ManualUpdates">manually updating</a> GBiB. This update process may take several hours over a slow Internet connection.</p>
<p> <em>Solution 2:</em> Re-download the <em>gbib.zip</em> file and extract <strong>only</strong> the file <em>gbib-root.vdi</em>. Place this file in the same directory with the files extracted from your original GBiB installation. Do not extract and overwrite <em>gbib-data.vdi</em> -- it contains your personal track and session settings and mirrored tracks.</p>
<p> <strong>Problem:</strong> I need proxy support for my files to load.</p>
<p> <em>Solution:</em> Proxy servers may be required by some installations to get through a firewall.
You can add the settings <code>httpProxy</code>, <code>httpsProxy</code> and <code>ftpProxy</code> to hg.conf.local at /usr/local/apache/cgi-bin/hg.conf.local:</p>
<pre><code>httpProxy=http://someProxyServer:3128</code></pre>
<pre><code>httpsProxy=http://someProxyServer:3128</code></pre>
<pre><code>ftpProxy=ftp://127.0.0.1:2121</code></pre>
<p> If the proxy server requires BASIC authentication, then the lines in hg.conf.local should look like this:</p>
<pre><code>httpProxy=http://user:password@someProxyServer:3128</code></pre>
<pre><code>httpsProxy=http://user:password@someProxyServer:3128</code></pre>
<p> If there are domains or domain suffixes that should not be proxied, use <code>noProxy</code>:</p>
<pre><code>noProxy=ucsc.edu,mit.edu,localhost,127.0.0.1</code></pre>
<p> The file /usr/local/apache/cgi-bin/hg.conf should already include a line like <code>include hg.conf.local</code> to incorporate the changes in hg.conf.local.</p>
<p> <strong>Problem:</strong> "No space left on device" error when running gbibAddTools.</p>
<p> In older GBiB versions, there is an error in the <code>gbibAddTools</code> command. To fix this command, edit the <code>/home/browser/.bashrc</code> file and change the following line:</p>
<pre><code>alias gbibAddTools='mkdir ~/bin -p; rsync -avP hgdownload.soe.ucsc.edu::genome/admin/exe/linux.x86_64/ ~/bin/'</code></pre>
<p> to either of the following lines:</p>
<pre><code>alias gbibAddTools='sudo mkdir -p /data/tools; sudo rsync -avP hgdownload.soe.ucsc.edu::genome/admin/exe/linux.x86_64/ /data/tools/ && ln -s /data/tools ~/bin'</code></pre>
<pre><code>alias gbibAddTools='sudo mkdir -p /data/tools; sudo rsync -avP hgdownload-euro.soe.ucsc.edu::genome/admin/exe/linux.x86_64/ /data/tools/ && ln -s /data/tools ~/bin'</code></pre>
<p> The updated commands will install the tools to a location with more disk space available using either the North American or European hgdownload servers.
</p>
<a name="Commands"></a>
<h2>Genome Browser in a Box commands</h2>
<!-- Explanation here -->
<p> In addition to normal Linux commands, GBiB defines some special commands you may use while inside the GBiB terminal window. These additional commands are documented in the README.txt file in the home directory of the GBiB terminal, which can also be accessed via <a href="#Connectwithssh">ssh</a>.</p>
<h6>General commands</h6>
<table>
<tr> <td><code>gbibAutoUpdateOff</code></td> <td>Switch off automatic weekly updates</td> </tr>
<tr> <td><code>gbibAutoUpdateOn</code></td> <td>Reactivate automatic weekly updates</td> </tr>
<tr> <td><code>gbibOffline</code></td> <td>Switch off remote access to UCSC for tables or files</td> </tr>
<tr> <td><code>gbibOnline</code></td> <td>Reactivate remote access to UCSC for tables or files</td> </tr>
<tr> <td><code>gbibMirrorTracksOff</code></td> <td>Disable the "Mirror tracks" tool</td> </tr>
<tr> <td><code>gbibMirrorTracksOn</code></td> <td>Enable the "Mirror tracks" tool</td> </tr>
<tr> <td><code>gbibAddTools</code></td> <td>Download the UCSC genome command line tools into the ~/bin directory.
Requires sudo permissions</td> </tr>
</table>
<h6>Advanced commands</h6>
<table>
<tr> <td><code>gbibCoreUpdate</code></td> <td>Download the most current update script from UCSC now (this step is part of an automatic update)</td> </tr>
<tr> <td><code>gbibFixMysql1</code></td> <td>Fix all MariaDB databases, fast version</td> </tr>
<tr> <td><code>gbibFixMysql2</code></td> <td>Fix all MariaDB databases, intensive version</td> </tr>
<tr> <td><code>gbibResetNetwork</code></td> <td>Reinitialize the eth0 network interface, in case VirtualBox dropped the network connection</td> </tr>
<tr> <td><code>gbibUcscLog</code></td> <td>Show a real-time log of all SQL queries on the console </td> </tr>
<tr> <td><code>gbibUcscTablesLog</code></td> <td>Show the tables that had to be loaded through the internet from UCSC</td> </tr>
<tr> <td><code>gbibUcscTablesReset</code></td> <td>Reset the table counters</td> </tr>
<tr> <td><code>gbibUcscGbdbLog</code></td> <td>Show the gbdb files that had to be loaded through the internet from UCSC</td> </tr>
<tr> <td><code>gbibUcscGbdbReset</code></td> <td>Reset the gbdb counters</td> </tr>
</table>
<a name="License"></a>
<h2>Licensing information</h2>
<p> GBiB is free for non-profit academic research and for personal use. Corporate use requires a license, setup fee and annual payment. To purchase a license or download the GBiB, visit the <a href="https://genome-store.ucsc.edu/" target="_blank">Genome Browser store</a>.</p>
<!--#include virtual="$ROOT/inc/gbPageEnd.html" -->