bbabbd5d2566d47d923d51dbe350634783455999
mspeir
Sun Oct 26 12:14:52 2025 -0700
change soe to gi, refs #35031

diff --git src/hg/htdocs/goldenPath/help/gbib.html src/hg/htdocs/goldenPath/help/gbib.html
index 20ad73e4319..d9cbca8a9e3 100755
--- src/hg/htdocs/goldenPath/help/gbib.html
+++ src/hg/htdocs/goldenPath/help/gbib.html
@@ -59,31 +59,31 @@
for a discussion of the advantages/disadvantages of the different methods to customize your genome data display.

Differences between GBiB and the Genome Browser

While GBiB and the Genome Browser are similar in many ways, there are key differences. In particular, GBiB makes it much easier to visualize sensitive or protected data. Prior to the introduction of GBiB, it was necessary to upload your data to the UCSC Genome Browser website or place the data files on a publicly accessible web server and supply the URL to UCSC in order to view your own data with the Genome Browser. GBiB removes these requirements: none of your data needs to be uploaded to the UCSC servers, allowing you to use the Genome Browser on personal datasets in situations where it's infeasible to load the data onto a public web server.

Rather than installing the entire UCSC genome annotation database (several terabytes of data), GBiB instead depends upon remote connections to various UCSC servers for much of its functionality and data. It connects to the UCSC download server to obtain genomic sequences, liftOver files, and many of the other large data files, and connects to one of UCSC's public MariaDB servers to download data displayed by the various annotation tracks. A few Genome Browser tracks are unavailable on the UCSC public MariaDB servers due to agreements with the data distributors (DECIPHER and LOVD Variants), and thus are unavailable for use with GBiB.

The majority of protected data use in the research community currently focuses on the human genome, primarily the hg19 (GRCh37) assembly, with a growing body of annotation on the newer hg38 (GRCh38) assembly. As a result, GBiB is currently optimized for use with the hg19 assembly. Many other recent genome assemblies can also be viewed, but access may be slower than for optimized assemblies. Access speed may also be impacted by your connection distance from the UCSC server. To improve performance in these situations, GBiB includes a simple tool that allows you to download ("mirror") selected genome annotation tracks to your machine. You can find more information about this tool in the Improving Speed and Performance section.

@@ -112,42 +112,42 @@

A compatible version of the VirtualBox software (version 4.3.6 or higher) must be installed. This software is free to use in many situations. See the VirtualBox wiki for licensing terms and conditions and installation instructions. You must have administrator privileges to install VirtualBox on your computer.

The computer hard disk must have at least 20 GB of free space (more if you plan to mirror many tracks).

Your network firewall must allow outgoing connections to the following servers and ports:

 MariaDB connections, used to load tracks not local to your computer:
-- US server: Port 3306 on genome-mysql.soe.ucsc.edu (128.114.119.174)
+- US server: Port 3306 on genome-mysql.gi.ucsc.edu (128.114.119.174)
 - European server: Port 3306 on genome-euro-mysql.soe.ucsc.edu (129.70.40.120)
 Rsync, used to download track data:
-- US server: TCP port 873 on hgdownload.soe.ucsc.edu (128.114.119.163)
+- US server: TCP port 873 on hgdownload.gi.ucsc.edu (128.114.119.163)
 - European server: TCP port 873 on hgdownload-euro.soe.ucsc.edu (129.70.40.99)
 Download HTML descriptions on the fly:
-- US server: TCP port 80 on hgdownload.soe.ucsc.edu (128.114.119.163)
+- US server: TCP port 80 on hgdownload.gi.ucsc.edu (128.114.119.163)
 - European server: TCP port 80 on hgdownload-euro.soe.ucsc.edu (129.70.40.99)
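
Reviewer note: since this change renames the US hosts, it may help to confirm that a firewall still lets the listed connections through under the new names. The following is a minimal Python sketch, not part of GBiB; the names `REQUIRED_ENDPOINTS`, `can_connect`, and `report` are made up for illustration, and the host/port pairs are copied from the list above using the post-change `.gi` names. It only tests that a TCP connection can be opened, not that the service behind the port works.

```python
import socket

# Host/port pairs taken from the requirements list (post-change US names).
# Trim this list if your site only uses one region.
REQUIRED_ENDPOINTS = [
    ("genome-mysql.gi.ucsc.edu", 3306),        # MariaDB, US
    ("genome-euro-mysql.soe.ucsc.edu", 3306),  # MariaDB, Europe
    ("hgdownload.gi.ucsc.edu", 873),           # rsync, US
    ("hgdownload-euro.soe.ucsc.edu", 873),     # rsync, Europe
    ("hgdownload.gi.ucsc.edu", 80),            # HTML descriptions, US
    ("hgdownload-euro.soe.ucsc.edu", 80),      # HTML descriptions, Europe
]


def can_connect(host, port, timeout=5.0):
    """Return True if an outgoing TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def report(endpoints=REQUIRED_ENDPOINTS, timeout=5.0):
    """Map each (host, port) pair to whether a connection succeeded."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}


if __name__ == "__main__":
    for (host, port), ok in report().items():
        status = "reachable" if ok else "blocked or unreachable"
        print(f"{host}:{port} {status}")
```

A failed entry does not say why the connection failed; a blocked port, a DNS problem, and a server outage all look the same to this check.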

Installation

Confirm that your system meets the above requirements.

Download GBiB from the Genome Browser store (see Licensing Information). Due to the large size of the gbib.zip product file, the download time may range from 30 minutes to a few hours depending on your
@@ -273,53 +273,53 @@
we recommend the option Default tracks with conservation tables, but no alignments. When downloading large tracks, keep in mind that you cannot delete these tracks and the related data from GBiB once you have downloaded them. If you find that you've started downloading the wrong track or a track that is too large for your machine, you can cancel the download at any point by clicking Cancel Download Now.

Depending on your network bandwidth, the download can take several minutes or up to a few hours over a DSL line. During the download, the file gbib-data.vdi will grow in size, and you will not be able to use GBiB. Once the download is complete, the default tracks should load in less than three seconds for a typical genomic position.

If you are working on the GBiB command line, you can run rsync directly to fetch files of interest. For example, if you want all the GENCODE tracks on hg19, you can run either of the following two rsync commands, one for the North American and one for the European hgdownload server:

-sudo rsync hgdownload.soe.ucsc.edu::mysql/hg19/wgEncodeGencode* /data/mysql/hg19/.
+sudo rsync hgdownload.gi.ucsc.edu::mysql/hg19/wgEncodeGencode* /data/mysql/hg19/.

 sudo rsync hgdownload-euro.soe.ucsc.edu::mysql/hg19/wgEncodeGencode* /data/mysql/hg19/.

The above commands will rsync all of the files at the UCSC hgdownload server in the hg19 assembly that start with wgEncodeGencode to your GBiB into the hg19 directory. There are some supporting files in an hgFixed directory, such as for the publication tracks, that could be mirrored with similar commands.

Here is another example for hg38 where the following commands would download all the supporting encRegTfbs and factorbook tables for the Transcription Factor ChIP-seq Clusters track:

-sudo rsync hgdownload.soe.ucsc.edu::mysql/hg38/encRegTfbs* /data/mysql/hg38/.
+sudo rsync hgdownload.gi.ucsc.edu::mysql/hg38/encRegTfbs* /data/mysql/hg38/.

-sudo rsync hgdownload.soe.ucsc.edu::mysql/hg38/factor* /data/mysql/hg38/.
+sudo rsync hgdownload.gi.ucsc.edu::mysql/hg38/factor* /data/mysql/hg38/.

You can also download gbdb files in this manner.

-sudo rsync hgdownload.soe.ucsc.edu::gbdb/hg19/multiz100way/phyloP100way.wib /data/gbdb/hg19/multiz100way/.
+sudo rsync hgdownload.gi.ucsc.edu::gbdb/hg19/multiz100way/phyloP100way.wib /data/gbdb/hg19/multiz100way/.

 sudo rsync hgdownload-euro.soe.ucsc.edu::gbdb/hg19/multiz100way/phyloP100way.wib /data/gbdb/hg19/multiz100way/.

Either of the above commands would copy the phyloP100way file so that GBiB can display the track from a local file.
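
Reviewer note: the rsync examples in this hunk all follow one pattern (mirror host, then `mysql` or `gbdb` module, then assembly, then a table prefix or file path). A minimal Python sketch of that pattern is below. It is illustrative only: `MIRRORS`, `mysql_rsync_command`, and `gbdb_rsync_command` are hypothetical helper names, not GBiB commands, and the functions only build the argument list without running anything.

```python
import posixpath
import shlex

# Mirror hosts as they appear after this change (US renamed, Europe unchanged).
MIRRORS = {
    "us": "hgdownload.gi.ucsc.edu",
    "euro": "hgdownload-euro.soe.ucsc.edu",
}


def mysql_rsync_command(db, table_prefix, mirror="us"):
    """Build the rsync argv for all MariaDB table files starting with a prefix."""
    source = f"{MIRRORS[mirror]}::mysql/{db}/{table_prefix}*"
    dest = f"/data/mysql/{db}/."
    return ["sudo", "rsync", source, dest]


def gbdb_rsync_command(db, relative_path, mirror="us"):
    """Build the rsync argv for one gbdb file, given its path below the assembly."""
    source = f"{MIRRORS[mirror]}::gbdb/{db}/{relative_path}"
    # Keep the same sub-directory layout locally, as in the examples above.
    dest = posixpath.join("/data/gbdb", db, posixpath.dirname(relative_path), ".")
    return ["sudo", "rsync", source, dest]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    print(shlex.join(mysql_rsync_command("hg19", "wgEncodeGencode")))
    print(shlex.join(mysql_rsync_command("hg38", "encRegTfbs", mirror="euro")))
    print(shlex.join(gbdb_rsync_command("hg19", "multiz100way/phyloP100way.wib")))
```

Because the list is meant to be passed to a process without a shell, the trailing `*` reaches rsync unexpanded, which matches what happens when the unquoted pattern has no local match in the shell examples. Note that `shlex.join` quotes the pattern when printing.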

Offline mode

GBiB has an offline mode that is particularly useful when you want to ensure that GBiB no longer connects to the Internet once the initial download and setup are complete (for instance, to comply with corporate IT policy). Before going offline, first mirror all the tracks that you will want to access. Then, in the GBiB terminal window type the command: gbibOffline. This command will remove GBiB's network access to the UCSC MariaDB server and download servers.

@@ -486,31 +486,31 @@
sudo for root access, which does not require a password. Because stock Windows computers do not have ssh installed, Windows users will have to use the GBiB terminal or install a third-party ssh client for Windows, such as the free software Putty.

Data and track conversion tools

By default, GBiB includes a few of the commonly used UCSC file manipulation tools, such as bedToBigBed, wigToBigWig, samtools and tabix. These tools can be used to convert and manipulate your basic files into formats that can be uploaded to GBiB as custom tracks. If you need additional Genome Browser tools, type the following command into the GBiB terminal window: gbibAddTools. This command downloads and installs the full suite of command line tools provided by UCSC. Many of these extra tools can be used to extract data and other useful information from your files, or to convert them between various file types. A complete listing and description for all of these tools can be found on UCSC's download server.

You can use these tools to convert and extract data from your shared files with the standard "Read-only" settings. However, if you would like to modify files you've shared with your GBiB, you will have to ensure that the "Read-only" access for VirtualBox is turned off. To do so, follow the directions in Step 2 of the Loading local big data tracks section, but deselect the checkbox next to "Read-only".

Please note that the gbibAddTools command requires sudo permissions.

Example: Indexing a local BAM file

A BAM file must be indexed before it can be loaded into the browser. For example, to index a BAM file in a shared folder "Documents" on your hard disk, type:

cd /folders/Documents
@@ -779,31 +779,31 @@
 resolve this problem:
 
   
   Start VirtualBox.
   
   Select File >> Virtual Media Manager (Ctrl+D or ⌘+D) in the VirtualBox 
   menubar.
   
   Select gbib-data.vdi and click "Remove".
   
   Double-click the newly downloaded browserbox.vbox file or add it with 
   Machine >> Add menu option.
 
 
 Genome Browser Error: "Couldn't connect to database hg19 on 
-genome-mysql.soe.ucsc.edu as genomep."
+genome-mysql.gi.ucsc.edu as genomep."
 

 Solution 1: This indicates that the virtual machine could not connect to the UCSC MariaDB 
 server. This error can be caused by a change of the IP address (e.g. on a wifi connection) that has 
 not yet been picked up by the virtual machine. In this situation, you can restart the box or run 
 the command sudo ifup --force eth0 to reset the network connection.
 
 Solution 2: Alternatively, this error may be generated when the firewall does not allow 
 outgoing TCP data on port 3306/MySQL. In this case, contact your institution's IT support staff to 
 inquire about ways to open this port.
 
 Genome Browser Error: "Couldn't connect to database hgcentral on localhost
 as root. Can't connect to local MySQL server through socket 
 '/var/lib/mysql/mysql.sock' (13)"
 
 Solution: This error can be caused when the virtual machine is downloading data such as
@@ -859,35 +859,35 @@
 this:
 httpProxy=http://user:password@someProxyServer:3128
 httpsProxy=http://user:password@someProxyServer:3128
 
 If there are domains or domain suffixes that should not be proxied, use noProxy.
 
noProxy=ucsc.edu,mit.edu,localhost,127.0.0.1
 
 The file /usr/local/apache/cgi-bin/hg.conf should already include a line like include
 hg.conf.local to incorporate the changes in hg.conf.local.
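
Reviewer note: the three proxy settings shown in this hunk (`httpProxy`, `httpsProxy`, `noProxy`) can be generated together so that the two proxy lines never drift apart. The sketch below is an illustration only; `proxy_conf_lines` is a hypothetical helper, it assumes the same proxy URL is used for both HTTP and HTTPS (as in the example above), and it only returns the lines without writing to any file.

```python
def proxy_conf_lines(proxy_url, no_proxy=()):
    """Return hg.conf.local-style lines for a proxy and an optional exclusion list."""
    for value in (proxy_url, *no_proxy):
        # Each setting sits on one line, so whitespace in a value would corrupt it.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"invalid config value: {value!r}")
    lines = [f"httpProxy={proxy_url}", f"httpsProxy={proxy_url}"]
    if no_proxy:
        # Domains and domain suffixes are comma-separated, as in the noProxy example.
        lines.append("noProxy=" + ",".join(no_proxy))
    return lines


if __name__ == "__main__":
    example = proxy_conf_lines(
        "http://user:password@someProxyServer:3128",
        ["ucsc.edu", "mit.edu", "localhost", "127.0.0.1"],
    )
    print("\n".join(example))
```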
 
 
 Problem: "No space left on device" error when running gbibAddTools.
 
 In older versions of GBiB, there is an error in the gbibAddTools command. To fix this command, 
 edit the /home/browser/.bashrc file and change the following line:
-alias gbibAddTools='mkdir ~/bin -p; rsync -avP hgdownload.soe.ucsc.edu::genome/admin/exe/linux.x86_64/ ~/bin/'
+alias gbibAddTools='mkdir ~/bin -p; rsync -avP hgdownload.gi.ucsc.edu::genome/admin/exe/linux.x86_64/ ~/bin/'
 
 
 to either of the following lines:
-alias gbibAddTools='sudo mkdir -p /data/tools; sudo rsync -avP hgdownload.soe.ucsc.edu::genome/admin/exe/linux.x86_64/ /data/tools/ && ln -s /data/tools ~/bin'
+alias gbibAddTools='sudo mkdir -p /data/tools; sudo rsync -avP hgdownload.gi.ucsc.edu::genome/admin/exe/linux.x86_64/ /data/tools/ && ln -s /data/tools ~/bin'
 alias gbibAddTools='sudo mkdir -p /data/tools; sudo rsync -avP hgdownload-euro.soe.ucsc.edu::genome/admin/exe/linux.x86_64/ /data/tools/ && ln -s /data/tools ~/bin'
 
 
 The updated commands will install the tools to a location with more disk space available using
 either the North American or European hgdownload servers.
 
 
 
 Genome Browser in a Box commands
 
 
 In addition to normal Linux commands, GBiB defines some special commands you may use while inside 
 the GBiB terminal's window. These additional commands are documented in the README.txt file on the 
 GBiB terminal's home directory, which can also be accessed via