32ffc191ad2340e01474998c9d51f0e141d42e7b ccpowell Tue Jul 16 11:45:11 2019 -0700 Adding MariaDB to explain we use a MariaDB database in place of MySQL, refs #23597 diff --git src/hg/htdocs/goldenPath/help/gbic.html src/hg/htdocs/goldenPath/help/gbic.html index 7324445..b560f59 100755 --- src/hg/htdocs/goldenPath/help/gbic.html +++ src/hg/htdocs/goldenPath/help/gbic.html @@ -21,31 +21,31 @@ <a name='what-is-genome-browser-in-the-cloud'></a> <h2>What is Genome Browser in the Cloud?</h2> <p> The Genome Browser in the Cloud (GBiC) program is a convenient tool that automates the setup of a UCSC Genome Browser mirror. The GBiC program is for users who want to set up a full mirror of the UCSC Genome Browser on their server/cloud instance, rather than using <a href='gbib.html' title=''>Genome Browser in a Box</a> (GBIB) or our public website. Please see the <a href='mirror.html#considerations-before-installing-a-genome-browser' title=''>Installation of a UCSC Genome Browser on a local machine (mirror)</a> page for a summary of installation options, including the pros and cons of using a mirror installation via the GBiC program vs. using GBiB. </p> <p> -The program works by setting up MySQL, Apache, and Ghostscript, and then copying the Genome +The program works by setting up MySQL (MariaDB), Apache, and Ghostscript, and then copying the Genome Browser CGIs onto the machine under <code>/usr/local/apache/</code>. Because it also deactivates the default Apache htdocs/cgi folders, it is best run on a new machine, or at least a host that is not already used as a web server. The tool can also download full or partial assembly databases, update the Genome Browser CGIs, and remove temporary files (aka "trash cleaning"). </p> <p> The GBiC program has been tested with Ubuntu 14/16 LTS, Centos 6/6.7/7.2, and Fedora 20. </p> <p> It has also been tested on virtual machines in Amazon EC2 (Centos 6 and Ubuntu 14) and Microsoft Azure (Ubuntu). 
If you want to load data on the fly from UCSC, you need to select the data centers "US West (N. California)" (Amazon) or "West US" (Microsoft) for best performance. Other data centers (e.g. East Coast) will require a local copy of the genome assembly, which @@ -55,129 +55,129 @@ <a name='quick-start-instructions'></a> <h2>Quick Start Instructions</h2> <p> Download the GBiC program from the <a href='https://genome-store.ucsc.edu/' title=''>UCSC Genome Browser store</a>. </p> <p> Run the program as root, like this: </p> <pre><code>sudo bash browserSetup.sh install</code></pre> <p> -The <code>install</code> command downloads and configures Apache, MySQL and Ghostscript, copies the Genome Browser +The <code>install</code> command downloads and configures Apache, MySQL (MariaDB) and Ghostscript, copies the Genome Browser CGIs, and configures the mirror to load data remotely from UCSC. The <code>install</code> command must be run before any other command is used. </p> <p> For mirror-specific help, please contact the Mirror Forum as listed on our <a href='../../contacts.html' title=''>contact page</a>. </p> <p> For an installation demonstration, see the <a href='https://www.youtube.com/watch?v=dcJERBVnjio' title=''>Genome Browser in the Cloud (GBiC) Introduction</a> video: </p> <p> <iframe width="560" height="315" src="https://www.youtube.com/embed/dcJERBVnjio?rel=0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> </p> <a name='how-does-the-gbic-program-work'></a> <h2>How does the GBiC program work?</h2> <p> -The GBiC program downloads the Genome Browser CGIs and sets up the central MySQL database. All +The GBiC program downloads the Genome Browser CGIs and sets up the central MySQL (MariaDB) database. All potentially destructive steps require confirmation by the user (unless the <code>-b</code> batch mode option is specified). 
</p> <p> -In particular, MySQL and Apache are installed and set up with the right package +In particular, MySQL (MariaDB) and Apache are installed and set up with the right package manager (yum or apt-get). A default random password is set for the -MySQL root user and added to the <code>~/.my.cnf</code> file of the Unix root account. -If you have already set up MySQL, you must create the +MySQL (MariaDB) root user and added to the <code>~/.my.cnf</code> file of the Unix root account. +If you have already set up MySQL (MariaDB), you must create the <code>~/.my.cnf</code> file. The program will detect this and create a template file for you. The program also performs some minor tasks such as placing symlinks, detecting MariaDB, deactivating SELinux, finding the correct path for your Apache install -and adapting the MySQL socket config. +and adapting the MySQL (MariaDB) socket config. </p> <p> This will result in a Genome Browser accessible on localhost that loads its data through genome-mysql.soe.ucsc.edu:3306 and hgdownload.soe.ucsc.edu:80. If your geographic location is not on the US West Coast, the performance will be too slow for normal -use, though sufficient to test that the setup is functional. A special MySQL server is +use, though sufficient to test that the setup is functional. A special MySQL (MariaDB) server is set up in Germany for users in Europe. You can change the <code>/usr/local/apache/cgi-bin/hg.conf</code> genome-mysql.soe.ucsc.edu lines to genome-euro-mysql.soe.ucsc.edu in order to get better performance. You can then use the program to download assemblies of interest to your local Genome Browser, which will result in performance at least as fast as the UCSC site. 
</p> <h3>Network requirements</h3> <p> Your network firewall must allow outgoing connections to the following servers and ports: <ul> - <li>MySQL connections, used to load tracks not local to your computer: + <li>MySQL (MariaDB) connections, used to load tracks not local to your computer: <ul> <li>US server: Port 3306 on genome-mysql.soe.ucsc.edu (128.114.119.174)</li> <li>European server: Port 3306 on genome-euro-mysql.soe.ucsc.edu (129.70.40.120)</li> </ul></li> <li> Rsync, used to download track data: <ul> <li>US server: TCP port 873 on hgdownload.soe.ucsc.edu (128.114.119.163)</li> <li>European server: TCP port 873 on hgdownload-euro.soe.ucsc.edu (129.70.40.99)</li> </ul></li> <li>Download HTML descriptions on the fly: <ul> <li>US server: TCP port 80 on hgdownload.soe.ucsc.edu (128.114.119.163)</li> <li>European server: TCP port 80 on hgdownload-euro.soe.ucsc.edu (129.70.40.99)</li> </ul></li> </ul></p> <a name='gbic-commands'></a> <h2>GBiC Commands</h2> <p> The first argument of the program is called <code>command</code> in the following section of this document. The first command that you will need is <code>install</code>, which installs the Genome Browser dependencies, -binary files and basic MySQL infrastructure: +binary files and basic MySQL (MariaDB) infrastructure: </p> <pre><code>sudo bash browserSetup.sh install</code></pre> <p> There are a number of options supported by the GBiC program. In all cases, options must be specified before the command. </p> <p> The following example correctly specifies the batch mode option to the program: </p> <pre><code>sudo bash browserSetup.sh -b install</code></pre> <p> To improve the performance of your Genome Browser, the program accepts the command <code>minimal</code>. It will download the minimal tables required for reasonable performance from places in the US and possibly others, e.g., from Japan. 
Call it like this to trade space for performance and download a few -of the most used MySQL tables for hg38: +of the most used MariaDB tables for hg38: </p> <pre><code>sudo bash browserSetup.sh minimal hg38</code></pre> <p> If the Genome Browser is still too slow, you will have to mirror all tables of a genome assembly. By default, rsync is used for the download. Alternatively you can use UDR, a UDP-based fast transfer protocol (option: <code>-u</code>). </p> <pre><code>sudo bash browserSetup.sh -u mirror hg38</code></pre> <p> A successful run of <code>mirror</code> will also cut the connection to UCSC: no tables or files are downloaded on-the-fly anymore from the UCSC servers. To change