96c23d22e15631da16fcf1069f0f30249910b533 max Thu Oct 21 05:45:43 2021 -0700 Updating mirrorManual and mirror.html / README.txt for Mysql support retirement, refs #28373 diff --git src/product/mirrorManual.txt src/product/mirrorManual.txt index 7581646..5f1e332 100644 --- src/product/mirrorManual.txt +++ src/product/mirrorManual.txt @@ -1,133 +1,145 @@ +[comment]: <> (QA: When you are done editing this file, cd into mirrorDocs, run 'make' there and follow the instructions) + % Manual installation of the UCSC Genome Browser on a Unix server # Overview of the Genome Browser directories and databases -The genome browser requires only Apache and MySQL and uses these directories: +The genome browser requires only Apache and MariaDB and uses these directories: - static html files: we typically keep them under /usr/local/apache/htdocs and configure Apache to load them from there, to avoid conflicts with the distribution of the Linux default location /var/www/html -- MySQL databases: most of them are read-only, except the `hgcentral` database - which is read-write. Most linux distributions keep these under /var/lib/mysql +- MariaDB databases: most of them are read-only, except the `hgcentral` database + which is read-write. Most linux distributions keep these under /var/lib/mysql. + (It is possible to get the genome browser to work with MySQL after version 8, + but we do highly discourage it, as our download procedures use MyISAM .frm + files which MySQL 8 dropped.) - static genome data files in /gbdb/ -- binary CGI programs that generate images from the MySQL and /gbdb files and +- binary CGI programs that generate images from the MariaDB and /gbdb files and write them into the `trash` directory (see below). 
We modify our Apache config to load CGIs from /usr/local/apache/cgi-bin, so as to not conflict with the default directory of the Linux distribution - a directory for temp files called `trash`, located in the parent directory of the CGI programs, usually /usr/local/apache/trash - a small text file hg.conf in the same directory as the CGI programs, with - information on how to connect to MySQL, the location of the other directories + information on how to connect to MariaDB, the location of the other directories and various other settings, on our machines the location of this file is /usr/local/apache/cgi-bin/hg.conf -- uploaded custom data gets written by the CGI programs into MySQL databases in +- uploaded custom data gets written by the CGI programs into MariaDB databases in the database `customtrash` and also into files under /usr/local/apache/trash which is symlinked from /usr/local/apache/htdocs/trash so these files are accessible to Apache. When a web browser requests a Genome Browser page, typically /cgi-bin/hgTracks, Apache executes this CGI program. The programs then read information about how to -connect to MySQL using the file hg.conf, connects to MySQL, reads the -installed genome assemblies and the current user session from the MySQL database -hgcentral. For each genome assembly, there is a separate MySQL database (e.g. +connect to MariaDB using the file hg.conf, connects to MariaDB, reads the +installed genome assemblies and the current user session from the MariaDB database +hgcentral. For each genome assembly, there is a separate MariaDB database (e.g. hg38). Some types of data (e.g. raw genome sequences) are kept as indexed -binary files outside of MySQL, they are located in /gbdb, e.g. /gbdb/hg38. The +binary files outside of MariaDB, they are located in /gbdb, e.g. /gbdb/hg38. The location of the /gbdb directory can be changed with a setting in hg.conf. 
Some -types of data are not specific for a genome, these are kept in the MySQL +types of data are not specific for a genome, these are kept in the MariaDB databases hgFixed, proteome and visiGene. We strongly recommend to follow the default locations, and to place our CGI programs in `/usr/local/apache/cgi-bin`. The htdocs root directory for html files should then be in /usr/local/apache/htdocs. All Genome Browser components called from Apache get their settings from the central configuration file `/usr/local/apache/cgi-bin/hg.conf`. Among others, the location and the -username/password for the MySQL server is specified there. +username/password for the MariaDB server is specified there. To load data into the genome browser databases, you need a command line tool like hgLoadBed. These tools are distributed separately from the CGI programs. -Some tools create only MySQL tables, others write into a /gbdb subdirectory. +Some tools create only MariaDB tables, others write into a /gbdb subdirectory. Most of them require a configuration file ~/.hg.conf in your home directory -with the MySQL connection information, like server name, username and password. +with the MariaDB connection information, like server name, username and password. The data loading is done from the Unix command line and not dependent on the CGI programs that create the Genome Browser graphics. 
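The ~/.hg.conf file described above can be sketched as follows. This is a minimal illustration, not a complete configuration: the host name `localhost` is an assumption, the user and password are the example "browser"/"genome" account used elsewhere in this manual, and `write_hg_conf` is a hypothetical helper name. The demo writes to a scratch file rather than your real ~/.hg.conf.

```shell
#!/bin/sh
# Sketch: write a minimal hg.conf-style file for the command line tools.
# Host, user and password are placeholders (assumptions), adjust for your server.
# The function body runs in a subshell so the umask change does not leak out.
write_hg_conf() (
    conf="$1"
    umask 077                      # create the file without group/other access
    cat > "$conf" <<'EOF'
db.host=localhost
db.user=browser
db.password=genome
central.db=hgcentral
EOF
    chmod 600 "$conf"              # the manual asks for mode 600 on this file
)

# Demo: write to a scratch location instead of the real ~/.hg.conf
demo_conf="${TMPDIR:-/tmp}/hg.conf.example.$$"
write_hg_conf "$demo_conf"
ls -l "$demo_conf"
```

Once the contents look right, the same function could be pointed at `$HOME/.hg.conf`; keeping the password file at mode 600 matters because it holds database credentials.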
# Software Requirements To run our provided binaries: * Linux/Ubuntu/CentOS/Unix/MacOSX operating system * Apache2.x - http web server - <http://httpd.apache.org/> -* MySQL development system and libraries - <http://dev.mysql.com/> +* MariaDB development system and libraries - <http://mariadb.com/> + (MySQL 8 removed support for MyISAM schema files, which makes downloading + our data file very cumbersome and slow) * libpng runtime and development packages - <http://www.libpng.org/> * libssl runtime and development packages - <http://www.openssl.org/> * Universally Unique Identifier library - <http://e2fsprogs.sourceforge.net/> If you want to make modifications to our software, you need to compile it: * gnu gcc - C code development system - <http://www.gnu.org/software/gcc/> * gnu make - <http://www.gnu.org/software/make/> Optional: * 'ghostscript' ps to pdf converter - <http://ghostscript.com> * 'git' source code management: <http://git-scm.com/downloads> * 'gmt' map plotting tools <http://www.soest.hawaii.edu/gmt/> * 'pstack' for stack traces * 'R' for the GTex track * 'python-mysqldb' for the gene interactions track (python2) It is best to install these packages with your standard operating system package management tools: -* Debian/Ubuntu: `apt-get install ghostscript apache2 mysql-server gmt r-base uuid-dev python-mysqldb` +* Debian/Ubuntu: `apt-get install ghostscript apache2 mariadb-server gmt r-base uuid-dev python-mysqldb` * Redhat/Fedora/CentOS: `yum install libpng12 httpd ghostscript GMT hdf5 R libuuid-devel MySQL-python` +On newer distributions, python-mysqldb / MySQL-python is not available anymore. +In this case, install python2, pip for it and then use pip to install the mysql +library ("pip2 install MySQL-python"). See the file +installer/browserSetup.sh for the commands. 
+ # Hardware and disk space requirements We currently use the following hardware to support our website: * 24 CPUs and 128Gb of memory for each of the six machines -* 16 CPUs, 64 Gb of memory for the mySQL server +* 16 CPUs, 64 Gb of memory for the MariaDB server The UCSC Genome Browser website experiences over one million hits per day. Your hardware requirements may be much less demanding and will depend upon how much traffic you expect for your mirror. Annotation database size differs a lot between the assemblies: The full size of the hg19 database in 2016 is 6 TB, for ce2 it is 5GB. It also depends on the tracks: The size of the hg19 annotations can be reduced to 2TB if you do not download any ENCODE tracks. The size of only the main gene and SNP annotations is around 5GB for hg19 and hg38. You can use the following command to get the size of the files for all of the assemblies, but it can also be modified to give the size for a particular assembly: rsync -hna --stats rsync://hgdownload.soe.ucsc.edu/gbdb/ | egrep "Number of files:|total size is" For example, to get the size of all of the files for hg19, you would use the following command: rsync -hna --stats rsync://hgdownload.soe.ucsc.edu/gbdb/hg19/ | egrep "Number of files:|total size is" After running that command, you should see output like this: Number of files: 54886 total size is 6515.70G speedup is 5181080.38 (DRY RUN) -The next command will give you the size of the entire mySQL database, +The next command will give you the size of the entire mySQL/MariaDB database, but can be changed to get the size for a particular assembly: rsync -hna --stats rsync://hgdownload.soe.ucsc.edu/mysql/ | egrep "Number of files:|total size is" # Installing the UCSC Genome browser Note: we offer Genome-Browser-in-a-Box (GBIB), a fully configured virtual machine image that can be converted for VirtualBox, VMWare, Hyper-V and other popular environments. 
We also offer Genome-Browser-in-the-Cloud (GBIC) an shell script that installs a genome browser in most main Linux distributions (Most Debian and Redhat-based ones, like Ubuntu and CentOS). See https://genome.ucsc.edu/goldenPath/help/mirror.html Scripts to perform all of the functions below can be found in the directory https://github.com/ucscGenomeBrowser/kent/tree/master/src/product/scripts. @@ -138,43 +150,43 @@ 1. Apache web server is installed and working, http://localhost/ provides the Apache default home page from your machine NOTE: The browser static html web pages require the Apache XBitHack option to be enabled to allow SSI <!--#include ... --> statements to function. Add 'Options +Includes' for your html directory, your httpd.conf file entry looks like: XBitHack on <Directory /usr/local/apache/htdocs> Options +Includes </Directory> You can test your Apache cgi-bin/ directory by copying the script src/product/scripts/printEnv.pl into it. -2. MySQL database is installed and working +2. MariaDB database is installed and working mysql -u browser -pgenome -e 'show tables;' mysql - MySQL can be run from the command line, and - the tables from the database mysql can be displayed. + MariaDB can be run from the command line, and + the tables from the database MariaDB can be displayed. - MySQL development package is installed (mysql-devel on RedHat) + MariaDB development package is installed (mariadb-devel on RedHat) The directory: /usr/include/mysql/ has the mysql .h files And the library: /usr/lib/mysql/libmysqlclient.a exists (your exact pathnames may vary depending upon your installation) - Set MySQL database access permissions. The examples mentioned + Set MySQL/MariaDB database access permissions. The examples mentioned in the "Mysql setup" section will allow this setup to function as described here. To setup the example user accounts as mentioned in these instructions, run the script: ex.MySQLUserPerms.sh 3. 
Find the location of your Apache WEB server DocumentRoot and cgi-bin directory. Typical locations are: /var/www and /usr/local/apache, /var/www/html, /var/www/cgi-bin The directory where these are located is referred to as WEBROOT in this documentation: WEBROOT=/var/www export WEBROOT @@ -267,46 +279,48 @@ for your installation. Use the comments in that file as your guide. Browser developers will want a copy of this file in their home directory with mode 600 and named: ~/.hg.conf These copies may have different db.user specification to allow developers write access to the database. 10. Load databases of interest. See also: src/product/scripts/activeDbList.sh src/product/scripts/minimal.db.list.txt src/product/scripts/loadDb.sh And discussion in scripts/README about whether you can use directly - the MySQL binary database files, or if you need to download goldenPath - database text dumps and load them into the database. + the MariaDB binary database files, or if you need to download goldenPath + database text dumps and load them into the database. If you use MariaDB, + you can use the binary files, with MySQL >= 8 you need to use dumps, + which is why we discourage the use of MySQL >= 8 with the Genome Browser. An alternative to loading the database tables from text files, - is to directly rsync the MySQL tables themselves and place them - in your MySQL /var/ directory. These tables are much larger + is to directly rsync the MariaDB tables themselves and place them + in your MariaDB /var/ directory. These tables are much larger than the text files due to the sizes of indexes created during a table load, but it can save a lot of time since the data loading step is quite compute intensive. A typical rsync command for an entire database (e.g. ce4) would be something like: rsync -avP --delete --max-delete=20 rsync://hgdownload.soe.ucsc.edu/mysql/ce4/ /var/lib/mysql/ce4/ 11. 
Download extra databases to work with a full genome assembly such as human/hg38: hgFixed go140213 proteins140122 sp140122 - Construct symlinks in your MySQL data directory to use database + Construct symlinks in your MariaDB data directory to use database names: go proteome uniProt for these database directories: $ ls -og proteome go uniProt lrwxrwxrwx 1 8 Feb 26 11:39 go -> go140213 lrwxrwxrwx 1 14 Mar 27 12:01 proteome -> proteins140122 lrwxrwxrwx 1 8 Mar 27 12:01 uniProt -> sp140122 $ ls -ld go140213 proteins140122 sp140122 drwx------ 2 mysql mysql 4096 Feb 26 10:57 go140213 drwx------ 2 mysql mysql 4096 Aug 19 08:08 proteins140122 drwx------ 2 mysql mysql 4096 Mar 26 13:01 sp140122 These file names are data stamped YYMMDD to indicate changes over time as they are updated with new builds of the UCSC gene track. When a new UCSC gene track is released, fetch new databases and @@ -345,112 +359,112 @@ 16. Apache configuration: To lock down your trash directory from scanning via "indexes" enter the following in your httpd.conf: <Directory "/var/www/html/trash"> Options MultiViews AllowOverride None Order allow,deny Allow from all </Directory> The specified directory name is your apache: DocumentRoot/trash e.g. /usr/local/apache/htdocs/trash -# MySQL Setup +# MariaDB Setup 1. Enable "LOAD DATA LOCAL INFILE": Set these in /etc/my.cnf or /etc/mysql/my.cnf: [mysqld] local-infile=1 [client] local-infile=1 -2. MySQL Storage Engine: +2. MariaDB Storage Engine: - In recent versions of MySQL, the default storage engine has changed from + In recent versions of MySQL/MariaDB, the default storage engine has changed from myisam to innodb. However the myisam engine should be used with the UCSC Genome Browser. Set it in /etc/my.cnf or /etc/mysql/my.cnf: [mysqld] default-storage-engine=MYISAM - Always restart your mysql server after making changes to these + Always restart your MariaDB server after making changes to these configuration files. 3. 
Users: There are three cases of identity to consider when providing - access to the MySQL system for the browser CGI binaries + access to the MariaDB system for the browser CGI binaries and browser developers: - 1. A MySQL user that needs read-only access to the + 1. A MariaDB user that needs read-only access to the genome databases. The browser CGI binaries require read-only access to the genome databases. - 2. A MySQL user that has write permissions to one database. + 2. A MariaDB user that has write permissions to one database. The CGI binaries require write permissions to one particular database (hgcentral) for maintaining user's cart information to store the user's browser cookie settings. - 3. A MySQL user that has general write permissions to all + 3. A MariaDB user that has general write permissions to all browser and genome databases to be used by developers - The cgi-bin binaries obtain the first two of these MySQL identities from + The cgi-bin binaries obtain the first two of these MariaDB identities from the text file: $WEBROOT/cgi-bin/hg.conf - Developers of the browser databases obtain their MySQL identities + Developers of the browser databases obtain their MariaDB identities from a text file in their home directory: ~/.hg.conf Note the initial dot in the name: .hg.conf This file in a user's directory will specify a higher-privileged user - to allow read/write access to the MySQL databases. + to allow read/write access to the MariaDB databases. This file must be set to mode 600 to provide security of the user and password to the database: $ chmod 600 ~/.hg.conf - All kent source code commands use this file to access the MySQL + All kent source code commands use this file to access the MariaDB databases. Since this file contains password information it requires the permissions to be set at 600 to keep it secret. The kent source code commands will enforce this access and not function unless it is set at 600 permissions. 
- Therefore you will want to create three different MySQL users + Therefore you will want to create three different MariaDB users for these purposes. The examples listed below are implemented in the shell script: src/product/scripts/ex.MySQLUserPerms.sh You can execute that script to set up these example users. An example full read/write access user: "browser", is created with the following procedure. For the following it is assumed that your root account - has access to the mysql database. You should be able + has access to the MariaDB database. You should be able to perform the following: $ export SQL_PASSWORD=mysql_root_password $ mysql -u root -p${SQL_PASSWORD} -e "show tables;" mysql - Create a MySQL user called "browser" with password - "genome" and give access to selected MySQL commands + Create a MariaDB user called "browser" with password + "genome" and give access to selected MariaDB commands for the following list of databases. When you add other databases, you will need to add these permissions to your databases. This procedure of adding permissions specifically for a set list of databases is a more secure method than allowing - the MySQL "browser" user to have access to any database. + the MariaDB "browser" user to have access to any database. ( MySQL version 5.5 requires the LOCK TABLES permission here ) ( FILE, CREATE, DROP, ALTER, LOCK TABLES, CREATE TEMPORARY TABLES on ${DB}.* ) for DB in cb1 hgcentral hgFixed hg38 proteins140122 sp140122 go140213 uniProt go proteome do mysql -u root -p${SQL_PASSWORD} -e "GRANT SELECT, INSERT, UPDATE, DELETE, \ FILE, CREATE, DROP, ALTER, CREATE TEMPORARY TABLES on ${DB}.* \ TO browser@localhost \ IDENTIFIED BY 'genome';" mysql done The above granted permissions are recommended for browser developers. The WEB browser CGI binaries need SELECT, INSERT and CREATE TEMPORARY TABLES permissions. 
For example, you should create a special user for @@ -462,42 +476,42 @@ do mysql -u root -p${SQL_PASSWORD} -e "GRANT SELECT \ on ${DB}.* TO \ readonly@localhost IDENTIFIED BY 'access';" mysql done Create a database to hold temporary tables: mysql -u root -p${SQL_PASSWORD} -e "create database hgTemp" mysql -u root -p${SQL_PASSWORD} -e "GRANT SELECT, INSERT, \ CREATE TEMPORARY TABLES \ on hgTemp.* TO \ readonly@localhost IDENTIFIED BY 'access';" mysql - A third MySQL user should be created with read-write access to only + A third MariaDB user should be created with read-write access to only the hgcentral database. For example, a user: "readwrite" with password: "update" for DB in hgcentral do mysql -u root -p${SQL_PASSWORD} -e "GRANT SELECT, INSERT, UPDATE, DELETE, \ CREATE, DROP, ALTER on ${DB}.* TO readwrite@localhost \ IDENTIFIED BY 'update';" mysql done - The cgi-bin binaries obtain their MySQL identities from + The cgi-bin binaries obtain their MariaDB identities from the hg.conf file in the cgi-bin directory. The file in this directory: src/product/ex.hg.conf demonstrates the use of the "readonly" user for genome database access and the "readwrite" user for hgcentral database access. 4. The hgsql command: Developers can access the browser databases via the 'hgsql' command which can be built in the source-tree at: kent/src/hg/hgsql/ This 'hgsql' command provides a convenient front-end to the standard 'mysql' command by reading the user's ~/.hg.conf file to provide access to the browser databases with the appropriate identity. 
Each user creates a ~/.hg.conf file (same format as the above mentioned cgi-bin/hg.conf file) @@ -512,85 +526,85 @@ file with the change of db.user, db.password, central.user, and central.password to be the fully permitted read-write user: db.user=browser db.password=genome central.user=browser central.password=genome central.db=hgcentral To test this access with your ~/.hg.conf file in place: hgsql -e "show tables;" hgcentral hgsql -e "show grants;" hgcentral -5. Configuring MySQL SSL connections (entirely optional, only needed if your IT department requires it): +5. Configuring MariaDB SSL connections (entirely optional, only needed if your IT department requires it): - MySQL is typically compiled with SSL capability from OpenSSL or yaSSL. - To see if your server supports ssl, login to mysql and run this command: + MariaDB is typically compiled with SSL capability from OpenSSL or yaSSL. + To see if your server supports ssl, login to MariaDB and run this command: mysql> show variables like '%ssl%'; +---------------+----------+ | Variable_name | Value | +---------------+----------+ | have_openssl | DISABLED | | have_ssl | DISABLED | | ssl_ca | | | ssl_capath | | | ssl_cert | | | ssl_cipher | | | ssl_crl | | | ssl_crlpath | | | ssl_key | | +---------------+----------+ - If your mysql was compiled with SSL support, which is true of virtually all mysql packages + If your MariaDB was compiled with SSL support, which is true of virtually all MariaDB packages being provided today, you can easily enable SSL by adding settings to /etc/my.cnf: ------- my.cnf: ------- [mysqld] ssl ssl-key=/somepath/server-key.pem ssl-cert=/somepath/server-cert.pem ssl-ca=/somepath/ca.pem ssl-capath=/somepath/ ssl-cipher=DHE-RSA-AES256-SHA:AES128-SHA # mysql 5.6.3 or later ssl-crl=/someCrlPath/some-crl.pem ssl-crlpath=/someCrlPath/ # mysql5.7 or later require all connections to be encrypted require_secure_transport server - After making changes to my.cnf, be sure to restart your mysqld 
service. + After making changes to my.cnf, be sure to restart your mariadb service. The key means private key here, and should be kept secured. The cert is a certificate acting like a public key, signed by a trusted authority (CA). If a key and cert are available, that means you can authorize. And it proves the key exists. The key is not sent to the other party. The cert is. If a ca is available it can show what certs to trust. - You do not need all the settings, but some versions of mysql + You do not need all the settings, but some versions of MariaDB do not activate SSL unless at least one of these is found: key, cert, ca, capath, cipher If you configure a key for the server or client, you will also provide its cert. We cannot teach you how to create SSL certificates here. - There are many websites including mysql that have information about + There are many websites including MariaDB that have information about making keys and certs and ca. If you just add the ssl option to the top, it will try to use SSL, or make it available. The ca is the certificate authority cert that you are using. It could be just a local self-signed authority you made up, or it can be a commercial authority like veriSign. This typically is used to sign the certificate for the server and users. The capath is a directory where ca-certs exist (OpenSSL only). The crl is a certificate revocation list. (OpenSSL only). The crlpath is a directory where revocation lists exist (OpenSSL only). This crl options are a new feature in 5.6.3, not sure it works right yet. @@ -638,54 +652,54 @@ The key and certificate for "someuser" above are signed by a ca. The verifyServerCert setting if it exists tells the client to verify that the CN field in the server's cert matches the hostname to which it is connecting. This prevents Man-In-the-Middle attacks. The caPath and crlPath options only work with OpenSSL. The example shows the most common use for the profile "db". 
But the SSL settings work with any profile in the hg.conf file. Of course you can stick SSL settings into your [client] section of my.cnf, but the CGIs and utils would not see them. Only mysql and hgsql would see them. - Configuring SSL requirements for mysql user accounts: + Configuring SSL requirements for MariaDB user accounts: - You can tell mysql to require SSL for a user's account like this: + You can tell MariaDB to require SSL for a user's account like this: GRANT ALL PRIVILEGES ON *.* TO 'someuser'@'%' REQUIRE SSL; - You can tell mysql to use SSL for a user's account and to + You can tell MariaDB to use SSL for a user's account and to further require the client to use their key and x509 certificate to connect by saying: GRANT ALL PRIVILEGES ON *.* TO 'someuser'@'%' REQUIRE x509; There are more-specific requirements that may be added: GRANT ALL PRIVILEGES ON *.* TO 'someuser'@'%' REQUIRE SUBJECT '/C=US/ST=CA/L=Santa Cruz/O=YourCompany/OU=YourDivision/CN=someuser/emailAddress=someuser@YourCompany.com' AND ISSUER '/C=US/ST=CA/L=Santa Cruz/O=YourCompany/OU=YourDivision/CN=YourCompanyCA/emailAddress=admin@YourCompany.com' AND CIPHER 'DHE-RSA-AES256-SHA'; You can see the cert details like this: openssl x509 -in /somepath/someuser-cert.pem -text - In later versions of MySQL, it is a requirement that the CN of the CA cert must DIFFER + In later versions of MariaDB, it is a requirement that the CN of the CA cert must DIFFER from the CN of the user and server certs. Further MySQL SSL documentation is available from <https://dev.mysql.com/doc/refman/5.6/en/creating-ssl-files-using-openssl.html> # Local Git repository (aka: "the source tree") Use the following procedures to create your own personal copy of the kent source tree where you can have your own edits that are not part of the development at UCSC. This is useful for mirror sites that have their own customizations in the source tree for local circumstances. 
It will also be necessary if you want to add your own tracks to your mirror (see next section). Install Git software version 1.6.2.2 or later. See the Git Community Handbook installation (<https://git-scm.com/book/en/v2/Getting-Started-Installing-Git>) and setup @@ -720,40 +734,42 @@ Updates: UCSC generally updates the origin/beta branch every three weeks. If you are updating database tables for a mirror site, we recommend that you update the source at the same time, as source code is sometimes modified to include operations on new columns that have been added to database tables. For instructions on keeping local tracks separate from UCSC Genome Browser tracks created at UCSC and mirrored from there, see the section "Adding tracks to the browser" below. # Adding your own track groups to the browser If you want to add your own tracks (see next section), you probably want to put them into a separate track group, so they are visually separated from the tracks provided by UCSC. -The MySQL table `grp` contains the list of all track groups. If you rsync the data +The MariaDB table `grp` contains the list of all track groups. If you rsync the data from UCSC on a regular schedule, the table would be overwritten each time. To avoid this, you can create an empty table with the same schema, e.g. in the database hg38: CREATE TABLE grp_local LIKE grp; -You can then use the MySQL INSERT statement to add a new track group to this +You can then use the MariaDB INSERT statement to add a new track group to this table, specify the name, label, priority and whether the group should be closed by default (most are open by default). + INSERT INTO grp_local VALUES ('test', 'This is my group', 1, 0); + Then, edit cgi-bin/hg.conf and add a line like this: db.grp=grp_local,grp This means that grp_local is added to the contents of grp and grp_local has higher priority, so you can override the UCSC-provided default groups, if needed. This will not have any effect yet. 
First you need to add a new track that uses your new group. You can use your new group's `name` using the "group" statement in trackDb (see the next section). All tracks with a group not in the grp table will end up in the group "Experimental" at the bottom of the page. # Adding your own tracks to the browser A track needs two items to make it exist in the browser: @@ -935,36 +951,36 @@ rsync -avP rsync://hgdownload.soe.ucsc.edu/genome/admin/hgcentral.sql . And then six tables for the latest human database. The gateway page always needs a minimum human database in order to function even if the browser is being built for the primary purpose of displaying other genomes. This default can currently be changed in the source tree in src/hg/lib/hdb.c (to be done: specify this default in hg.conf file) Start with an empty database, for example hg18: hgsql -e "create database hg18;" mysql -Again, copy the MySQL files directly from the download +Again, copy the MariaDB files directly from the download server, for example hg18: rsync -avP rsync://hgdownload.soe.ucsc.edu/mysql/hg18/ . -(beware, this is several TB of data) into your MySQL data area. Or load these tables from the text SQL +(beware, this is several TB of data) into your MariaDB data area. Or load these tables from the text SQL dumps from: rsync -avP rsync://hgdownload.soe.ucsc.edu/goldenPath/hg18/database/ . (beware, this is several TB of data) The minimal set of tables required are: grp trackDb hgFindSpec chromInfo gold gap @@ -995,110 +1011,110 @@ can integrate it into the main code base and you do not have to worry about updating them anymore. 
Once you have git setup properly, merging your changes into our current release should be as easy as this: git pull # get new version git checkout beta # switch to our stable branch git merge myChangesBranch # merge your changes into the beta branch make -j 20 cgi-alpha # compile and put CGIs into /usr/local/apache/cgi-bin # Custom Track Database Without any specific hg.conf configuration, custom track data is kept in flat files in the /trash/ct/ directory. -It is much more efficient to load them into a MySQL database. +It is much more efficient to load them into a MariaDB database. This article discusses the steps required to enable this function. 1. Summary configuration * database loader binaries hgLoadBed, hgLoadWiggle and wigEncode are installed in /cgi-bin/loader/ - these are installed via the normal 'make cgi' in the source tree kent/src/hg/ directory or via rsync. They are probably aleady in your cgi-bin directory. - * an empty customTrash database has been created on the MySQL host - - create this manually once, the MySQL host name is a configuration + * an empty customTrash database has been created on the MariaDB host - + create this manually once, the MariaDB host name is a configuration item, the database name customTrash is not a configuration item * temporary read-write data directory /data/tmp has been created with read/write/delete enabled for the Apache server effective user, this directory name is a configuration item * configuration items are specified in /cgi-bin/hg.conf/ - this will turn on the function * for command line access to the database, create a special ~/.hg.ct.conf to be used with the environment variable HGDB_CONF * create a cron job to run a cleaner script to expire and remove older tables from the database - dbTrash command is used for this purpose 2. 
Host and database name - For performance and security considerations, the MySQL host for the - custom track database can be a separate machine from the ordinary MySQL + For performance and security considerations, the MariaDB host for the + custom track database can be a separate machine from the ordinary MariaDB host that usually serves up the assembly databases or the hgcentral database. It is not required that the custom track database be on a - separate MySQL server. The specification of the host machine is placed + separate MariaDB server. The specification of the host machine is placed in the /cgi-bin/hg.conf file, for example a host machine called "ctdbhost": customTracks.host=ctdbHost The database name used on this host is fixed at customTrash which is a define in the source tree file hg/inc/customTrack.h Edit /cgi-bin/hg.conf configuration items: The following items must be specified in /cgi-bin/hg.conf to enable this function: customTracks.host=ctdbhost customTracks.user=ctdbuser customTracks.password=ctdbpasswd customTracks.useAll=yes - Establish this user account and password in MySQL with db and user + Establish this user account and password in MariaDB with db and user privileges: Select, Insert, Update, Delete, Create, Drop, Alter, Index - for example with your MySQL root user account: + for example with your MariaDB root user account: hgsql -hctdbhost -uroot -p -e \ "GRANT SELECT,INSERT,UPDATE,DELETE,CREATE,DROP,ALTER,INDEX" \ on customTrash.* TO ctdbuser@yourWebHost IDENTIFIED by 'ctdbpasswd';" mysql Optionally, a temporary read-write directory used during database loading can be specified: customTracks.tmpdir=/data/tmp The default for this is /data/tmp and should be created with read/write/delete access for the Apache server effective user. It should be on a local filesystem for best access speed, not via NFS. 3. 
Database loaders: The database loaders used to load custom tracks are the standard loader commands found in the source tree, hgLoadBed, hgLoadWiggle and wigEncode. They are installed into /cgi-bin/loader/ with a 'make cgi' from the source tree directory kent/src/hg/ These loaders are used by the cgi binaries hgCustom, hgTracks, and hgTables to load custom tracks into the database. They are operated in an exec'd pipeline fashion, the code details can be seen in src/hg/lib/customFactory.c 4. Command line access: - Since the MySQL host may be different than your ordinary MySQL host, you + Since the MariaDB host may be different than your ordinary MariaDB host, you will need to create a unique $HOME/.ct.hg.conf file to be used in the case where you want to manipulate this separate database with the kent source tree command line tools. This unique .ct.hg.conf is merely a copy of your normal .hg.conf file but with a different host/username/password specified: db.host=ctdbhost db.user=ctdbuser db.password=ctdbpasswd central.db=hgcentral Remember to set the privileges on this hg.conf file at 600: chmod 600 $HOME/.ct.hg.conf @@ -1204,58 +1220,58 @@ It remains to be seen just how good the error reporting system is for illegal data. # Debugging the CGI binaries The typical sign of trouble is an Error 500 display in your web browser when accessing the CGI binaries, and the following message in your Apache error log: [Fri Mar 25 11:02:40 2005] [error] Premature end of script headers: hgTracks This is usually a simple configuration problem. Items to verify: 1. the hg.conf file in the cgi-bin directory specifies the correct - user names and passwords for MySQL database access. - See also the section "Mysql Setup" below. + user names and passwords for MariaDB database access. + See also the section "MariaDB Setup" below. 2.
The cgi-bin directory is set to permissions 755 and not 775 or 777 When permissions are too permissive for this directory, Apache errors out with suexec permission violations. 3. Verify change history of the database hgcentral. Rarely, changes in this database require corresponding changes in the source code. Make sure your code and version of hgcentral are synchronized. Newer versions of hgcentral database with old source code are OK. The problem is when you have new source code that expects new features in hgcentral. If these items are OK, then you can check the actual operation of a cgi binary. Go to the source tree directory of the cgi binary, for example hgTracks: kent/src/hg/hgTracks In this directory, run a 'make compile' to produce a binary that is left in this directory. This binary can be run from the command line: ./hgTracks By itself with no arguments, it should produce the default tracks display HTML page for the Human genome. This assumes you have set -up your $HOME/.hg.conf file to allow access to the MySQL databases. -(See also: section "Mysql Setup"). A binary execution failure should +up your $HOME/.hg.conf file to allow access to the MariaDB databases. +(See also: section "MariaDB Setup"). A binary execution failure should be obvious at this stage of the game. If it exits because of SIGSEGV we can run it under a debugger for specifics. More on this below. If the problem is specific to a particular set of tracks being displayed, or particular genomes or options, command line arguments can be given to these CGI binaries to provide the URL inputs that a CGI binary would normally see. To prepare the binaries for operation under a debugger, go to the src/inc directory and edit the common.mk configuration file. Change "COPT=-O" to read: "COPT=-g" GNU gcc will allow "-O" with "-g", and some bugs will only exhibit themselves with -O on. 
However the optimizations with -O can sometimes confuse the debugger's sense of location due to optimization rearrangement of code. @@ -1541,34 +1557,34 @@ CentOS Linux on Intel or AMD Opteron environment. Other users do report successful operation on other systems. Depending upon what libraries you have installed locally, you may want to set environment variable USE_SAMTABIX. See also: http://genomewiki.ucsc.edu/index.php/Build_Environment_Variables 2. Create a directory where binaries will be moved to during the build of the source tree: $ mkdir -p ~/bin/${MACHTYPE} 3a. ALTERNATE PATH: If you are going to do a full build anyway, skip this and proceed straight to step 3 below. - Otherwise, to make a minimal utility build without mysql: + Otherwise, to make a minimal utility build without MariaDB: There are some utilities that depend only on jkweb.a and not jkhgap.a - which means they can be compiled without needing mysql client installed. - To make a utility like pslCDnaFilter without installing mysql client: + which means they can be compiled without needing MariaDB client installed. + To make a utility like pslCDnaFilter without installing MariaDB client: # create jkweb.a cd kent/src/lib make # create stringify utility required by some makefiles cd kent/src/utils/stringify make # create pslCDnaFilter utility program cd kent/src/hg/pslCDnaFilter make Proceed similarly for any other such utilities. You are done and can stop here. 3. Create the following shell environment variables: NOTE: As of mid-October 2013 the makefile build system in the source @@ -1576,35 +1592,35 @@ the mysql_config command. To use that automatic configuration, eliminate any MYSQLINC and MYSQLLIBS definitions from your shell environment and make sure mysql_config can be found in your PATH. This option is not perfect and may not function correctly. It usually will not use a static linked library which can lead to a known issue on the Mac OSX. See 'Known Problems' item 10 below. 
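One way to see what the automatic configuration would pick up is to query mysql_config directly. The following is a minimal sketch, assuming a POSIX shell; the variable name config_status is illustrative only and is not part of the build system.

```shell
# Sketch: show the flags mysql_config reports, if it is installed.
# Clear any manual overrides first, as the automatic configuration expects.
unset MYSQLINC MYSQLLIBS

if command -v mysql_config >/dev/null 2>&1; then
    # --include and --libs print the compiler and linker flags
    echo "include flags: $(mysql_config --include)"
    echo "library flags: $(mysql_config --libs)"
    config_status=found      # hypothetical name, for illustration
else
    echo "mysql_config not found in PATH; set MYSQLINC and MYSQLLIBS manually"
    config_status=missing
fi
```

If the reported library flags point only at a shared library, the static-library override may still be the safer choice on systems affected by the dynamic linking issue.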
In the case the automatic configuration does not function, you can set these variables in your shell environment to override the automatic configuration: MYSQLINC=/usr/include/mysql MYSQLLIBS="/usr/lib/mysql/libmysqlclient.a -lz" export MYSQLINC MYSQLLIBS Your setting may vary depending upon where your - mysql is installed. The above example is typical. + MariaDB is installed. The above example is typical. If your system does not have this set of include files or this static client.a file, you may need to install - the mysql-devel package to obtain the mysql development - environment. (http://dev.mysql.com/downloads/) + the mariadb-devel package to obtain the MariaDB development + environment. With that package installed, this command: mysql_config --libs will display the appropriate libraries to link with for your system configuration. And: mysql_config --include will display the appropriate MYSQLINC directory. The -lz requires a link to the libraries installed in the zlib-devel rpm. 4a. Required SSL support: In order for the libraries to be able to use SSL, for instance to support fetching HTTPS URLs, install openssl. We are currently using these packages: openssl-1.0.1e openssl-devel-1.0.1e @@ -1636,31 +1652,31 @@ You can save yourself time and trouble if your Apache is somewhere other than at /usr/local/apache by creating that directory and making symlinks to your actual apache directories. For example: mkdir /usr/local/apache ln -s /var/www/cgi-bin /usr/local/apache/cgi-bin ln -s /var/www/html /usr/local/apache/htdocs ln -s /var/www/cgi-bin-${LOGNAME} /usr/local/apache/cgi-bin-${LOGNAME} With those symlinks in place, a simple 'make cgi' can be used instead of the 'make compile; make install DESTDIR=...' business. If your apache DocumentRoot is something different than /usr/local/apache/htdocs then use the DOCUMENTROOT variable.
This value should be a full path and should agree with the - browser.documentRoot setting in hg.conf; see the section "Mysql Setup" + browser.documentRoot setting in hg.conf; see the section "MariaDB Setup" for more details. $ make install DESTDIR=/destination/prefix CGI_BIN=/cgi-bin/path DOCUMENTROOT=/usr/local/apache/htdocs to install binaries in "/destination/prefix/cgi-bin/path" [NOTE: These 'make' commands assume gnu make is being used] 6. After source tree updates, the make sequence is: $ cd kent/src $ make clean $ make libs $ cd hg $ make compile $ make install DESTDIR=/destination/prefix @@ -1678,31 +1694,31 @@ 2. The build fails immediately in the first src/lib/ directory with the compiler issuing a warning about unused variables. Some newer versions of gcc issue these warnings and the src/inc/common.mk file specifies -Werror which causes it to exit on these warnings. Either remove the -Werror specifications in src/inc/common.mk or add the -Wno-unused-variable to instruct the compiler to allow these warnings without an exit. 3. The build fails during a link of any program under the src/hg/ hierarchy with an error something like: undefined reference to `SSL_CTX_free' undefined reference to `ERR_get_error_line_data' undefined reference to `SSL_read' undefined reference to `SSL_get_error' undefined reference to `SSL_write' - This error is due to your mysql libraries have been compiled with SSL + This error is due to your MariaDB libraries have been compiled with SSL functionality enabled. To fix this build problem, add '-lssl' to your MYSQLLIBS environment variable to satisfy these SSL library functions. 4. Build fails on Macintosh with an error: aliType.c:5: warning: `rcsid' defined but not used make: *** [aliType.o] Error 1 The OSTYPE environment variable needs to be set to "darwin". If your shell is bash, it is a shell local variable instead of an environment variable as with tcsh.
To avoid this error, place an export statement in your $HOME/.bashrc environment: export OSTYPE @@ -1748,31 +1764,31 @@ export PNGLIB=/opt/local/lib/libpng.a or equivalent: export PNGLIB='-L/opt/local/lib -lpng' 9. Build fails with complaints about missing functions dlclose, dlopen: /usr/lib/x86_64-linux-gnu/libmysqlclient.a(client_plugin.c.o): In function `add_plugin': (.text+0x1ed): undefined reference to `dlclose' /usr/lib/x86_64-linux-gnu/libmysqlclient.a(client_plugin.c.o): In function `mysql_client_plugin_deinit': (.text+0x28b): undefined reference to `dlclose' /usr/lib/x86_64-linux-gnu/libmysqlclient.a(client_plugin.c.o): In function `mysql_load_plugin_v': (.text+0x51e): undefined reference to `dlopen' To add the library that includes these functions, add '-ldl' to the MYSQLLIBS string: MYSQLLIBS="/usr/lib/mysql/libmysqlclient.a -ldl -lz" -10. Mac OSX dynamic link with the MySQL libraries results in the following +10. Mac OSX dynamic link with the MySQL/MariaDB libraries results in the following run-time error for CGI binaries or command line programs: dyld: Library not loaded: libmysqlclient.18.dylib This is a known problem with the dynamic library for MySQL, with potential work-arounds offered: http://bugs.mysql.com/bug.php?id=61243 Or, you can set the MYSQLINC and MYSQLLIBS shell environment variables to the static MySQL library as mentioned in Step 3 above. 11. Genome browser issues error: "PDF format not available" when trying to export to pdf.