src/product/README.install 1.13
1.13 2010/05/05 17:41:31 galt
adding visiGene
Index: src/product/README.install
===================================================================
RCS file: /projects/compbio/cvsroot/kent/src/product/README.install,v
retrieving revision 1.12
retrieving revision 1.13
diff -b -B -U 1000000 -r1.12 -r1.13
--- src/product/README.install 12 Nov 2009 19:36:21 -0000 1.12
+++ src/product/README.install 5 May 2010 17:41:31 -0000 1.13
@@ -1,255 +1,255 @@
Installing the UCSC Genome browser
The instructions here are similar to the Mirror site instructions at:
http://genome.ucsc.edu/admin/mirror.html
with the exception of downloading the databases.
The example setup here will install a single-organism database
to get the browser started. Additional organism databases can
be added after the browser is installed and functioning.
The example mysql user here is: "browser" with a password of: "genome"
1. Confirm the following:
a. Apache WEB server is installed and working
http://localhost/
Provides the Apache default home page from your machine
NOTE: As of early March 2005, the browser static html
WEB pages require the Apache XBitHack option to be enabled
to allow <!--#include ... --> statements to function.
b. MySQL database is installed and working
$ mysql -u browser -pgenome -e 'show tables;' mysql
MySQL can be run from the command line, and
the tables from the database mysql can be displayed.
c. MySQL development package is installed (mysql-devel)
The directory: /usr/include/mysql/ has the mysql .h files
And the library: /usr/lib/mysql/libmysqlclient.a exists
(your exact pathnames may vary depending upon your installation)
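The file checks in item c can be scripted. The following is only a
sketch: check_prereqs is a hypothetical helper name, and the default
paths are the ones quoted above, which may differ on your system.

```shell
# Sketch: report whether the MySQL development files from item 1c exist.
# check_prereqs is a hypothetical helper; pass your own paths if the
# defaults below do not match your installation.
check_prereqs() {
    inc_dir=${1:-/usr/include/mysql}
    lib_file=${2:-/usr/lib/mysql/libmysqlclient.a}
    status=0
    if [ -f "${inc_dir}/mysql.h" ]; then
        echo "ok: ${inc_dir}/mysql.h"
    else
        echo "missing: ${inc_dir}/mysql.h"
        status=1
    fi
    if [ -f "${lib_file}" ]; then
        echo "ok: ${lib_file}"
    else
        echo "missing: ${lib_file}"
        status=1
    fi
    # Non-zero return means at least one file was not found.
    return ${status}
}
```

Calling check_prereqs with no arguments tests the default locations;
a non-zero exit status means at least one file was not found.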
2. Mount the CD-ROM if it isn't already mounted:
mount /mnt/cdrom
Set CDROM to its path for commands to follow
CDROM=/mnt/cdrom
export CDROM
3. Set MySQL database access permissions. The examples mentioned
in the README.mysql.setup.txt instructions will allow this
setup to function as described here.
To set up the example user accounts as mentioned in these
instructions, run the script:
${CDROM}/example/MySQLUserPerms.sh
4. Find the location of your Apache WEB server DocumentRoot
and cgi-bin directory.
Typical locations are: /var/www and /usr/local/apache
/var/www/html
/var/www/cgi-bin
The directory where these are located is referred to as
WEBROOT in this documentation:
WEBROOT=/var/www
export WEBROOT
The browser WEB pages and cgi-bin binaries expect these
two directories to be next to each other in ${WEBROOT}
since referrals in html are often: "../cgi-bin"
The browser should function even if WEBROOT is in a different
directory from the primary Apache web root. In this case,
the three directories: html cgi-bin and trash should be
at the same level in this other WEBROOT. For example:
/some/other/directory/path/html/
/some/other/directory/path/cgi-bin/
/some/other/directory/path/trash/
Symlinks to the trash directory should exist in both the html
and cgi-bin directories, like so:
/some/other/directory/path/html/trash -> ../trash
/some/other/directory/path/cgi-bin/trash -> ../trash
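The layout above could be built with a small function such as the
one below. This is a sketch, not part of the distribution:
make_browser_dirs is a hypothetical name, and the permissions are
the ones used in step 5.

```shell
# Sketch: create html, cgi-bin and trash side by side under one root,
# with relative trash symlinks in html and cgi-bin.
# make_browser_dirs is a hypothetical helper name.
make_browser_dirs() {
    root=$1
    mkdir -p "${root}/html" "${root}/cgi-bin" "${root}/trash" || return 1
    chmod 755 "${root}/cgi-bin"
    chmod 777 "${root}/trash"
    for d in html cgi-bin; do
        # Relative link target, matching the "-> ../trash" form above.
        # Skipped when the link already exists, so re-running is safe.
        [ -e "${root}/${d}/trash" ] || ln -s ../trash "${root}/${d}/trash"
    done
}
```

For example: make_browser_dirs /some/other/directory/path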
5. Create html, cgi-bin and trash directories:
mkdir ${WEBROOT}/html
mkdir ${WEBROOT}/cgi-bin
chmod 755 ${WEBROOT}/cgi-bin
(this chmod 755 will prevent suexec failures that are indicated
by "Premature end of script headers" errors in the Apache
error_log. Your cgi binaries should also be 755 permissions.)
mkdir ${WEBROOT}/trash
chmod 777 ${WEBROOT}/trash
ln -s ${WEBROOT}/trash ${WEBROOT}/html/trash
A cron job should be set to periodically clean the files in trash.
For example, the following two find commands are used at UCSC
to implement two different aging schemes, 8 hours for most files
in trash, 48 hours for custom track files:
find /trash \! \( -regex "/trash/ct/.*" -or -regex "/trash/hgSs/.*" \) \
-type f -amin +480 -exec rm -f {} \;
find /trash \( -regex "/trash/ct/.*" -or -regex "/trash/hgSs/.*" \) \
-type f -amin +2880 -exec rm -f {} \;
The browser creates .gif files in the trash directory.
The 'chmod 777' allows the Apache WEB server to write into
that directory.
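For a trash directory that is not at /trash, the two aging rules
can be wrapped in one function that a cron job calls. This is a
sketch under assumptions: clean_trash is a hypothetical name, and
-path is used here in place of -regex on the assumption that it
selects the same ct/ and hgSs/ files.

```shell
# Sketch: apply the two aging schemes to an arbitrary trash directory.
# clean_trash is a hypothetical helper; it defaults to /trash.
clean_trash() {
    trash=${1:-/trash}
    # Most files: remove when last accessed more than 8 hours ago.
    find "${trash}" \! \( -path "${trash}/ct/*" -o -path "${trash}/hgSs/*" \) \
        -type f -amin +480 -exec rm -f {} \;
    # Custom track files: remove when last accessed more than 48 hours ago.
    find "${trash}" \( -path "${trash}/ct/*" -o -path "${trash}/hgSs/*" \) \
        -type f -amin +2880 -exec rm -f {} \;
}
```

The function would be called as, for example: clean_trash ${WEBROOT}/trash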
6. Copy static WEB page content:
cp -Rp ${CDROM}/webrootHTML/* ${WEBROOT}/html
7. Copy CGI binaries:
cp -p ${CDROM}/built_CGI-BIN/* ${WEBROOT}/cgi-bin
These binaries were built on Red Hat Linux release 7.3
with gcc version 2.96.
If they do not function on your Red Hat system you may need
to rebuild them from the source tree.
8. Create hgcentral database and tables. This is the primary gateway
database that allows the browser to find specific organism
databases.
mysql -u browser -pgenome -e "create database hgcentral;"
mysql -u browser -pgenome hgcentral < ${CDROM}/example/hgcentral.sql
Alternative hgcentral databases can also be created, for example
for test purposes. In this case use a unique name for the
database, such as "hgcentraltest", and specify that name in the
hg.conf file as described in the next step. To create a second
copy of the hgcentral database:
mysql -u browser -pgenome -e "create database hgcentraltest;"
mysql -u browser -pgenome hgcentraltest < ${CDROM}/example/hgcentral.sql
9. Create the hg.conf file in ${WEBROOT}/cgi-bin/hg.conf
to allow the CGI binaries to find the hgcentral database
(or specifically named hgcentraltest for example)
Copy the sample hg.conf:
cp ${CDROM}/example/hg.conf ${WEBROOT}/cgi-bin
Adjust the db.user, db.password, central.user and
central.password entries.
The central.db entry specifies the hgcentral database.
Adjust this name if you are using a different name
for hgcentral. Set the central.domain to your machine's
network domain name. This will allow the browser
cookie-cart function to work.
Browser developers will want a copy of this file in
their home directory with mode 600 and named:
~/.hg.conf
These copies may have a different db.user specification
to allow developers write access to the database.
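After editing, it can be useful to confirm that none of the entries
named in this step were dropped. The function below is a sketch:
missing_hg_conf_keys is a hypothetical helper, and it assumes each
setting is written as name=value on its own line with no spaces
around the "=".

```shell
# Sketch: print each setting named in step 9 that is absent from a file.
# missing_hg_conf_keys is a hypothetical helper; it assumes one
# "name=value" setting per line.
missing_hg_conf_keys() {
    conf=$1
    for key in db.user db.password central.db central.user \
               central.password central.domain; do
        # awk compares the text before the first "=" exactly, so the
        # dots in the key are not treated as wildcards.
        awk -F= -v k="${key}" '$1 == k { found = 1 } END { exit !found }' \
            "${conf}" || echo "${key}"
    done
}
```

For example: missing_hg_conf_keys ${WEBROOT}/cgi-bin/hg.conf
No output means all six entries are present.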
10. Copy the sample C. briggsae organism database and generic
hgFixed database text files:
cp -Rp ${CDROM}/goldenPath ${WEBROOT}/html
11. Copy the gbdb data to /gbdb
The directory /gbdb is a location where the browser finds
large data files that do not conveniently fit into the
database. This directory can be a symlink to a filesystem
other than root where data space is available.
A fully loaded genome browser with all databases can occupy
more than 200 GB of data. This example requires only 200 MB.
GBDB=/<filesystem with 200 MB space available>
export GBDB
mkdir ${GBDB}/gbdb
ln -s ${GBDB}/gbdb /gbdb
cp -Rp ${CDROM}/gbdb ${GBDB}
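Before copying, the free space on the chosen filesystem can be
checked. The function below is a sketch: has_free_kb is a
hypothetical helper, and it assumes the filesystem name reported
by df contains no spaces.

```shell
# Sketch: succeed only if the filesystem holding DIR has at least
# NEED_KB kilobytes available. has_free_kb is a hypothetical helper.
has_free_kb() {
    dir=$1
    need_kb=$2
    # df -P gives one header line and one data line; with -k, column 4
    # is the available space in 1024-byte units.
    avail_kb=$(df -Pk "${dir}" | awk 'NR == 2 { print $4 }')
    [ -n "${avail_kb}" ] && [ "${avail_kb}" -ge "${need_kb}" ]
}
```

For the 200 MB needed by this example (204800 kilobytes):
has_free_kb ${GBDB} 204800 || echo "not enough space in ${GBDB}"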
12. Load the databases
-for DIR_NAME in cb1 hgFixed proteinDB/proteins040315
+for DIR_NAME in cb1 hgFixed proteinDB/proteins040315 visiGene
do
DB=`basename ${DIR_NAME}`
# Create the ${DB} database and load its data
mysql -u browser -pgenome -e "create database ${DB};" mysql
cd ${WEBROOT}/html/goldenPath/${DIR_NAME}/database
for SQL in *.sql
do
T_NAME=${SQL%%.sql}
echo "loading table ${T_NAME}"
mysql -u browser -pgenome ${DB} -e "DROP TABLE IF EXISTS ${T_NAME};"
mysql -u browser -pgenome ${DB} < ${SQL}
zcat "${T_NAME}.txt.gz" | mysql -u browser -pgenome ${DB} \
--local-infile=1 \
-e "LOAD DATA LOCAL INFILE \"/dev/stdin\" INTO TABLE ${T_NAME};"
done
done
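To see which databases and tables the loop above would touch
without running MySQL, the name derivation can be reproduced on
its own. This is a sketch: list_tables is a hypothetical helper,
and its optional second argument (the goldenPath directory) is an
addition for illustration.

```shell
# Sketch: print "database table" for every .sql file the load loop
# would process, without touching MySQL.
# list_tables is a hypothetical helper name.
list_tables() {
    dir_name=$1                       # e.g. proteinDB/proteins040315
    base=${2:-${WEBROOT}/html/goldenPath}
    # As in the loop, the database name is the last path component.
    db=$(basename "${dir_name}")
    for sql in "${base}/${dir_name}/database"/*.sql; do
        # With no .sql files the pattern is left unexpanded; skip it.
        [ -f "${sql}" ] || continue
        # Same result as ${SQL%%.sql} applied to the bare file name.
        t_name=$(basename "${sql}" .sql)
        echo "${db} ${t_name}"
    done
}
```

For example: list_tables proteinDB/proteins040315
prints one line per table, each starting with "proteins040315".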
An alternative to loading the database tables from text files
is to rsync the MySQL table files themselves directly into
your MySQL data directory (/var/lib/mysql/ in the example below).
These files are much larger than the text files because of the
indexes created during a table load, but this can save a lot of
time since the data loading step is quite compute intensive.
A typical rsync command for an entire database (e.g. hg17)
would be something like:
rsync -avzP --delete --max-delete=20 \
rsync://hgdownload.cse.ucsc.edu/mysql/hg17/ \
/var/lib/mysql/hg17/
13. The browser should now appear at the URL:
http://localhost/
Check your Apache error_log file for hints to solving problems.
14. blat server setup
The blatServers table in the database hgcentral needs to
have a fully qualified host name specified in the 'host' column.
Educational and non-profit institutions are allowed to use
blat free of charge. Commercial installations of the browser
require a license for blat. See also: http://www.genomeblat.com/
and: http://genome.ucsc.edu/license/
See also, in the source tree: src/gfServer/README.blat
15. Known problems:
Since there is no C. elegans database loaded, the chained
tracks from this C. briggsae example browser will not
click through to the C. elegans browser.
The "Convert" function link in the blue bar at the top
of the browser will not work because no other genomes
are installed for it to convert to.
The "Downloads" page: /downloads.html lists all the
genomes available at genome.ucsc.edu
Only the cb1 genome is available in this sample installation.
A cron job should be set to periodically clean the files in trash.
Note the example trash find command mentioned in section 5 above.
src/lib/log.c optionally uses syslog to support logging
by parasol and gfServer. This is not required for these
programs to function. Older implementations of syslog
may not have the vsyslog() function. If you have compile
errors, add -DNO_SYSLOG to the HG_DEFS line in inc/common.mk.
To remove the browser's dependence on the fixed /gbdb location,
the fixup script fixupGbdb.sh can be run on tables to fix
the filenames that refer to that location. For example:
./fixupGbdb.sh hgcentral.dbDb htmlPath "/scratch/local/gbdb"
./fixupGbdb.sh hgcentral.dbDb nibPath "/scratch/local/gbdb"
./fixupGbdb.sh cb1.extFile path "/scratch/local/gbdb"
./fixupGbdb.sh ce2.phastCons file "/scratch/local/gbdb"
16. Useful links:
Links to various documentation related to the browser software:
http://genome-test.cse.ucsc.edu/eng/
There are numerous README files in the source tree on
a variety of specific subjects, e.g.:
./src/README
./src/product/README.*
./src/hg/makeDb/trackDB/README
./src/hg/makeDb/make*.doc (plain text files)
====================================================================
This file last updated: $Date$