295b9dbab341deac96c7f6166bea56f0fc7fa770 max Fri Feb 12 05:38:32 2021 -0800 changes after code review, refs #26951. Since all README files were merged into mirrorManual.txt in 2016, to avoid further confusion, I am removing them now. I manually merged all changes made since then (just 3-4 changes) into mirrorManual.txt now. mirrorManual.txt is automatically converted to https://genome.ucsc.edu/goldenPath/help/mirrorManual.html by the makefile in mirrorDocs. This makes sure that we have installation instructions that can be read with vi or less and at the same time we have a nice html file that is easy to read, looks like official documentation and is indexed by Google well. The big no.1 advantage of mirrorManual.txt is that the sections have an order in increasing complexity and the user knows where to start. The old README files had no order at all and the names were misleading (e.g. quickstart was not about a quickstart) diff --git src/product/README.customTracks.dataBase src/product/README.customTracks.dataBase deleted file mode 100644 index be03228..0000000 --- src/product/README.customTracks.dataBase +++ /dev/null @@ -1,230 +0,0 @@ - -This text is a copy of the Wiki article: - http://genomewiki.ucsc.edu/index.php/Using_custom_track_database -as of 08 March 2007 - -Sections in this document: - - 1 Using the Custom Track Database - 2 Summary configuration - 3 Host and database name - 4 /cgi-bin/hg.conf configuration items - 5 Database loaders - 6 Command line access - 7 Cleaner script - 8 metaInfo and history - 9 Turning On Considerations - 10 Use of trash files with the database on - 11 Known difficulties - -======================================================================== -Using the Custom Track Database - -A new feature of the genome browser as of March 2007 is the ability to -use a data base for custom tracks. Up to this date, custom track data -has been kept in files in the /trash/ct/ directory. This article -discusses the steps required to enable this function. - -======================================================================== -Summary configuration - - * database loader binaries hgLoadBed, hgLoadWiggle and wigEncode are - installed in /cgi-bin/loader/ - these are installed via the normal - 'make cgi' in the source tree kent/src/hg/ directory. - * an empty customTrash database has been created on the MySQL host - - create this manually once, the MySQL host name is a configuration - item, the database name customTrash is not a configuration item - * temporary read-write data directory /data/tmp has been created - with read/write/delete enabled for the Apache server effective - user, this directory name is a configuration item - * configuration items are specified in /cgi-bin/hg.conf/ - this will - turn on the function - * for command line access to the database, create a special - ~/.hg.ct.conf to be used with the environment variable HGDB_CONF - * create a cron job to run a cleaner script to expire and remove - older tables from the database - dbTrash command is used for this - purpose - -======================================================================== -Host and database name - -For performance and security considerations, the MySQL host for the -custom track database can be a separate machine from the ordinary MySQL -host that usually serves up the assembly databases or the hgcentral -database. It is not required that the custom track database be on a -separate MySQL server. The specification of the host machine is placed -in the /cgi-bin/hg.conf file, for example a host machine called -"ctdbhost": - -customTracks.host=ctdbHost - -The database name used on this host is fixed at customTrash which is a -define in the source tree file hg/inc/customTrack.h -[edit] -/cgi-bin/hg.conf configuration items - -The following items must be specified in /cgi-bin/hg.conf to enable this -function: - -customTracks.host=ctdbhost -customTracks.user=ctdbuser -customTracks.password=ctdbpasswd -customTracks.useAll=yes - -Establish this user account and password in MySQL with db and user -privileges: - -Select, Insert, Update, Delete, Create, Drop, Alter, Index -for example with your MySQL root user account: -hgsql -hctdbhost -uroot -p -e \ - "GRANT SELECT,INSERT,UPDATE,DELETE,CREATE,DROP,ALTER,INDEX" \ - on customTrash.* TO ctdbuser@yourWebHost IDENTIFIED by 'ctdbpasswd';" mysql - -Optionally, a temporary read-write directory used during database -loading can be specified: - -customTracks.tmpdir=/data/tmp - -The default for this is /data/tmp and should be created with -read/write/delete access for the Apache server effective user. -It should be on a local filesystem for best access speed, not via NFS. - -======================================================================== -Database loaders - -The database loaders used to load custom tracks are the standard loader -commands found in the source tree, hgLoadBed, hgLoadWiggle and -wigEncode. They are installed into /cgi-bin/loader/ with a 'make cgi' -from the source tree directory kent/src/hg/ These loaders are used by -the cgi binaries hgCustom, hgTracks, and hgTables to load custom tracks -into the database. They are operated in an exec'd pipeline fashion, the -code details can be see in src/hg/lib/customFactory.c - -======================================================================== -Command line access - -Since the MySQL host may be different than your ordinary MySQL host, you -will need to create a unique $HOME/.ct.hg.conf file to be used in the -case where you want to manipulate this separate database with the kent -source tree command line tools. This unique .ct.hg.conf is merely a copy -of your normal .hg.conf file but with a different host/username/password -specified: - -db.host=ctdbhost -db.user=ctdbuser -db.password=ctdbpasswd -central.db=hgcentral - -Remember to set the priviledges on this hg.conf file at 600: - -chmod 600 $HOME/.ct.hg.conf - -To enable the use of this file for subsequent command line operations, -set the environment variable HGDB_CONF to point to this file, for -example in the bash shell: - -export HGDB_CONF=$HOME/.ct.hg.conf - -With that in place, you can examine the contents of the customTrash -database: - -hgsql -e "show tables;" customTrash - -This unique hg.conf file will also be used by the cleaner command -dbTrash - -======================================================================== -Cleaner script - -The database and the temporary data directory /data/tmp need to be kept -clean. This is similar to the current cleaner script you have running on -your /trash filesystem. In this case there is a specific source tree -utility used to access and clean the database. The temporary data -directory /data/tmp would stay clean if each and every loaded custom -track was successfully loaded. In the case of badly formatted or illegal -data submitted for the custom track, the database loaders do not remove -their temporary files from /data/tmp This /data/tmp directory can be -kept clean with, for example, an hourly cron job that performs: - -find /data/tmp -type f -amin +10 -exec rm -f {} \; - -This would remove any file not accessed in the past 10 minutes. - -The database cleaner command dbTrash should be run as a cron job -encapsulated in a shell script something like this, which maintains a -record of items cleaned to enable later analysis of custom track -database usage statistics: - -#!/bin/sh - -DS=`date "+%Y-%m-%d"` -YYYY=`date "+%Y"` -MM=`date "+%m"` -export DS YYYY MM - -mkdir -p /data/trashLog/ctdbhost/${YYYY}/${MM} -RESULT="/data/trashLog/ctdbhost/${YYYY}/${MM}/${DS}.txt" -export RESULT - -/cluster/bin/x86_64/dbTrash -age=48 -drop -verbose=2 > ${RESULT} 2>&1 - -Running this once a day will remove any tables not accessed within the -past 48 hours. The dbTrash command is found in the source tree in -kent/src/hg/dbTrash - -The /trash directory can be kept clean with the following two commands, -one to implement an 8 hour expiration time on most files, the second to -implement a 48 hour expiration time on custom track files: - -find /trash \! \( -regex "/trash/ct/.*" -or -regex "/trash/hgSs/.*" \) \ - -type f -amin +480 -exec rm -f {} \; -find /trash \( -regex "/trash/ct/.*" -or -regex "/trash/hgSs/.*" \) \ - -type f -amin +2880 -exec rm -f {} \; - -======================================================================== -metaInfo and history - -You will note two special and persistent tables in the customTrash -database: metaInfo and history. The metaInfo table records a time of -last use for each custom track table and a useCount for statistics. The -time of last use is used by the cleaner utility dbTrash to expire older -tables. The history table is the same as the history table in the normal -assembly databases. The loader commands, hgLoadBed and hgLoadWiggle -record into the history table each time they load a track. The cleaner -command dbTrash also records in the history table statistics about what -it is removing. - -======================================================================== -Turning On Considerations - -Please note, if there are currently existing custom tracks in /trash/ct/ -files, at the time of adding the configuration items to -/cgi-bin/hg.conf/ those existing tracks will be converted to database -versions upon their next use by the user. Therefore, to enable this -function on the round-robin WEB servers, we will need to do the update -to /cgi-bin/hg.conf in as much a simultaneous manner as possible. -Perhaps something like a shell script to do eight background rsync's all -at the same time. - -======================================================================== -Use of trash files with the database on - -When the custom tracks database is in use, there are still small files -kept in /trash/ct which become the reference pointers to the actual -database tables belonging to that custom track. The standard trash -cleaner script should still be kept running to clean these files. - -======================================================================== -Known difficulties - -For the case of a custom track submission that contains more than one -track set of data, in the case where one of the sets of data is illegal -and causes a loading problem, even though some sets of data may have -loaded successfully, the submitting user will see an error about the -corrupted data, and they would need to correct their data submission to -get all tracks successfully loaded. - -It remains to be seen just how good the error reporting system is for -illegal data. - -====================================================================