src/product/scripts/README 62b1428915cc2c8b9acbf46a86c1248833257efe

62b1428915cc2c8b9acbf46a86c1248833257efe
mspeir
  Mon Nov 17 18:50:32 2025 -0800
changing rest of genome-source references to github, refs #34485

diff --git src/product/scripts/README src/product/scripts/README
index f46028f0c0c..4dacd3eb9c0 100644
--- src/product/scripts/README
+++ src/product/scripts/README
@@ -1,150 +1,150 @@
 #
 # This file can be viewed at the following URL:
-# http://genome-source.soe.ucsc.edu/gitlist/kent.git/raw/master/src/product/scripts/README
+# http://github.com/ucscGenomeBrowser/kent/raw/master/src/product/scripts/README
 #
 
 Included here are some scripts to help with the job of constructing
 and maintaining a mirror site for the UCSC Genome Browser.
 Copy this directory's contents to a work directory of your choice
 outside of the kent source tree so your scripts will remain
 stable despite source tree updates.
 
 These scripts are helpful and useful, but they are not the entire
 story.  You need to understand what they are doing to utilize
 them appropriately.
 
 They expect that commands for rsync, git, and mysql, for example,
 are installed on your system.  Instructions for the installation
 of MySQL and the Apache web server are not included here.
 Those products have excellent How-To resources on the internet.
 MySQL: http://dev.mysql.com/doc/refman/5.1/en/index.html
 Apache: http://httpd.apache.org/
 
 You must copy and edit the specification file:
 
 	browserEnvironment.txt
 
 to determine where the various objects are going to exist on
 your system.  The examples in that file are sending files to
 /scratch/tmp/ which is probably not very useful.  Please copy
 this directory to a directory of your choice outside of the
 source tree and edit the browserEnvironment.txt file.
 The pathname to that file will be used with these scripts so
 they understand how to function in your environment.
 
 What is the difference between "full" and "minimal" in the scripts?
 A "minimal" set of files for a given UCSC genome database is
 sufficient to get a genome browser up and running with no
 extra tracks.  This genome browser will have no annotations.
 The "full" set of files for a given genome browser is all files
 required to operate that genome browser with all annotation tracks
 available.  The larger databases are growing to more
 than 1 TB.
 
 The update race condition ==========================================
 
 This problem can be avoided if you rsync the MySQL binary data
 files directly into your MySQL database.  There isn't a script
 provided here to do this since it is just a periodic rsync.
 That can be a simple several line shell script.  For example, to
 fetch a minimal set of tables for the hg19 browser directly
 into the MySQL database:
 
 for TBL in cytoBand cytoBandIdeo chromInfo gold gap grp trackDb hgFindSpec
 do
   for EXT in frm MYD MYI
   do
 rsync -a -P rsync://hgdownload.soe.ucsc.edu/mysql/hg19/${TBL}.${EXT} \
 	/var/mysql/hg19/
   done
 done
 
 A database such as hg18 with individual tables for each chromosome would
 have an extra rsync to help pick up those tables:
 
 for TBL in cytoBand cytoBandIdeo chromInfo gold gap grp trackDb hgFindSpec
 do
   for EXT in frm MYD MYI
   do
 rsync -a -P rsync://hgdownload.soe.ucsc.edu/mysql/hg18/${TBL}.${EXT} \
 	/var/mysql/hg18/
 rsync -a -P rsync://hgdownload.soe.ucsc.edu/mysql/hg18/chr*_${TBL}.${EXT} \
 	/var/mysql/hg18/
   done
 done
 
 Please note, as of November 2011, there is a second rsync server
 you can use in place of the hgdownload mentioned above.  Depending
 upon your internet location, the second server may provide
 better transfer speed.  To use the second rsync server, use
 the name hgdownload-sd in place of the hgdownload in the above commands.
 
 If you instead need to fetch the goldenPath database text
 dumps and load them, please note the following discussion.
 
 There is a tricky coordination problem that must be considered.
 The updates of the MySQL text dumps in goldenPath/*/database/ are
 uncoordinated with when your database tables were last loaded.
 An attempt to coordinate this situation is included with the
 tagGoldenPath.sh script and the loadUpdateGoldenPath.sh script.
 If you know that all the databases are loaded completely from your existing
 set of goldenPath/*/database/ files, you can use the tagGoldenPath.sh
 script to tag those database/ directories with a lastTimeStamp of
 the newest file existing there.  When an update rsync takes place,
 any new files will be newer than that lastTimeStamp.  The
 loadUpdateGoldenPath.sh script will load or reload only the
 new files since the lastTimeStamp.  If that load is successful,
 then the tagGoldenPath.sh can be run and a new lastTimeStamp will
 be established.  As long as the loading and reloading are successful,
 there will be no difficulty with this coordination.  In the worst case,
 if it is unknown whether the database tables and download dumps are
 in sync, a full reload of the database should be undertaken.  A full
 reload of a database can consume a number of hours.
 
 Brief description of these scripts.  Run them with no arguments
 to view their full help messages:
 
 1. activeDbList.sh - using MySQL commands to the public UCSC MySQL
 	server, will fetch a list of active databases.  Lists of
 	databases can be used with some of the fetch and load scripts.
 2. minimal.db.list.txt - a list of active databases as of mid-March 2010
 3. kentSrcUpdate.sh - fetches and/or updates a local copy of the 'kent'
 	source tree and builds that source tree with resulting binaries
 	into directories of your specification in your copy of the
 	browserEnvironment.txt file.  Can be run as a cron job about every two
 	weeks to keep your source tree, CGI binaries, and kent utilities
 	up to date.
 4. fetchHgCentral.sh - fetches a copy of the hgcentral.sql definition of
 	the hgcentral database
 5. updateHtml.sh - rsync fetch the static HTML hierarchy of
 	the UCSC Genome Browser into your specified documentRoot
 6. fetchMinimalGbdb.sh - rsync fetch of a minimal set of files in the /gbdb/
 	hierarchy to operate a genome browser with the smallest set of files
 7. fetchMinimalGoldenPath.sh - rsync fetch of a minimal set of files in the
 	goldenPath/ hierarchy to be used to load a genome database
 8. loadDb.sh - can perform an initial load of a database from the
 	goldenPath/ database MySQL text dumps
 9. fetchFullGbdb.sh - rsync fetch of all files for a given database into
 	the local /gbdb/ hierarchy
 10. fetchFullGoldenPath.sh - rsync fetch of all files for a given database
 	into the local goldenPath/ database MySQL text dump directory
 11. tagGoldenPath.sh - after everything is loaded successfully in
 	goldenPath/*/database this script will touch a lastUpdate file
 	to be used by loadUpdateGoldenPath.sh
 12. loadUpdateGoldenPath.sh - after goldenPath/*/database downloads are updated,
 	this script will load any new tables since the last timeStamp
 13. dbTrashCleaner.sh - run as a cron job periodically to clean
 	database tables in the customTrash database
 14. trashCleaning.sh - run as a cron job periodically to clean files
 	in your genome browser .../trash/ directory.
 15. allDatabaseUpdate.sh - this attempts to put all db updates together into
 	one script.  It runs the four different fetch scripts and the
 	database update script.  It needs to have lists of databases
 	to update minimally or fully.  The logs of update activities need
 	to be checked to verify it is working OK.
 16. printEnv.pl - script to verify your cgi-bin directory is functioning
 	correctly.  Note comments in that script.
 
 ====================================================================
 This file last updated: 2010/10/27 10:51:59
 ====================================================================
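The lastTimeStamp coordination described in the README text above can be
illustrated with a minimal shell sketch.  This is not the code in
tagGoldenPath.sh or loadUpdateGoldenPath.sh; the function names newerDumps
and tagDumps are hypothetical, and the sketch assumes the table dumps in a
goldenPath/*/database/ directory are named *.sql and *.txt.gz and that the
marker file is called lastTimeStamp in that same directory.

```shell
#!/bin/sh
# Sketch only: the marker name "lastTimeStamp" follows the README text;
# the real scripts may name or place it differently.

# newerDumps DIR - print the .sql and .txt.gz files in DIR that were
# modified after DIR/lastTimeStamp, one per line, sorted.
newerDumps() {
    dbDir="$1"
    stamp="${dbDir}/lastTimeStamp"
    if [ ! -f "${stamp}" ]; then
        echo "no ${stamp}; consider a full reload" 1>&2
        return 1
    fi
    find "${dbDir}" -type f \( -name '*.sql' -o -name '*.txt.gz' \) \
        -newer "${stamp}" | sort
}

# tagDumps DIR - give the marker the modification time of the newest
# dump file, so that later downloads compare as newer than the marker.
tagDumps() {
    dbDir="$1"
    newest=$(ls -t "${dbDir}"/*.sql "${dbDir}"/*.txt.gz 2>/dev/null | head -1)
    [ -n "${newest}" ] && touch -r "${newest}" "${dbDir}/lastTimeStamp"
}
```

After an update rsync, newerDumps would list only the files that still need
to be loaded; once those loads succeed, tagDumps would move the marker
forward.  If the marker is missing, the sketch reports that and returns an
error rather than guessing, which corresponds to the full-reload case
mentioned in the README.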