bbabbd5d2566d47d923d51dbe350634783455999
mspeir
Sun Oct 26 12:14:52 2025 -0700
change soe to gi, refs #35031

diff --git src/hg/htdocs/goldenPath/help/mirrorManual.html src/hg/htdocs/goldenPath/help/mirrorManual.html
index f5c3f75434d..516070ae300 100755
--- src/hg/htdocs/goldenPath/help/mirrorManual.html
+++ src/hg/htdocs/goldenPath/help/mirrorManual.html
@@ -177,52 +177,52 @@

Annotation database size varies widely between assemblies: in 2016, the full hg19 database was 6 TB, while ce2 was 5 GB. It also depends on the tracks: the hg19 annotations shrink to 2 TB if you do not download any ENCODE tracks. The main gene and SNP annotations alone are around 5 GB for hg19 and hg38.

You can use the following command to get the size of the files for all of the assemblies, but it can also be modified to give the size for a particular assembly:

-rsync -hna --stats rsync://hgdownload.soe.ucsc.edu/gbdb/ | egrep "Number of files:|total size is"
+rsync -hna --stats rsync://hgdownload.gi.ucsc.edu/gbdb/ | egrep "Number of files:|total size is"

For example, to get the size of all of the files for hg19, you would use the following command:

-rsync -hna --stats rsync://hgdownload.soe.ucsc.edu/gbdb/hg19/ | egrep "Number of files:|total size is"
+rsync -hna --stats rsync://hgdownload.gi.ucsc.edu/gbdb/hg19/ | egrep "Number of files:|total size is"

After running that command, you should see output like this:

Number of files: 54886
 total size is 6515.70G  speedup is 5181080.38 (DRY RUN)
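The two summary lines can also be split into fields in a script, for instance to log the mirror size over time. The following is a minimal sketch that assumes the output format shown above; the function name `parse_rsync_stats` is made up for illustration, and newer rsync versions may format the file count differently.

```shell
# Hypothetical helper: read rsync --stats output on stdin and print
# "<file count> <total size>" on one line.
parse_rsync_stats() {
    awk '
        /Number of files:/ { files = $4 }   # 4th field of "Number of files: N"
        /total size is/    { size  = $4 }   # 4th field of "total size is X"
        END { print files, size }
    '
}

# Example using the sample output from above (no network access needed):
printf 'Number of files: 54886\n total size is 6515.70G  speedup is 5181080.38 (DRY RUN)\n' \
    | parse_rsync_stats
```

In real use, the rsync dry-run command would be piped into `parse_rsync_stats` in place of the `printf` sample.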

The next command will give you the size of the entire mySQL/MariaDB database, but can be changed to get the size for a particular assembly:

-rsync -hna --stats rsync://hgdownload.soe.ucsc.edu/mysql/ | egrep "Number of files:|total size is"
+rsync -hna --stats rsync://hgdownload.gi.ucsc.edu/mysql/ | egrep "Number of files:|total size is"
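For a single assembly, the database name is appended to the path, in the same way as in the gbdb example. This is a sketch only: the helper names `mirror_url` and `assembly_db_size` are hypothetical, hg19 is just an example, and `grep -E` is used here as the equivalent of `egrep`.

```shell
# Hypothetical helper: build the rsync URL for one module (gbdb, mysql, ...)
# and one assembly.
mirror_url() {
    printf 'rsync://hgdownload.gi.ucsc.edu/%s/%s/\n' "$1" "$2"
}

# Hypothetical helper: dry-run rsync that reports the file count and total
# size of one assembly's database files without downloading anything.
assembly_db_size() {
    rsync -hna --stats "$(mirror_url mysql "$1")" \
        | grep -E "Number of files:|total size is"
}

# Usage (needs network access):
#   assembly_db_size hg19
```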

Installing the UCSC Genome browser

Note: We offer Genome-Browser-in-the-Cloud (GBIC), a shell script that installs a genome browser on most major Linux distributions (most Debian- and Red Hat-based ones, such as Ubuntu and CentOS). GBIC is also available as a Dockerfile. See our mirror page for more general information.

Scripts to perform all of the functions below can be found in the directory https://github.com/ucscGenomeBrowser/kent/tree/master/src/product/scripts.
@@ -356,31 +356,31 @@
src/product/scripts/trashCleaner.csh

  • Download static WEB page content: See also: src/product/scripts/updateHtml.sh

  • Copy CGI binaries: This set of binaries is for x86_64 Linux machines. If you instead need to build binaries for your platform, follow the instructions in the section "Building the kent source tree", below.

    See also: src/product/scripts/kentSrcUpdate.sh

    -rsync -avP rsync://hgdownload.soe.ucsc.edu/cgi-bin/ ${WEBROOT}/cgi-bin/
    +rsync -avP rsync://hgdownload.gi.ucsc.edu/cgi-bin/ ${WEBROOT}/cgi-bin/
  • Create hgcentral database and tables. This is the primary gateway database that allows the browser to find specific organism databases. See also: scripts/fetchHgCentral.sh to fetch a current copy of hgcentral.sql

    mysql -u browser -pgenome -e "create database hgcentral;"
     mysql -u browser -pgenome hgcentral < hgcentral.sql

    Please note that it is possible to create alternative hgcentral databases, for example for test purposes. In this case, use a unique name for the hgcentral database, such as "hgcentraltest", and it can be specified in the hg.conf
@@ -434,31 +434,31 @@
    database text dumps and load them into the database. If you use MariaDB, you can use the binary files; with MySQL >= 8 you need to use dumps, which is why we discourage the use of MySQL >= 8 with the Genome Browser.

    An alternative to loading the database tables from text files is to rsync the MariaDB tables themselves directly and place them in your MariaDB /var/ directory. These tables are much larger than the text files because of the indexes created during a table load, but this can save a lot of time since the data loading step is quite compute-intensive. A typical rsync command for an entire database (e.g. ce4) would be something like:

    -rsync -avP --delete --max-delete=20 rsync://hgdownload.soe.ucsc.edu/mysql/ce4/ /var/lib/mysql/ce4/
    +rsync -avP --delete --max-delete=20 rsync://hgdownload.gi.ucsc.edu/mysql/ce4/ /var/lib/mysql/ce4/
  • Download extra databases to work with a full genome assembly such as human/hg38: hgFixed, go140213, proteins140122, sp140122. Construct symlinks in your MariaDB data directory so that the database names go, proteome, and uniProt point to these database directories:

    $ ls -og proteome go uniProt
     lrwxrwxrwx 1  8 Feb 26 11:39 go -> go140213
     lrwxrwxrwx 1 14 Mar 27 12:01 proteome -> proteins140122
     lrwxrwxrwx 1  8 Mar 27 12:01 uniProt -> sp140122
     
     $ ls -ld go140213 proteins140122 sp140122
     drwx------ 2 mysql mysql 4096 Feb 26 10:57 go140213
     drwx------ 2 mysql mysql 4096 Aug 19 08:08 proteins140122
    @@ -1307,75 +1307,75 @@
     
  • all tables in the hgcentral database
  • six tables in the human genome
  • Create an empty hgcentral database:

    $ hgsql -e "create database hgcentral;" mysql

    Load all tables into the hgcentral database. Copy all the mysql data files from

    -rsync -avP rsync://hgdownload.soe.ucsc.edu/mysql/hgcentral/ .
    +rsync -avP rsync://hgdownload.gi.ucsc.edu/mysql/hgcentral/ .

    directly into the MySQL data area for your hgcentral database (usually something like /var/lib/mysql/hgcentral/).

    Or load this database with mysql/hgsql commands and the hgcentral.sql text file dump of these tables from:

    -rsync -avP rsync://hgdownload.soe.ucsc.edu/genome/admin/hgcentral.sql .
    +rsync -avP rsync://hgdownload.gi.ucsc.edu/genome/admin/hgcentral.sql .

    And then six tables for the latest human database.

    The gateway page always needs a minimum human database in order to function even if the browser is being built for the primary purpose of displaying other genomes. This default can currently be changed in the source tree in src/hg/lib/hdb.c (to be done: specify this default in hg.conf file)

    Start with an empty database, for example hg18:

    hgsql -e "create database hg18;" mysql

    Again, copy the MariaDB files directly from the download server, for example hg18:

    -rsync -avP rsync://hgdownload.soe.ucsc.edu/mysql/hg18/ .
    +rsync -avP rsync://hgdownload.gi.ucsc.edu/mysql/hg18/ .

    (beware, this is several TB of data) into your MariaDB data area. Or load these tables from the text SQL dumps from:

    -rsync -avP rsync://hgdownload.soe.ucsc.edu/goldenPath/hg18/database/ .
    +rsync -avP rsync://hgdownload.gi.ucsc.edu/goldenPath/hg18/database/ .

    (beware, this is several TB of data)

    The minimal set of required tables is:

    grp
     trackDb
     hgFindSpec
     chromInfo
     gold
     gap
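
To fetch only these tables rather than the whole database, the list can be turned into rsync include patterns. This is a sketch under an assumption: each table is taken to be stored as `<table>.sql` plus `<table>.txt.gz` in the dump directory. Some assemblies may split tables per chromosome, so check the directory listing first. The function name `minimal_table_includes` is hypothetical.

```shell
# Hypothetical helper: print one rsync --include option per dump file
# for the six minimal tables listed above.
minimal_table_includes() {
    for t in grp trackDb hgFindSpec chromInfo gold gap; do
        # Assumed naming: schema in <table>.sql, data in <table>.txt.gz
        printf '%s\n' "--include=$t.sql" "--include=$t.txt.gz"
    done
}

# Usage (needs network access; downloads into the current directory):
#   rsync -avP $(minimal_table_includes) --exclude='*' \
#       rsync://hgdownload.gi.ucsc.edu/goldenPath/hg18/database/ .
```

The trailing `--exclude='*'` makes rsync skip every file that no earlier include pattern matched.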