8cc9219d0990c270849cba2159a27ad9a95a1f08 kent Tue Oct 1 17:29:17 2024 -0700
Adding bits about installing bowtie and indexes. Upping validation threads.
diff --git src/hg/cirm/cdw/install/README src/hg/cirm/cdw/install/README
index 7dbe93e..5fbcadf 100644
--- src/hg/cirm/cdw/install/README
+++ src/hg/cirm/cdw/install/README
@@ -1,84 +1,86 @@
 Installation overview for Cell Data Warehouse (CDW).
 1) First bring up a recent linux distribution.
 2) Run a Genome Browser Install script that should include mysql/mariaDb install.
 3) Add in (with dnf install) libpng-devel, libuuid-devel, mariadb-devel, nodejs
 4) Create a cdw user at the Unix level that will be the one to run the daemons.
 5) Create the following directories and make them writable and executable by the cdw user.
     $HOME/cdw/bin/scripts
     $HOME/cdw/bin/x86_64
     $HOME/cdw/backups
     /usr/local/apache/cgi-bin-cdw
    Also set up the environmental variable CIRM to point to $HOME/cdw
 6) Log in as the cdw user
 7) Get the Kent UCSC source and execute the following makes from the src directory:
     cd ~/kent/src
     make -j 16 libs
     cd ~/kent/src/hg/cirm/cdw
     make -j 16
     cd ~/kent/src/tagStorm
     make -j 16
     cd ~/kent/src/hg/hgsql
     make
     cd ~/kent/src/hg/encode3/encodeDataWarehouse/utils/edwSamPairedEndStats
     make
     cd ~/kent/src/hg/encode3/encodeDataWarehouse/utils/edwBamStats
     make
     cd ~/kent/src/hg/encode3/encodeDataWarehouse/utils/edwSamRepeatAnalysis
     make
     cd ~/kent/src/utils/fastqStatsAndSubsample
     make
     cd ~/kent/src/utils/bigBedToBed
     make
     cd ~/kent/src/utils/bigWigAverageOverBed
     make
 8) Make up a configuration file for accessing the database in ~/.hg.conf with the following six lines:
     db.host=localhost
     db.user=cdw
     db.password=
     cdw.host=localhost
     cdw.user=cdw
     cdw.password=
    where you get the password values from the system admins or Jim. (The db.password and
    cdw.password can be the same.)
    Do chmod 600 .hg.conf to make it private
 9) Create a cdw user for the mariaSql database and also a cdw database that the cdw user
    has full authorities for.
 10) Copy over the validation data from hgwdev
     cd /data/cirm
     scp -rp hgwdev:/data/cirm/valData .
 11) Create the table structure for the cdw database by logging into mysql, doing
     hgsqladmin create cdw
     hgsql cdw < ~/kent/src/hg/cirm/cdw/lib/cdw.sql;
 12) Populate the settings table by doing
     insert into cdwSettings set name='prefix',val='SSPG';
     insert into cdwSettings set name='schema',val='/data/cirm/valData/tags.schema';
    Note these lines control the prefix to the accession numbers for the files,
    and the tagStorm schema respectively.
 13) Set up users for cdw system, people who can submit files:
     cdwCreateUser kent@soe.ucsc.edu
     hgsql -e "update cdwUser set isAdmin=1 where email='kent@soe.ucsc.edu'" cdw
     cdwCreateUser mhaeussl@ucsc.edu
     cdwCreateUser wisulliv@ucsc.edu
 14) Run script to load up validations:
     ~/kent/src/hg/cirm/cdw/initQa
-15) Find the following programs and make them accessible to the cdw user.
-    bowtie
+15) Find the following bioinformatics programs and make them accessible to the cdw user.
     tabix (version 0.2.5 is what we use)
     bgzip (part of Heng Li's tabix kit)
-16) Start up the daemons
+    bowtie (we use version 1.3.1)
+16) Set up the bowtie indexes in ramdisk. (redo on reboot)
+    rsync -apv /data/cirm/valData/ramdisk/ /dev/shm/btData/
+17) Start up the daemons (redo on reboot)
     cd /data/cirm/cdw
-    cdwRunDaemon cdw cdwJob 5 -log=cdwJob.log
+    cdwRunDaemon cdw cdwJob 30 -log=cdwQa.log
 Sometimes the daemons will crash out. Usually it's on a new data type. Do a
     ps -u cdw
 to see if this has happened. If so then do
     killall cdwRunDaemon
 have a look at the cdwJob.log, and possibly the latest entries in the cdwJob database table,
 and then do cdwRunDaemon again as above to restart.
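
Steps 16 and 17 in the new README text are both marked "redo on reboot", so they could be grouped in one small script. The following is only a sketch under assumptions: the three commands are copied from the diff, but the function names run and cdw_after_reboot and the CDW_APPLY variable are hypothetical and not part of the kent tree. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the "redo on reboot" steps (16 and 17) from the README.
# Hypothetical helper: the README lists only the raw commands.
# Set CDW_APPLY=1 to execute them; otherwise each command is just echoed.

run() {
    if [ "${CDW_APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

cdw_after_reboot() {
    # Step 16: copy the bowtie indexes into the ramdisk.
    run rsync -apv /data/cirm/valData/ramdisk/ /dev/shm/btData/ || return 1
    # Step 17: start the job daemon from its working directory.
    run cd /data/cirm/cdw || return 1
    run cdwRunDaemon cdw cdwJob 30 -log=cdwQa.log
}

cdw_after_reboot
```

Run as the cdw user, the default invocation prints the three commands prefixed with "+ " so they can be checked first; running it again with CDW_APPLY=1 in the environment executes them in order and stops at the first failure.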