bbabbd5d2566d47d923d51dbe350634783455999
mspeir
Sun Oct 26 12:14:52 2025 -0700
change soe to gi, refs #35031

diff --git src/hg/htdocs/goldenPath/help/cloud.html src/hg/htdocs/goldenPath/help/cloud.html
index a73904f2b0b..33b081f0f80 100755
--- src/hg/htdocs/goldenPath/help/cloud.html
+++ src/hg/htdocs/goldenPath/help/cloud.html
@@ -87,58 +87,58 @@
S3 stands for Simple Storage Service, and it is the name for cloud storage in Amazon Web Services (AWS). The data available through S3 is essentially stored in a folder called a bucket, and files are called objects. The s3://genome-browser bucket is a copy of the main data available on our
-UCSC Genome Browser Download website: https://hgdownload.soe.ucsc.edu/downloads.html
+UCSC Genome Browser Download website: https://hgdownload.gi.ucsc.edu/downloads.html
By placing our Download server files in an S3 bucket, developers working in the cloud can more easily integrate with UCSC data. You can learn more about how S3-object-based storage works, and its advantages of being accessible anywhere across the world with low latency and high durability, by reviewing Amazon's S3 documentation.
-The data mirrors our UCSC Genome Browser Download website's main rsync directories:
UCSC Human Golden Path Downloads            s3://genome-browser/goldenPath
UCSC Human Genome Browser Gbdb Data Files   s3://genome-browser/gbdb
UCSC Human Genome Raw Mysql Tables          s3://genome-browser/mysql
UCSC Human Genome Web Site CGI Binaries     s3://genome-browser/cgi-bin
UCSC Human Genome Web Site Htdocs           s3://genome-browser/htdocs
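Because the bucket uses the same directory names as the Download server, a server-relative path can be turned into an S3 URI by prefixing the bucket name. The following is a minimal sketch of that mapping; `ucsc_s3_uri` is a hypothetical helper name, not part of any UCSC tooling, and the commented-out `aws` call assumes the AWS CLI is installed and that the bucket permits anonymous reads.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): map a Download-server path such as
# "gbdb/hg38/hg38.2bit" to the matching object URI in s3://genome-browser.
ucsc_s3_uri() {
    # Drop one leading slash, if present, so both "gbdb/..." and "/gbdb/..." work.
    printf 's3://genome-browser/%s\n' "${1#/}"
}

# Print the URI for a file under the goldenPath directory.
ucsc_s3_uri goldenPath/hg38/bigZips/README.txt

# Assumption: AWS CLI available and anonymous access allowed on the bucket.
# aws s3 cp --no-sign-request "$(ucsc_s3_uri goldenPath/hg38/bigZips/README.txt)" ./
```

The helper only builds the URI; the copy itself is left commented out so the sketch can be tried without network access.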
goldenPath/hg38/bigZips/README.txt. The README.txt, also
-available on the Download website,
informs that the most recent patch-inclusive sequence is found in
goldenPath/hg38/bigZips/latest/.
gbdb/hg38/hg38.2bit, matching the file in the
goldenPath/hg38/bigZips/latest/
directory, reflecting how these files are operated on by the UCSC Genome Browser software
in order to display assembly sequence when browsing.
By reviewing the example data access URLs demonstrating the list and getData functions, and the further practical example URLs for extracting specific track data items, you can learn more about ways of using the API to extract data.
-The UCSC Genome Browser Download website, hgdownload.soe.ucsc.edu, is the source of the data
+The UCSC Genome Browser Download website, hgdownload.gi.ucsc.edu, is the source of the data
hosted in the Amazon s3://genome-browser bucket. It can be viewed in a web browser to access specific download files, or the data can be copied with rsync commands.
For instance, the following rsync command will show you the various rsync directories available on our Download server:
-$ rsync -a -P rsync://hgdownload.soe.ucsc.edu/
+$ rsync -a -P rsync://hgdownload.gi.ucsc.edu/
 genome          UCSC Human Genome Downloads
 sars            UCSC Human Genome SARS Downloads
 htdocs          UCSC Human Genome Web Site Htdocs
 goldenPath      UCSC Human Golden Path Downloads
 cgi-bin         UCSC Human Genome Web Site CGI Binaries x86_64
 cgi-bin-i386    UCSC Human Genome Web Site CGI Binaries i386
 gbdb            UCSC Human Genome Browser Gbdb Config Files
 archives        UCSC Human Genome Browser Archived Config Files
 mysql           UCSC Human Genome Raw Mysql Tables
 gbib            UCSC Genome Browser in a Box
 hubs            UCSC Genome Browser Public Hubs
goldenPath/ Downloads directory:
-rsync -a -P rsync://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/README.txt ./
+rsync -a -P rsync://hgdownload.gi.ucsc.edu/goldenPath/hg38/bigZips/README.txt ./
gbdb/ binary data directory
for the human hg38 assembly 2bit file:
-rsync -a -P rsync://hgdownload.soe.ucsc.edu/gbdb/hg38/hg38.2bit ./
+rsync -a -P rsync://hgdownload.gi.ucsc.edu/gbdb/hg38/hg38.2bit ./
htdocs/ hypertext document directory:
-rsync -a -P rsync://hgdownload.soe.ucsc.edu/htdocs/goldenPath/pubs.html ./
+rsync -a -P rsync://hgdownload.gi.ucsc.edu/htdocs/goldenPath/pubs.html ./
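The rsync examples above differ only in the path after the host name, so the pattern can be sketched as a small pair of shell functions. This is an illustrative sketch, not UCSC tooling: the function names `ucsc_rsync_url` and `ucsc_fetch` and the `UCSC_RSYNC_HOST` variable are invented here, and the default host is the one shown in the updated commands.

```shell
#!/bin/sh
# Hypothetical helper: build the rsync URL for a path on the Download server.
# UCSC_RSYNC_HOST is a made-up override variable; when unset, the host
# defaults to hgdownload.gi.ucsc.edu.
ucsc_rsync_url() {
    host=${UCSC_RSYNC_HOST:-hgdownload.gi.ucsc.edu}
    # Strip one leading slash so "/gbdb/..." and "gbdb/..." give the same URL.
    printf 'rsync://%s/%s\n' "$host" "${1#/}"
}

# Hypothetical helper: copy one file into the current directory, using the
# same -a -P flags as the examples in the text.
ucsc_fetch() {
    rsync -a -P "$(ucsc_rsync_url "$1")" ./
}

# Show the URL that would be fetched; no network access is needed for this line.
ucsc_rsync_url htdocs/goldenPath/pubs.html
```

For example, `ucsc_fetch gbdb/hg38/hg38.2bit` would run the same transfer as the second command above, and setting `UCSC_RSYNC_HOST` to another host name redirects it without editing the call.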
Many of these rsync directories exist to support the Genome Browser in a Cloud (GBiC) and the Genome Browser in a Box (GBiB) software products discussed below.
Also note that there is a mirror of the download server available in Europe, so the above rsync
commands can also be pointed to the hgdownload-euro locations.
rsync -a -P rsync://hgdownload-euro.soe.ucsc.edu/gbdb/hg38/hg38.2bit ./
The UCSC Genome Browser uses MariaDB (a fork of MySQL) as the backend database server and maintains
-a public server at genome-mysql.soe.ucsc.edu to allow direct queries.
+a public server at genome-mysql.gi.ucsc.edu to allow direct queries.
trackDb all the entries in the group (grp) "genes" and
ordering those entries by tableName:
-mysql -h genome-mysql.soe.ucsc.edu -u genome -NBe 'select tableName from trackDb where grp = "genes" order by tableName' hg38
+mysql -h genome-mysql.gi.ucsc.edu -u genome -NBe 'select tableName from trackDb where grp = "genes" order by tableName' hg38
wgEncodeRegTfbsClusteredV3 on the human hg19 assembly
and selecting entries from a 500 base pair region on chr1:
-mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -Ne 'select chrom,chromStart,chromEnd,name,score
+mysql --user=genome --host=genome-mysql.gi.ucsc.edu -A -Ne 'select chrom,chromStart,chromEnd,name,score
from wgEncodeRegTfbsClusteredV3 where chrom = "chr1" and chromStart > 10000 and chromEnd < 10500;' hg19
wgEncodeGencodeBasicV39 table on the hg38 genome:
-mysql -u genome -h genome-mysql.soe.ucsc.edu hg38 -e 'select g.name,a.transcriptType from wgEncodeGencodeBasicV39 g,
+mysql -u genome -h genome-mysql.gi.ucsc.edu hg38 -e 'select g.name,a.transcriptType from wgEncodeGencodeBasicV39 g,
wgEncodeGencodeAttrsV39 a where (g.name = a.transcriptId) and (a.transcriptType = "lncRNA");'
See the Downloading Data using MariaDB (MySQL) page
for more information. Also, there is a mirror of the MariaDB server available
in Europe, so commands can also be pointed to the genome-euro-mysql location.
mysql -h genome-euro-mysql.soe.ucsc.edu -u genome -NBe 'show tables' hg38
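The query examples above all pass a host, the `genome` user, a SQL statement, and a database name to the `mysql` client. A minimal sketch of wrapping that pattern follows; `ucsc_sql`, `UCSC_MYSQL_HOST`, and `MYSQL` are names invented for this illustration, and the default host is the one used in the updated commands.

```shell
#!/bin/sh
# Hypothetical wrapper: run one SQL statement against a public UCSC MariaDB server.
#   $1 = database name (for example hg38)
#   $2 = SQL statement
# UCSC_MYSQL_HOST (made-up variable) overrides the host; MYSQL (also made up)
# overrides the client binary, which allows a dry run with MYSQL=echo.
ucsc_sql() {
    db=$1
    sql=$2
    # -N: no column names, -B: tab-separated batch output, -e: execute statement
    "${MYSQL:-mysql}" -h "${UCSC_MYSQL_HOST:-genome-mysql.gi.ucsc.edu}" \
        -u genome -NBe "$sql" "$db"
}

# Example call, left commented out because it needs network access:
# ucsc_sql hg38 'select tableName from trackDb where grp = "genes" order by tableName'
```

Setting `MYSQL=echo` prints the arguments that would be passed to the client, which is a quick way to check the host and quoting before sending a query to the public server.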