a9fde73d32daf74780765442de44324061b01d66 markd Sun Jan 22 22:13:52 2023 -0800
Add URL resolver plugin functionality to allow an external program to convert
cloud URLs (s3:, gs:, drs:, or really any non-HTTP URL) to http/https URLs.
This can include signed URLs. The cloud URL is used to index the UDC cache
rather than the resolved URL. This allows for re-resolving signed URLs if
they time out. Joint work by Max and Markd

diff --git src/product/mirrorManual.txt src/product/mirrorManual.txt
index 4d5ca18..2f9a4f8 100644
--- src/product/mirrorManual.txt
+++ src/product/mirrorManual.txt
@@ -1389,30 +1389,56 @@
 The httpProxy and httpsProxy URLs should use http protocol, not https.
 One reason for this is that https sessions would end up doubly-encoded.
 
 If you are debugging your proxy configuration, you can use this hg.conf setting
 to turn on logging to stderr.
 
 logProxy=on
 
 It is not meant to be left on in production.
 Your proxy server should have its own logging features.
 
 net.c also responds to environment variables
 http_proxy, https_proxy, ftp_proxy, no_proxy and log_proxy.
 
+# Support for cloud URLs
+
+The genome browser supports cloud URLs by allowing a mirror to configure a
+command to resolve these URLs to http or https URLs.
+
+For example, URLs with the gs, s3, or drs scheme would be resolved to https:
+URLs. This may involve conversion to signed https URLs. The browser caches
+the results under the original URL and will call the resolver command again if
+the signed URL has timed out.
+
+URL resolution is enabled by defining the following variables in hg.conf:
+
+- resolvProts: A comma-separated list of URL protocols, without the colon, to
+  resolve with the specified command.
+- resolvCmd: The path to the command to resolve the matching URLs. It may
+  include space-separated arguments that are passed on to the command
+  unchanged. The URL to resolve is passed as the last argument. The command
+  writes the resolved http/https URL to stdout. If an error occurs, the
+  command should write an error message to stderr and exit with a non-zero
+  status code.
+
+For example:
+
+    resolvProts = gs,drs
+    resolvCmd = /var/www/tools/urlResolver /var/www/tools/config
+
 # The UDC local cache directory
 
 The udcCache allows tracks that are either installed tracks or custom tracks
 of the above mentioned types to cache data that they have already fetched via
 URL. This allows data to reside elsewhere and only download the parts needed
 on demand. The datablocks are usually compressed and have an efficient random
 access index. They are accessed from a remote location via URLs such as HTTP,
 HTTPS, FTP.
 
 * udcCache means URL-Data-Cache
 * BBI files use the udcCache.
 * BBI means Big Binary Indexed and includes file types such as BigBed (.bb) and BigWig (.bw).
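
The resolvCmd contract described in the cloud URL section (extra arguments
passed through unchanged, the URL to resolve as the last argument, the
resolved URL on stdout, an error message on stderr with a non-zero exit status
on failure) could be met by a script along the following lines. This is only a
minimal sketch and not the resolver shipped with the browser: the function
names resolve() and main() are hypothetical, the leading arguments (such as a
config path) are ignored, and it assumes the objects are publicly readable, so
it maps gs: and s3: URLs to the plain public https endpoints instead of
producing signed URLs.

```python
#!/usr/bin/env python3
"""Hypothetical resolver sketch for the resolvCmd contract (illustration only)."""
import sys
from urllib.parse import quote, urlsplit


def resolve(url):
    """Map a gs: or s3: URL to an unsigned public https URL.

    Assumption: the object is publicly readable. A real resolver would
    typically ask the cloud provider for a signed URL instead.
    """
    parts = urlsplit(url)
    bucket = parts.netloc
    key = quote(parts.path.lstrip("/"))  # percent-encode, keep "/" separators
    if not bucket:
        raise ValueError(f"no bucket in URL: {url}")
    if parts.scheme == "gs":
        return f"https://storage.googleapis.com/{bucket}/{key}"
    if parts.scheme == "s3":
        return f"https://{bucket}.s3.amazonaws.com/{key}"
    raise ValueError(f"unsupported scheme: {parts.scheme!r}")


def main(argv):
    """Return the process exit status: 0 on success, 1 on any error."""
    if len(argv) < 2:
        print("usage: urlResolver [args...] URL", file=sys.stderr)
        return 1
    url = argv[-1]  # the URL to resolve is always the last argument
    try:
        print(resolve(url))  # resolved URL goes to stdout
    except ValueError as err:
        print(f"urlResolver: {err}", file=sys.stderr)  # error goes to stderr
        return 1
    return 0
```

To run it as a command, one would add the usual
`if __name__ == "__main__": sys.exit(main(sys.argv))` guard at the bottom,
make the file executable, and point resolvCmd at it, with resolvProts listing
only the schemes the script handles (here gs and s3).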