a4305926f0e89a8af454e34709174f7ea07fb5bb galt Thu Mar 6 00:41:37 2025 -0800 Add support for DropBox for hubs and customtracks using byteranges. fixes #35342 diff --git src/hg/htdocs/goldenPath/help/hgTrackHubHelp.html src/hg/htdocs/goldenPath/help/hgTrackHubHelp.html index 3e14e8778b5..09da8021cb0 100755 --- src/hg/htdocs/goldenPath/help/hgTrackHubHelp.html +++ src/hg/htdocs/goldenPath/help/hgTrackHubHelp.html @@ -1259,39 +1259,49 @@
Free webspace: If you do not want to pay for web space, and your institution does not provide a data location supporting byte-range requests, we know of at least the following sites where you can host research data and configuration files for free:
Each of the providers above has a slightly different approach to hosting data for compatibility with the UCSC Genome Browser, and may have different advantages and disadvantages, such as size limitations, usage statistics, and version control integration. Additionally, as previously mentioned, any provider that supports byte-range access will work for hub hosting, and you are not limited to the above sites. Below is a summarized guide for each of the providers mentioned above.
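Because byte-range support is the one hard requirement for any hosting provider, it can help to test a candidate URL before building a hub around it. The following is a minimal sketch, not part of the Genome Browser code: the function names are hypothetical, and it assumes that a server honoring the request answers with HTTP status 206 and a Content-Range header.

```python
import urllib.request


def build_range_request(url, first=0, last=0):
    """Build a GET request that asks only for bytes first..last (inclusive)."""
    req = urllib.request.Request(url, method="GET")
    req.add_header("Range", f"bytes={first}-{last}")
    return req


def honors_byte_range(status, header_names):
    """Return True when a response shows that the Range header was honored.

    Assumption: a server that supports byte ranges replies with
    206 Partial Content and includes a Content-Range header.
    A plain 200 means the whole file was sent instead.
    """
    names = {name.lower() for name in header_names}
    return status == 206 and "content-range" in names


def check_url(url):
    """Fetch the first byte of url and report whether the range was honored."""
    with urllib.request.urlopen(build_range_request(url), timeout=30) as resp:
        return honors_byte_range(resp.status, resp.headers.keys())
```

Calling `check_url` on the URL of one of your own data files returns `True` when the host served only the requested byte, which suggests the location is suitable for hub hosting.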
+ +Galaxy is an open-source platform for FAIR data analysis that enables users to streamline the analysis of genomic data and serves as a comprehensive toolkit for researchers, scientists, and bioinformaticians.
The Galaxy platform provides a user-friendly web interface that allows you to host your data. You can navigate to the Galaxy platform, log in to your account (or create an account if you haven't already), and use the data upload functionality provided to host your data. Once uploaded, the data will be stored securely on the platform and made available for your analysis.
CyVerse, previously known as the iPlant Collaborative, is an NSF-funded site created for assisting data scientists with their data storage and compute needs. Data hosting by CyVerse is free for academic groups and they support byte-range access, so they can be used for track hubs. However, CyVerse is sometimes slow, and may return error messages if your hub includes many tracks that are meant to be shown at the same time by your users.
In order to host your data on CyVerse, you first must create an account and then use their Discovery Environment to upload data. After creating an account and signing in, access the data screen by clicking the second icon on the left. Use the "Upload" button on the far right to import data from a URL or locally from your machine.
Note: if you need to replace files after they have been uploaded into CyVerse's DE and Public links
have already been created, you will need to force an update of the CyVerse cache. One way is to go back to the
Public links section and use the "Refresh Cache" button. Another way is to press Control-Shift-R
in your browser to force a reload of the file, or to send a Cache-Control: no-cache header: curl
--head --header 'Cache-Control: no-cache' https://data.cyverse.org/.../dataFile.bam
Github supports byte-range access to files when they are accessed via the raw.githubusercontent.com style URLs. To obtain a raw URL to a file already uploaded on Github, click on a file in your repository and click the Raw button:
The "Raw" button results in a plain text page like the following:
The bigDataUrl field (and any other statement pointing to a URL like @@ -1491,61 +1504,119 @@ relatively small file size upload limit compared to other hosting providers.
For an example public hub hosted on Github, please see the Human cellular microRNAome barChart hub.
For more information about moving files to Github, please see Github's help pages. Please direct any questions about Github to their help desk.
+Figshare is a site for researchers and institutions to upload and collect usage statistics on their data, as well as make their data shareable and discoverable. The process for uploading a hub to Figshare is similar to the process involved at CyVerse, where one must first create an account, upload the bigDataUrl files, create shareable links, and then edit your hub.txt, genomes.txt, and trackDb.txt appropriately. One advantage to using Figshare is their emphasis on usage statistics, so institutional accounts can see how often their hubs and tracks are being accessed by others.
Note that Figshare does not use filenames as part of the URLs, therefore bigDataUrl files that require a separate index file, like VCFs and BAM files, must have their index file location specified with a bigDataIndex. This keyword is relevant for Custom Tracks and Track Hubs. You can read more about bigDataIndex in the TrackDb Database Definition page.
If you are having issues hosting at Figshare, try using the file's download URL. This URL will have "ndownloader" in its path. Also, for custom tracks you will need to declare a track line with track type and bigDataUrl. Below is a simple example of a bigBed custom track on hg38:
track type=bigBed name="figshare example" bigDataUrl=https://figshare.com/ndownloader/files/38068053+ +
+DropBox is a site for users to share data files.
+DropBox has recently added support for byte-range requests, a feature required for hubs and custom tracks.
+The free DropBox Basic plan includes 2 GB of storage.
+Dropbox Plus costs $10 a month ($120 per year, and you must pay for the full year to get the lowest price)
+and provides 1 user with 2 TB of storage and a maximum individual file size of 50 GB.
+Many other plans are available for users, businesses, and schools.
+To host a hub, first create a DropBox account,
+then upload the bigDataUrl files or drag and drop them from local disk.
+DropBox also has an app, with support for MacOS and Android as well.
+Copy the share URL provided.
+
+Edit your hub.txt appropriately.
+Use the useOneFile on
hub setting, which allows the hub
+properties to be specified in a single file. More information about this setting can be found on the
+Genome Browser User Guide. If you would
+like to add metadata to your track hub, the following metadata guide
+contains examples of how to include the information in your tracks.
-For more information on using Figshare, please see their -Support Portal.
+Click share on each data file, choose copy link (Ctrl-C), +and paste the DropBox URL into the bigDataUrl field in the trackDb section of your useOneFile hub.txt. +It will look something like this: + ++ bigDataUrl https://www.dropbox.com/scl/fi/8t785o3sqidp0tmar91bf/dnaseRep3.bw?rlkey=37wucbhdvwqntw4ejvig4kg7c&st=11v8l216&dl=0 ++ +
+Click the share button on your hub.txt file and choose Copy Link (Ctrl-C). +Paste that DropBox hub URL into the hgHubConnect CGI tab. +
++ https://www.dropbox.com/scl/fi/6wrobg6wqcgtm7khew4qo/hub1.txt?rlkey=vi32q9tb68kjpn2xhy0qjuqkf&st=nldzp2tw&dl=0 ++
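The single-file hub described above can be assembled by hand, but a short script makes the layout explicit. This is a hedged sketch only: `render_one_file_hub` is a hypothetical helper, the labels and email address are placeholders, and the stanza order (hub settings, then genome, then one stanza per track) reflects the usual useOneFile layout rather than anything specific to DropBox.

```python
def render_one_file_hub(hub_name, email, genome, tracks):
    """Return the text of a hub.txt that uses the 'useOneFile on' setting.

    tracks is a list of dicts with the keys 'name', 'type', and 'url';
    each 'url' is the share link copied from the hosting site.
    """
    lines = [
        f"hub {hub_name}",
        f"shortLabel {hub_name}",
        f"longLabel {hub_name} (placeholder description)",
        "useOneFile on",
        f"email {email}",
        "",
        f"genome {genome}",
    ]
    for track in tracks:
        lines += [
            "",  # a blank line separates stanzas
            f"track {track['name']}",
            f"type {track['type']}",
            f"bigDataUrl {track['url']}",
            f"shortLabel {track['name']}",
            f"longLabel {track['name']} (placeholder description)",
        ]
    return "\n".join(lines) + "\n"


example = render_one_file_hub(
    "hub1",
    "user@example.org",  # placeholder address
    "hg38",
    [{
        "name": "dnaseRep3",
        "type": "bigWig",
        "url": "https://www.dropbox.com/scl/fi/8t785o3sqidp0tmar91bf/dnaseRep3.bw"
               "?rlkey=37wucbhdvwqntw4ejvig4kg7c&st=11v8l216&dl=0",
    }],
)
```

Writing `example` to a file named hub1.txt and uploading it gives a hub whose share link can be pasted into hgHubConnect as described above.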
+Note that DropBox does not use pathnames as part of the URLs, therefore bigDataUrl files +that require a separate index file, like VCFs and BAM files, must have their index file +location specified with a bigDataIndex. This keyword is relevant for Custom Tracks +and Track Hubs. You can read more about bigDataIndex in +the TrackDb Database Definition page. +
+ ++For custom tracks you will need to declare a track line with track type and bigDataUrl. + Below is a simple example of a bigWig custom track on hg38:
++track type=bigWig name="dropbox example" bigDataUrl=https://www.dropbox.com/scl/fi/8t785o3sqidp0tmar91bf/dnaseRep3.bw?rlkey=37wucbhdvwqntw4ejvig4kg7c&st=1z1jce0t&dl=0 ++ +
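Track lines such as the one above, and the indexed variants that need bigDataIndex, can be generated rather than typed. The sketch below is illustrative only: `custom_track_line` is a hypothetical helper, and the BAM and index URLs are placeholders, not real DropBox links.

```python
def custom_track_line(track_type, name, big_data_url, big_data_index=None):
    """Build a custom track line from a track type, a name, and a data URL.

    big_data_index is optional and is only needed for formats with a
    separate index file (for example BAM or VCF) when the index cannot
    be located from the data URL itself.
    """
    parts = [
        f"track type={track_type}",
        f'name="{name}"',
        f"bigDataUrl={big_data_url}",
    ]
    if big_data_index:
        parts.append(f"bigDataIndex={big_data_index}")
    return " ".join(parts)


# Reproduces the bigWig track line shown above.
bigwig_line = custom_track_line(
    "bigWig",
    "dropbox example",
    "https://www.dropbox.com/scl/fi/8t785o3sqidp0tmar91bf/dnaseRep3.bw"
    "?rlkey=37wucbhdvwqntw4ejvig4kg7c&st=1z1jce0t&dl=0",
)

# Placeholder URLs: substitute the share links of your own BAM and index files.
bam_line = custom_track_line(
    "bam",
    "indexed example",
    "https://example.org/share/AAAA/sample.bam",
    big_data_index="https://example.org/share/BBBB/sample.bam.bai",
)
```

Keeping the index URL as a separate argument mirrors the point made above: the share link for the index file is unrelated to the data file's link, so it has to be supplied explicitly.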
+For more information on using DropBox, please see their +DropBox Support.
+ ++DropBox also provides a folder feature to help organize and group related data. +
+
When your own institution's system administrators are hosting your data they may benefit from this section about ensuring a secure HTTPS configuration. The most popular web servers that system admins use are Apache and NGINX. Instructions for setting up these popular web servers are found all over the web, so this section will not cover those here.
Certs and Security
As security on the Internet becomes increasingly important, SSL certificates are often
required for proper server installation. Proper certificate validation helps stop
"Man-In-The-Middle" attacks by ensuring that connections go to the correct
server and not some fake imposter site. This process requires SSL certificates that
have not expired, and whose domain name matches the domain name specified in the HTTPS URL.