96c750edb89fd5bde5eca4dbf167578f25bb00b4
brianlee
  Mon Jun 7 16:16:21 2021 -0700
Updating CyVerse section and cleaning up some htmlValidate check paragraph items. refs #26511

diff --git src/hg/htdocs/goldenPath/help/hgTrackHubHelp.html src/hg/htdocs/goldenPath/help/hgTrackHubHelp.html
index f4caf49..3bab9da 100755
--- src/hg/htdocs/goldenPath/help/hgTrackHubHelp.html
+++ src/hg/htdocs/goldenPath/help/hgTrackHubHelp.html
@@ -42,31 +42,31 @@
   Groupings</a></li> 
   <li>
   <a href="hubQuickStartAssembly.html" target="_blank">Quick Start Guide to Assembly Hubs with 
   Blat</a></li> 
   <li><a href="hubQuickStartSearch.html" target="_blank">Quick Start Guide to Searchable Track 
   Hubs</a></li>
 </ul> 
 <div> 
 <form name="googleForm1" method="GET" action="http://www.google.com/search" onSubmit="document.googleForm1.q.value=document.googleForm1.qq.value+'   site:genome.ucsc.edu/goldenPath/help';"> 
   <p>
   Search the Genome Browser help pages: &nbsp; 
   <input type="hidden" name="q" value=""> 
   <input type="hidden" name="num" value="10"> 
   <input type="hidden" name="filter" value="0"> 
   <input type=text name=qq size=30 maxlength=255 value=""> 
-  <input type="submit" value="Submit"> 
+  <input type="submit" value="Submit"></p>
 </form> 
 </div> 
 <p> 
 <a href="../../contacts.html">Questions and feedback are welcome</a>.</p>
 
 <!-- ========== What Are Track Hubs? ============================== -->
 <a name="Intro"></a>
 <h2>What Are Track Hubs?</h2> 
 <p>
 Track hubs are web-accessible directories of genomic data that can be viewed on the UCSC Genome 
 Browser (please note that hosting hub files on HTTP tends to work even better than FTP and local 
 hubs can be displayed on <a  href="hubQuickStartAssembly.html#blatGbib" target="_blank">GBiB</a>).  
 Track hubs can be displayed on genomes that UCSC directly supports, or on your own sequence. Hubs 
 are a useful tool for visualizing a large number of genome-wide data sets. For example, a project 
 that has produced several wiggle plots of data can use the hub utility to organize the tracks into 
@@ -436,36 +436,36 @@
 <strong><em>Example 2:</em></strong> Sample hub.txt file defining attributes for the track hub 
 shown in <em>Example 1</em>.</p>
 <pre><code><strong>hub</strong> UCSCHub
 <strong>shortLabel</strong> UCSC Hub
 <strong>longLabel</strong> UCSC Genome Informatics Hub for human DNase and RNAseq data
 <strong>genomesFile</strong> genomes.txt
 <strong>email</strong> genome@soe.ucsc.edu
 <strong>descriptionUrl</strong> ucscHub.html </code></pre>
 <hr>
 <p>
 <strong>Step 5. Create the genomes.txt file</strong><br>
 Create a genomes.txt file within the track hub directory that contains one stanza for each genome 
 assembly supported by the hub data, with stanzas separated by an empty line. Each stanza gives
 the location of the trackDb file that defines display properties for each track in that
 assembly, as well as an optional metadata storage file.</p>
-<pre><code><strong>genome</strong> <em>assembly_database_1</em>
+<pre><strong>genome</strong> <em>assembly_database_1</em>
 <strong>trackDb</strong> <em>assembly_1_path/trackDb.txt</em>
-<strong>metaTab</strong> <em>assembly_1_path/tabSeparatedFile.txt</em> </code></pre>
+<strong>metaTab</strong> <em>assembly_1_path/tabSeparatedFile.txt</em></pre>
 <pre><strong>genome</strong> <em>assembly_database_2</em>
-<strong>trackDb</strong> <em>assembly_2_path/trackDb.txt</em> </code>
-<strong>metaDb</strong> <em>assembly_2_path/tagStormFile.txt</em> </code></pre>
+<strong>trackDb</strong> <em>assembly_2_path/trackDb.txt</em>
+<strong>metaDb</strong> <em>assembly_2_path/tagStormFile.txt</em></pre>
 <p>
 <em>genome</em> - a valid UCSC database name. Each stanza must begin with this tag and each stanza 
 must be separated by an empty line.</p> 
 <p>
 <em>trackDb</em> - the relative path of the trackDb file for the assembly designated by the 
 <em>genome</em> tag. By convention, the trackDb file is located in a subdirectory of the hub 
 directory. However, the trackDb tag may also specify a complete URL.</p>
 <p><em>metaDb</em> - the path to an optional tagStorm file that has the metadata for each track. 
 Each track with metadata should have a &quot;meta&quot; tag specified in the trackDb stanza for 
 that track and a &quot;meta&quot; tag in the tagStorm file.</p>
 <p><em>metaTab</em> - the path to an optional tab separated file that has the metadata for each 
 track. Each track with metadata should have a &quot;meta&quot; tag specified in the trackDb stanza 
 for that track and a &quot;meta&quot; tag  in the tab separated file. The first line of the TSV 
 file should start with a '#' and have the field names for each column, one of them being 
 &quot;meta&quot;.</p>
@@ -674,56 +674,55 @@
 genome.txt, and trackDb.txt settings and displays warnings and errors in bright red font, such 
 as "<font color="red">Missing required setting...</font>" and 
 "<font color="red">Cannot open...</font>". The "Display load times" and "Enable
 hub refresh" optional settings show the load timing at the bottom of the Genome Browser page
 and allow instant hub refresh instead of 5 minute refresh. These options can be checked and
 activated by clicking "View Hub on Genome Browser". The following picture shows 
 <a href="examples/hubExamples/hubGroupings/hub.txt">the example track grouping hub</a>
 with the warning that the hub has no hub description page, no configuration errors,
 and "Display load times" checked:</p>
 
 <p class='text-center'>
   <img class='text-center' src="../../images/hubDevelopment.png" 
 alt="The Hub Development tool checks config setting" width="749" height="249">
   <p class='gbsCaption text-center'>The Hub Development tool checks for proper configuration 
 files and track hub settings, and allows access to debugging settings.</p>
-</p>
 
 <h3>Check hub settings using hubCheck utility</h3>
 <p>
 It is a good practice to run the command-line utility <em>hubCheck</em> on your track hub when you
 first bring it online and whenever you make significant changes. This utility by default checks
 that the files in the hub are correctly formatted, but it can also be configured to check a few
 other things including that various trackDb settings are correctly spelled and that they are
 supported by the UCSC Genome Browser. You can read more about using hubCheck to check the
-compatibility of your hub with other genome browsers <a href="#Compatibility"</a>below</a>.
+compatibility of your hub with other genome browsers <a href="#Compatibility">below</a>.</p>
 
 <p>
 Here is the usage statement for the hubCheck utility:
 <pre><code>hubCheck - Check a track data hub for integrity.
 usage:
    hubCheck http://yourHost/yourDir/hub.txt
 options:
    -checkSettings        - check trackDb settings to spec
    -version=[v?|url]     - version to validate settings against
                                      (defaults to version in hub.txt, or current standard)
    -extra=[file|url]     - accept settings in this file (or url)
    -level=base|required  - reject settings below this support level
    -settings             - just list settings with support level
                            Will create this directory if not existing
    -noTracks             - don't check remote files for tracks, just trackDb (faster)
-   -udcDir=/dir/to/cache - place to put cache for remote bigBeds and bigWigs </code></pre>
+   -udcDir=/dir/to/cache - place to put cache for remote bigBeds and bigWigs </code></pre></p>
 <p>
 Note that you will have to use the udcDir if /tmp/udcCache is not writable on your machine.</p>
 <p>
 The hubCheck program is available from the UCSC downloads server at 
 <a href="http://hgdownload.soe.ucsc.edu/admin/exe/">http://hgdownload.soe.ucsc.edu/admin/exe/</a>.</p>
 
 <a name="troubleConnecting"></a>
 <h3>Troubleshooting Track Hub connections</h3>
 <p>
 If the browser is unable to load a track hub, it will display an error message. Some common causes
 for an import to fail include typos in the URL, a hub server that is offline, 
 or errors in the track hub configuration files. Occasionally, remote track
 hubs may be missing, off-line, or otherwise unavailable. If a user is
 already browsing data from the remote hub when it disconnects, a yellow error message will be
 displayed instead of the expected data.</p>
@@ -845,31 +844,31 @@
 <p>
 <strong><em>Example 3:</em></strong> Checking your settings against those provided by UCSC and another
 source, such as Ensembl.</p>
 <p>
 If you want to check the settings in your hub against those supported by other genome browsers,
 you will first need to create a single-column file that lists each non-UCSC setting and then
 use the &quot;-extra=&quot; option to specify this file when running hubCheck. For example,
 if you knew that a setting called &quot;ensemblAssemblyName&quot; was supported for use in track
 hubs by Ensembl, you could create a single line file that included  the setting
 &quot;ensemblAssemblyName&quot;. Then, when you want to check a hub that includes these extra
 trackDb settings, you would then specify this extra settings file on the command line:</p>
 <pre><code>$ hubCheck  -checkSettings -extra=http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubCheckUnsupportedSettings/myExtraSettings.txt http://genome.ucsc.edu/goldenPath/help/examples/hubExamples/hubCheckUnsupportedSettings/hub.txt
 </code></pre>
 <p>
 (Note: The settings listed here in the &quot;extra&quot; file are
-just examples and do not represent real trackDb variables for hubs at Ensembl.)
+just examples and do not represent real trackDb variables for hubs at Ensembl.)</p>
 
 <!-- ========== Where to host your data ============================== -->
 <a name="Hosting"></a>
 <h2>Where to host your data?</h2> 
 As stated in <a href="#Intro">What Are Track Hubs?</a>, track hub files must be located
 in web-accessible locations that support byte-range requests. Four options for hosting include:
 <ul>
   <li>Your institution's Information Technology services
   <li>Commercial webspace providers
   <li>Commercial cloud providers
   <li>Free webspace providers
 </ul>
 <p>
 <b>Your Institution:</b> Many universities provide a location for researchers
 to place shareable data on the web and contacting your institution's system
@@ -909,107 +908,104 @@
 OneDrive, Tencent Weiyun, Yandex.Disk, etc.) do not work reliably as their business
 model requires rare and rate-limited data access, which is too slow or too limited for
 genome annotation display. However, commercial
 cloud <b>storage</b> offers that charge per GB transferred (Amazon S3, Microsoft
 Azure Storage, Google Cloud Storage, Backblaze, Alibaba Object Store, etc.)
 typically do work. As of 2020, they cost around 2-3 US cents/GB/month to store
 the hub data and 12-18 US cents per GB transferred, when the hub is used.
 For optimal performance, select a San Francisco / San Jose data center for the main
 UCSC site genome.ucsc.edu,  a Frankfurt/Germany data center for
 genome-euro.ucsc.edu and a Tokyo data center for genome-asia.ucsc.edu. You may
 also want to review this discussion about issues with
 <a href="http://genomewiki.ucsc.edu/index.php/Cloud-storage_providers_and_byte-range_requests_of_UCSC_big*_files"
 target="_blank">distributed storage servers</a>. <b>These services are external
 to UCSC and may change.</b></p>
 
-</p><b>Free webspace:</b> If you do not want to pay for web space,
+<p><b>Free webspace:</b> If you do not want to pay for web space,
 and your institution does not provide a data location supporting byte-range requests,
 we know of at least the following sites where you can host
 research data and configuration files for free:
 <ul>
   <li><a href="https://de.cyverse.org/de/" target="_blank">CyVerse Discovery Environment</a> - lots of space, but can be relatively slow to display</li>
   <!--<li><a href="https://usegalaxy.org/" target="_blank">Galaxy</a></li>-->
   <li><a href="https://github.com/" target="_blank">Github</a> - files limited to 100MB, but very fast</li>
   <li><a href="https://figshare.com/" target="_blank">Figshare</a> - not limited and fast, but every file needs to be uploaded individually and cannot be changed. Optimal for very stable links, e.g. in publications.</li>
 </ul>
+<p>
 Each of the providers above has a slightly different approach to hosting data for
 compatibility with the UCSC Genome Browser, and may have different advantages and disadvantages,
 such as size limitations, usage statistics, and version control integration. Additionally, as 
 previously mentioned, any provider that supports byte-range access will work for hub hosting, 
 and you are not limited to the above sites. Below is a summarized guide for
 each of the providers mentioned above.</p>
 
 <h3>Hosting Hubs on CyVerse</h3>
 <p>
 <a href="http://www.cyverse.org/" target="_blank">CyVerse</a>, previously known as the iPlant Collaborative, is 
 an NSF-funded site created for assisting data scientists with their data storage and compute 
 needs. Data hosting by CyVerse is free for academic groups and they support byte-range access, 
 so they can be used for track hubs. However, CyVerse is sometimes slow, 
 and may result in error messages if your hub includes many tracks that are 
 meant to be shown at the same time by your users.</p>
-
-<p>In order to host your data on CyVerse, you first must create an account and then use their
-<a href="https://de.cyverse.org/de" target="_blank">Discovery Environment</a> to upload data. After creating an 
-account, use the &quot;Upload&quot; and &quot;Simple Upload&quot; buttons to upload files 
-individually as shown below:
+<p>
+In order to host your data on CyVerse, you first must create an account and then use their
+<a href="https://de.cyverse.org/de" target="_blank">Discovery Environment</a> to upload data.
+After creating an account and signing in, access the data screen by clicking the second icon
+on the left.  Use the &quot;Upload&quot; button on the far right to import data from a URL or
+locally from your machine.
 <div class="text-center">
-  <img height="400px" src="../../images/cyverseUploadButton.png">
-</div>
+  <img height="150px" src="../../images/cyverseUploadButton.png">
+</div></p>
 <p>
 You can also use the command line utility
-<a href="https://pods.iplantcollaborative.org/wiki/display/DS/Setting+Up+iCommands" target="_blank">iCommands</a> 
-to facilitate bulk transfer of data (best used for large files in the 2-100 GB range), or use 
-<a href="https://pods.iplantcollaborative.org/wiki/display/DS/Using+Cyberduck+for+Uploading+and+Downloading+to+the+Data+Store" target="_blank">
-Cyberduck</a> to bulk transfer up to 80 GB of data in one go.</p>
-
-<p>
-After uploading some data, check the &quot;Info-Type&quot; of your BAM, bigWig, bigBed, etc. files.
-If an Info-Type has not been selected automatically or if it is incorrect, make sure it 
-is correct. If uploading an assembly hub, assign the Info-Type &quot;bed&quot; to the 2bit file, as
-well as any text files, like your trackDb.txt, groups.txt, or description.html.
-</p>
-
-<p>
-After giving an appropriate type (like &quot;bam&quot;) to your binary files, you must update any
-text files to point to CyVerse locations. For example, your <em>hub.txt</em> will contain a line
-like:
-<pre><code>genomesFile genomes.txt</code></pre>
-<p>Which must be edited to point to a CyVerse URL such as:</p>
-<pre><code>genomesFile https://data.cyverse.org/dav-anon/iplant/home/...</code></pre>
-<p>Luckily, CyVerse allows you to edit these text files after uploading them, so you can create a 
-&quot;Send To: Genome Browser&quot; link:</p>
+<a href="https://cyverse.atlassian.net/wiki/spaces/DS/pages/241869823/Setting+Up+iCommands"
+target="_blank">iCommands</a>  to facilitate bulk transfer of data (best used for
+large files in the 2-100 GB range), or use
+<a href="https://cyverse.atlassian.net/wiki/spaces/DS/pages/241869843/Using+Cyberduck+for+Uploading+and+Downloading+to+the+Data+Store"
+target="_blank">Cyberduck</a> to bulk transfer up to 80 GB of data in one go.</p>
+<p>
+Once your file is available, use the three dots on the far right to select the &quot;Public
+Link(s)&quot; option.</p>
 <div class="text-center">
-  <img height="400px" src="../../images/cyverseSendToGenomeBrowser.png">
+  <img height="250px" src="../../images/cyverseCreatePublicLink.png">
 </div>
-<p>And then edit the fields of your <em>hub.txt</em>, <em>genomes.txt</em>, and <em>trackDb.txt</em> 
-files like so:</p>
+<p>
+Select this option for all the files you will be using with the Genome Browser, whether they
+are text-based files (trackDb.txt, groups.txt, description.html, etc.) or binary-indexed files
+(BAM, bigWig, bigBed, etc.) requiring byte-range access. Note that if you have a dataFile.bam,
+you must also have a matching dataFile.bam.bai index file, and both must have public
+links created.</p>
+<p>
+After creating public links to your binary files, you must ensure your text files (e.g., trackDb.txt)
+point to the CyVerse locations for the files.  For instance, the bigDataUrl setting will need to point
+to the location of the BAM, bigWig, or bigBed file (e.g., <code>bigDataUrl https://data.cyverse.org/...
+/dataFile.bam</code>).</p>
 <div class="text-center">
-  <img height="500px" width="1000px" src="../../images/cyverseEditedPaths.png">
+  <img height="200px" src="../../images/cyverseCreatePublicLink2.png">
 </div>
-<p>To get the correct links to bigData files, again be sure to use the &quot;Send To: Genome 
-Browser&quot; links in the menu.
-</p>
-
 <p>
-Please see the <a href="https://wiki.cyverse.org/wiki/display/DEmanual/Viewing+Genome+Files+in+a+Genome+Browser" target="_blank">
-Viewing Genome Files in a Genome Browser</a> wiki page on the CyVerse wiki for more information
-(please note the difference of the Data Commons for final curated publication material,
-and the Discovery Environment for developing data).
+The hub.txt file (if not using the <a href="hgTracksHelp.html#UseOneFile"
+target="_blank">useOneFile on</a> setting) will need to point to the related
+genomes.txt location, which in turn points to the trackDb.txt location,
+using these full https://data.cyverse.org/... links as well.</p>
+<p>
+Please see the <a href="https://cyverse.atlassian.net/wiki/spaces/DEmanual/pages/242027070/Using+the+Discovery+Environment"
+target="_blank">Using the Discovery Environment</a> page on the CyVerse wiki for more information.
 Please direct any questions about CyVerse or the Discovery Environment to their 
-<a href="http://www.cyverse.org/learning-center/ask-cyverse" target="_blank">Ask CyVerse</a> page or contact
-Cyverse support staff directly via the blue Intercom button on the bottom right of the
-Discovery Environment page.
+<a href="https://cyverse.org/contact" target="_blank">Contact Us</a> page, or use &quot;Chat with
+CyVerse support&quot; via the blue question box icon on the top right of the
+Discovery Environment page.</p>
 <!-- Galaxy stub
 <h3>Hosting Hubs on Galaxy</h3>
 <p>
 Galaxy is 
 </p>
 -->
 <h3>Hosting Hubs on Github</h3>
 <p>
 <a href="https://github.com" target="_blank">Github</a> supports byte-range access to files when they are accessed
 via the <em><b>raw.githubusercontent.com</b></em> style URLs. To obtain a raw URL to a file already 
 uploaded on Github, click on a file in your repository and click the <em>Raw</em> button:</p>
 <div class="text-center">
   <img height="275px" width="55%" src="../../images/githubRawLink.png" alt="Location of the Raw button for generating a plaintext URL to a file hosted on Github.">
 </div>
 <p>