c717ea66509b0427bf27edc497f1c7f7d2e9ea95 dschmelt Mon Jul 11 12:52:35 2022 -0700 Adding section about genArk to the page, no RM diff --git src/hg/htdocs/FAQ/FAQdownloads.html src/hg/htdocs/FAQ/FAQdownloads.html index 819bcbf..9e034ea 100755 --- src/hg/htdocs/FAQ/FAQdownloads.html +++ src/hg/htdocs/FAQ/FAQdownloads.html @@ -40,30 +40,31 @@ <li><a href="#download18">Obtaining promoter sequence</a></li> <li><a href="#download19">Data from Evolutionary Conservation Score tracks</a></li> <li><a href="#download20">Minus strand coordinates - axtNet files</a></li> <li><a href="#download21">Mapping UCSC STS marker IDS to those of other groups</a></li> <li><a href="#download22">deCODE map data</a></li> <li><a href="#download29">Direct MariaDB (MySQL) access to data</a></li> <li><a href="#download34">Name of fourth column in BED output</a></li> <li><a href="#download36">Track data access</a></li> <li><a href="#snp">How do I download dbSNP data?</a></li> <li><a href="#snpAlleles">Why doesn't this SNP have two alleles?</a></li> <li><a href="#download37">Known issues with Table Browser GTF output</a></li> <li><a href="#download38">Table Browser output file not ordered</a></li> <li><a href="#download39">'Permisssion denied' error when trying to use command-line utilities</a></li> <li><a href="#download40">Restricted Track Data</a></li> <li><a href="#downloadAnalysis">What is the genome analysis set?</a></li> +<li><a href="#downloadGenArk">How do I download GenArk data?</a></li> </ul> <hr> <p> <a href="index.html">Return to FAQ Table of Contents</a></p> <a name="download1"></a> <h2>Downloading sequence and annotation data</h2> <h6>How do I obtain the sequence and/or annotation data for a release?</h6> <p> Sequence and annotation data downloads are usually made available within the first week of the release of a new assembly. 
The download directories are automatically updated nightly to incorporate additions and modifications to the data.</p> <p> You can download sequence and annotation data <a href="../goldenPath/help/ftp.html">using our FTP server</a>, but we recommend using rsync, which has the advantage of starting up where it left off @@ -1173,16 +1174,43 @@ <li>Removal of alternate and fix sequences which can interfere with read alignment programs</li> <li>Hard masking of duplicate copies of the pseudo-autosomal regions (PARs) and centromeric arrays<li> <li>Addition of "decoy" sequences</li> <li>Index files generated by BWA, Samtools, Bowtie and HISAT2</li></ul> <p> For more information on analysis sets, see the <a href="https://www.ncbi.nlm.nih.gov/genome/doc/ftpfaq/#seqsforalign" target="_blank">NCBI FAQ</a>. Information on what is contained in each specific assembly analysis set can be found in the README by clicking the <strong>Genome sequence files</strong> link for the assembly of interest in our <a href="http://hgdownload.soe.ucsc.edu/downloads.html">Downloads page</a>. </p> +<a name="downloadGenArk"></a> +<h2>GenArk Downloads</h2> +<h6>How do I download GenArk assembly hub data for my species?</h6> +<p> +More than 2,000 GenArk genomes are displayed as assembly hubs rather than as native +assemblies such as hg38 and mm39. These Genome Browsers can be accessed from our +<a href="../cgi-bin/hgGateway">Genomes page</a> by searching for the common name or GCA/GCF +accession. You can also access the browsers for these species directly with links in the +following format:</p> +<pre><a href="https://genome.ucsc.edu/h/GCF_000951035.1">https://genome.ucsc.edu/h/GCF_000951035.1</a></pre> +<p> +The download data for these assemblies is stored in a different location +than our goldenPath, SQL, or gbdb file directories. There are two ways to access +this data for download. 
First, you can go to the +<a href="https://hgdownload.soe.ucsc.edu/hubs">GenArk page</a> +and select your clade (primates, mammals, birds, etc.). This brings +you to a page with a table of species and +GCA/GCF assembly identifiers. Find your genome and click the link in the third column, +labeled "Scientific name and data download", which takes you to the download +directory for that species. +</p><p> +Alternatively, you can build the URL yourself: split your GCA/GCF identifier +into groups of three characters (the GCA or GCF prefix, then the nine digits), separated by slashes, and leave out the version suffix. For example, the +identifier "GCA_004027835.1" has data in the following directory:</p> +<pre>https://hgdownload.soe.ucsc.edu/hubs/GCA/004/027/835/</pre> + <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->
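Reviewer note: the identifier-to-directory rule described in the new GenArk section can be sketched in a few lines of Python. This is only an illustration, not code from the Genome Browser source tree. The function name `genark_download_url` and the constant `HUBS_BASE` are hypothetical, and the sketch assumes every identifier has the form GCA or GCF, an underscore, nine digits, and a numeric version suffix, as in the two examples on the page.

```python
import re

# Base of the GenArk download tree, taken from the example URL in the FAQ text.
HUBS_BASE = "https://hgdownload.soe.ucsc.edu/hubs"


def genark_download_url(accession: str) -> str:
    """Build the download directory URL for a GCA/GCF assembly identifier.

    Assumes the form GCA_#########.# or GCF_#########.# (hypothetical
    helper; the accepted format is inferred from the FAQ examples).
    """
    match = re.fullmatch(r"(GC[AF])_(\d{9})\.\d+", accession.strip())
    if match is None:
        raise ValueError(f"unexpected assembly identifier: {accession!r}")
    prefix, digits = match.groups()
    # Split the nine digits into three groups of three characters.
    groups = [digits[i:i + 3] for i in range(0, 9, 3)]
    return "/".join([HUBS_BASE, prefix, *groups]) + "/"


if __name__ == "__main__":
    print(genark_download_url("GCA_004027835.1"))
```

For the identifier used in the FAQ text, "GCA_004027835.1", this returns the directory shown on the page; anything that does not match the assumed format raises a ValueError rather than producing a guessed URL.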