d5932e4051a4885f0036831cdfc5fbf1eb08a499 lrnassar Tue Oct 10 17:16:27 2023 -0700
Fixing broken blog links, refs #32439

diff --git src/hg/htdocs/goldenPath/help/twoBit.html src/hg/htdocs/goldenPath/help/twoBit.html
index 2fa2f16..a7eb4cb 100755
--- src/hg/htdocs/goldenPath/help/twoBit.html
+++ src/hg/htdocs/goldenPath/help/twoBit.html
@@ -1,68 +1,68 @@
 <!DOCTYPE html>
 <!--#set var="TITLE" value="Genome Browser TwoBit Sequence" -->
 <!--#set var="ROOT" value="../.." -->
 <!-- Relative paths to support mirror sites with non-standard GB docs install -->
 <!--#include virtual="$ROOT/inc/gbPageStart.html" -->
 <h1>TwoBit Sequence Archives</h1>
 <p>
 A twoBit file is a highly efficient way to store genomic sequence. The format is defined
 <a href="http://genome.ucsc.edu/FAQ/FAQformat.html#format7">here</a>. Note that lower-case
 nucleotides are considered masked in twoBit, which can cause such sequence to be ignored when using
 the <code>-mask</code> option with <code>gfServer</code>; therefore, you may wish to convert
 lower-case sequence to upper-case when preparing the FASTA format.</p>
 <p>
 To complete the steps below you must first download the <code>faToTwoBit</code>,
 <code>twoBitInfo</code>, and <code>twoBitToFa</code> utilities. For more information on downloading
 our command-line utilities, see these
 <A href="http://hgdownload.soe.ucsc.edu/downloads.html#source_downloads">instructions</a>.</p>
 <p>
 To create a twoBit file, follow these steps:
 <ol>
 <li>
 Prepare the sequence for your twoBit file in a FASTA-formatted file (i.e.
 genome.fa).</li>
 <li>
 Run the <code>faToTwoBit</code> program on your FASTA file:
 <pre><code> faToTwoBit genome.fa genome.2bit</code></pre></li>
 <li>
 Use <code>twoBitInfo</code> to verify the sequences in this assembly and create a chrom.sizes
 file, which is useful to construct the big* files in later processing steps:<br>
 <pre><code> twoBitInfo genome.2bit stdout | sort -k2rn > genome.chrom.sizes</code></pre></li>
 </ol>
 <p>
 The twoBit commands can function with the .2bit file as a URL:
 <pre><code> twoBitInfo -udcDir=. http://your-website.edu/~user/genome.2bit | sort -k2nr > genome.chrom.sizes</code></pre></p>
 <p>
 Sequence can be extracted from the .2bit file with the <code>twoBitToFa</code> command, for example:
 <pre><code> twoBitToFa -seq=chr1 -udcDir=. http://your-website.edu/~user/genome.2bit stdout > genome.chr1.fa</code></pre></p>
 <h3 id=extract>Examples of extracting sequences</h3>
 <p>
-See these series of blog posts about <a href="http://genome.ucsc.edu/blog/?s=programmatic"
+See these series of blog posts about <a href="https://genome-blog.gi.ucsc.edu/blog/?s=programmatic"
 target="_blank">Accessing the Genome Browser Programmatically</a> to see examples of extracting
 sequences remotely, such as the following:
 <pre>
 $ twoBitToFa http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.2bit:chr1:100100-100200 stdout
 >chr1:100100-100200
 gcctagtacagactctccctgcagatgaaattatatgggatgctaaatta
 taatgagaacaatgtttggtgagccaaaactacaacaagggaagctaatt
 </pre></p>
 <p>
 Also, see the <a href="api.html#getData_examples" target="_blank">API getData functions</a>
 to see examples of using the URL, such as the following:</p>
 <p>
 <a href="https://api.genome.ucsc.edu/getData/sequence?genome=hg38;chrom=chr1;start=100100;end=100200"
 target="_blank">https://api.genome.ucsc.edu/getData/sequence?genome=hg38;chrom=chr1;start=100100;end=100200</a>
 <pre>
 downloadTime: "2022:05:19T18:45:56Z"
 downloadTimeStamp: 1652985956
 genome: "hg38"
 chrom: "chr1"
 start: 100100
 end: 100200
 dna:
 "gcctagtacagactctccctgcagatgaaattatatgggatgctaaattataatgagaacaatgtttggtgagccaaaactacaacaagggaagctaatt"
 </pre></p>
 <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->
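
Reviewer note: the getData request shown in the patched page can also be issued from a script. The following is a minimal sketch, not part of the commit, assuming the endpoint returns JSON with the `chrom`, `start`, `end`, and `dna` fields listed in the sample output above. The function names (`sequence_url`, `parse_sequence`, `fetch_sequence`, `to_fasta`) are hypothetical and exist only for illustration.

```python
import json
from urllib.request import urlopen

# Endpoint taken from the example URL in the help page.
API_BASE = "https://api.genome.ucsc.edu/getData/sequence"


def sequence_url(genome, chrom, start, end):
    """Build a request URL in the semicolon-separated style the page uses."""
    return f"{API_BASE}?genome={genome};chrom={chrom};start={start};end={end}"


def parse_sequence(payload):
    """Extract (chrom, start, end, dna) from the JSON text of a response.

    Assumes the field names match the sample output in the help page.
    """
    record = json.loads(payload)
    return record["chrom"], record["start"], record["end"], record["dna"]


def fetch_sequence(genome, chrom, start, end):
    """Request a region over the network and return the parsed fields."""
    with urlopen(sequence_url(genome, chrom, start, end)) as response:
        return parse_sequence(response.read().decode("utf-8"))


def to_fasta(chrom, start, end, dna, width=50):
    """Format a region as FASTA, mirroring the twoBitToFa output layout."""
    lines = [f">{chrom}:{start}-{end}"]
    lines += [dna[i:i + width] for i in range(0, len(dna), width)]
    return "\n".join(lines) + "\n"
```

Calling `fetch_sequence("hg38", "chr1", 100100, 100200)` and passing the result to `to_fasta` should yield a record laid out like the `twoBitToFa` example in the page, provided the service is reachable and the response fields match the assumption above.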