cb55bf7cf5fe6f9561a8d026075252ee380b008e
brianlee
Sat Jun 11 07:34:50 2022 -0700
Adding to FAQ/FAQformat.html 2bit entry link to twoBit.html with extraction example of sequence ref #29548

diff --git src/hg/htdocs/FAQ/FAQformat.html src/hg/htdocs/FAQ/FAQformat.html
index 6eff1c2..a67b3b3 100755
--- src/hg/htdocs/FAQ/FAQformat.html
+++ src/hg/htdocs/FAQ/FAQformat.html
@@ -928,31 +928,35 @@
 <strong>maskBlockStarts</strong> - an array of length maskBlockCount of 32-bit integers
 indicating the (0-based) starting position of a masked block</li>
 <li>
 <strong>maskBlockSizes</strong> - an array of length maskBlockCount of 32-bit integers
 indicating the length of a masked block</li>
 <li>
 <strong>reserved</strong> - always zero for now</li>
 <li>
 <strong>packedDna</strong> - the DNA packed to two bits per base, represented as follows:
 T - 00, C - 01, A - 10, G - 11. The first base is in the most significant 2 bits of the byte;
 the last base is in the least significant 2 bits. For example, the sequence TCAG is
 represented as 00011011.</li>
 </ul>
 <p>
 For a complete definition of all fields in the twoBit format, see
 <a href="http://genome-source.soe.ucsc.edu/gitlist/kent.git/raw/master/src/inc/twoBit.h">this</a>
-description in the source code.</p>
+description in the source code. Click these links to see examples of using the
+<a href="../../goldenPath/help/twoBit.html" target="_blank"><code>faToTwoBit</code>,
+<code>twoBitInfo</code>, and <code>twoBitToFa</code></a> commands, and how to
+<a href="../../goldenPath/help/twoBit.html#extract" target="_blank">extract DNA</a> from 2bit
+files, including with our <a href="../../goldenPath/help/api.html" target="_blank">API</a>.</p>
 <a name="format8"></a>
 <h2>.nib format</h2>
 <p>
 The .nib format pre-dates the .2bit format and is less compact. It describes a DNA sequence by
 packing two bases into each byte. Each .nib file contains only a single sequence.
 The file begins with a 32-bit signature that is 0x6BE93D3A in the native byte order of the
 machine that created the file (or possibly a byte-swapped version of the same number on another
 machine). This is followed by a 32-bit number in the same format that describes the number of
 bases in the file. Next, the bases themselves are listed, packed two bases to the byte. The first
 base is packed in the high-order 4 bits (nibble); the second base is packed in the low-order
 4 bits:</p>
 <pre><code>byte = (base1&lt;&lt;4) + base2
 </code></pre>
 <p>
 The numerical representations for the bases are:</p>