b44df11903f0c8537dc9d716df324906bc89f92b lrnassar Thu Jun 15 13:21:32 2023 -0700 Adding documentation supporting snakes rearrangement display, refs #31241 diff --git src/hg/htdocs/goldenPath/help/chain.html src/hg/htdocs/goldenPath/help/chain.html index a8ce96e..cf1b183 100755 --- src/hg/htdocs/goldenPath/help/chain.html +++ src/hg/htdocs/goldenPath/help/chain.html @@ -1,89 +1,112 @@ <!DOCTYPE html> <!--#set var="TITLE" value="Genome Browser Chain Format" --> <!--#set var="ROOT" value="../.." --> <!-- Relative paths to support mirror sites with non-standard GB docs install --> <!--#include virtual="$ROOT/inc/gbPageStart.html" --> <h1>Chain Format</h1> <p> The chain format describes a pairwise alignment that allow gaps in both sequences simultaneously. Each set of chain alignments starts with a header line, contains one or more alignment data lines, and terminates with a blank line. The format is deliberately quite dense.</p> <p> <strong><em>Example:</em></strong></p> <pre><code> chain 4900 chrY 58368225 + 25985403 25985638 chr5 151006098 - 43257292 43257528 1 9 1 0 10 0 5 61 4 0 16 0 4 42 3 0 16 0 8 14 1 0 3 7 0 48 chain 4900 chrY 58368225 + 25985406 25985566 chr5 151006098 - 43549808 43549970 2 16 0 2 60 4 0 10 0 4 70</code></pre> <p> <strong>Header Lines</strong> <p> <pre><code> <strong>chain</strong> <em>score</em> <em>tName</em> <em>tSize</em> <em>tStrand</em> <em>tStart</em> <em>tEnd</em> <em>qName</em> <em>qSize</em> <em>qStrand</em> <em>qStart</em> <em>qEnd</em> <em>id</em> </code></pre> <p> The initial header line starts with the keyword <strong><code>chain</code></strong>, followed by 11 required attribute values, and ends with a blank line. The attributes include: <ul> <li> <strong><code>score</code></em></strong> -- chain score</li> <li> <strong><code>tName</code></strong> -- chromosome (reference/target sequence)</li> <li> <strong><code>tSize</code></strong> -- chromosome size (reference/target sequence)</li> <li> <strong><code>tStrand</code></strong> -- strand (reference/target sequence)</li> <li> <strong><code>tStart</code></strong> -- alignment start position (reference/target sequence)</li> <li> <strong><code>tEnd</code></strong> -- alignment end position (reference/target sequence)</li> <li> <strong><code>qName</code></strong> -- chromosome (query sequence)</li> <li> <strong><code>qSize</code></strong> -- chromosome size (query sequence)</li> <li> <strong><code>qStrand</code></strong> -- strand (query sequence)</li> <li> <strong><code>qStart</code></strong> -- alignment start position (query sequence)</li> <li> <strong><code>qEnd</code></strong> -- alignment end position (query sequence)</li> <li> <strong><code>id</code></strong> -- chain ID</li> </ul> <p> The alignment start and end positions are represented as zero-based half-open intervals. For example, the first 100 bases of a sequence would be represented with start position = 0 and end position = 100, and the next 100 bases would be represented as start position = 100 and end position = 200. When the strand value is "-", position coordinates are listed in terms of the reverse-complemented sequence.</p> <p> <strong>Alignment Data Lines</strong></p> <p> Alignment data lines contain three required attribute values:<p> <pre> <em>size</em> <em>dt</em> <em>dq</em></pre> <ul> <li> <strong><code>size</code></strong> -- the size of the ungapped alignment</li> <li> <strong><code>dt</code></strong> -- the difference between the end of this block and the beginning of the next block (reference/target sequence)</li> <li> <strong><code>dq</code></strong> -- the difference between the end of this block and the beginning of the next block (query sequence)</li> </ul> <p> NOTE: The last line of the alignment section contains only one number: the ungapped alignment size of the last block.</p> +<a name="rearrangement"></a> +<h3>"Snake" rearrangement display</h3> +<p> +Rearrangement display, sometimes called snakes display, is an alternative way to view pairwise alignemnts. +It is available for PSL and chain format tracks.</p> +<p> +Rearrangement display is a representation of the path that the sequence follows in the "other" sequence. +You start in the upper left and move to the right, following the lines if you come to the end of a block. If a block +is red, which means it is a match on the negative strand, then you reverse your course and start going from right to left. +The gray lines mean there are no bases in the other sequence between the blocks. Orange lines means there are some +bases in there that are not aligning.</p> +<p>The display type can be enabled on the track configuration page of eligible tracks. Below are two examples for clarity.</p> +<p> +<div class="text-left"> + <img src="../../images/rearrangementExample1.png" alt="Rearrangement display example 1" width='55%'> +</div><br> +This example shows a tandem duplication on CDH1 which duplicates 3 exons.</p> +<p> +<div class="text-left"> + <img src="../../images/rearrangementExample2.png" alt="Rearrangement display example 2" width='55%'> +</div><br> +This example shows a polymorphic inversion, which can be identified by the red colored sequence.</p> + <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->