a2c7461148241e1514b1522aa60a200faaed24ed
lrnassar
  Wed May 1 14:22:54 2019 -0700
Adding entry to BLAT FAQ explaining BLAT All hits #23411

diff --git src/hg/htdocs/FAQ/FAQblat.html src/hg/htdocs/FAQ/FAQblat.html
index a93b8f5..b94c8e1 100755
--- src/hg/htdocs/FAQ/FAQblat.html
+++ src/hg/htdocs/FAQ/FAQblat.html
@@ -9,30 +9,31 @@
 
 <h2>Topics</h2>
 
 <ul>
 <li><a href="#blat1">BLAT vs. BLAST</a></li>
 <li><a href="#blat1b">Blat cannot find a sequence at all or not all expected matches</a></li>
 <li><a href="#blat2">BLAT use restrictions</a></li>
 <li><a href="#blat3">Downloading Blat source and documentation</a></li>
 <li><a href="#blat5">Replicating web-based Blat parameters in command-line version</a></li>
 <li><a href="#blat6">Using the <em>-ooc</em> flag</strong></a></li>
 <li><a href="#blat4">Replicating web-based Blat percent identity and score calculations</a></li>
 <li><a href="#blat7">Replicating web-based Blat &quot;I'm feeling lucky&quot; search 
 results</a></li>
 <li><a href="#blat8">Using Blat for short sequences with maximum sensitivity</a></li>
 <li><a href="#blat9">Blat ALL genomes</a></li>
+<li><a href="#blat10">Blat ALL genomes: No matches found</a></li>
 
 </ul>
 <hr>
 <p>
 <a href="index.html">Return to FAQ Table of Contents</a></p>
 
 <a name="blat1"></a>
 <h2>BLAT vs. BLAST</h2>
 <h6>What are the differences between BLAT and BLAST?</h6>
 <p>
 BLAT is an alignment tool like BLAST, but it is structured differently. On DNA, BLAT works by 
 keeping an index of an entire genome in memory. Thus, the target database of BLAT is not a set of 
 GenBank sequences, but instead an index derived from the assembly of the entire genome. By default,
 the index consists of all non-overlapping 11-mers except for those heavily involved in repeats, and 
 it uses less than a gigabyte of RAM. This smaller size means that BLAT is far more easily 
@@ -317,19 +318,54 @@
 belong to.
 </p>
 <p>
 Selecting the "Search ALL" 
 checkbox above the Genome drop-down list allows you to search the genomes
 of the default assemblies for all of our organisms. It also searches any attached hubs' 
 Blat servers, meaning you can search your user-generated assembly hubs. The results page displays an ordered list 
 of all our organisms and their homology with your query sequence. The results are ordered 
 so that the organism with the best alignment score is at the top, indicating which region(s) 
 of that organism has the greatest homology with your query sequence.
 The entire alignment, including mismatches and gaps, must <a href="../FAQ/FAQblat.html#blat4">score</a>  
 20 or higher in order to appear in the Blat output. By clicking into a link in the <em>Assembly list</em> 
 you will be taken to a new page displaying various locations and scores of sequence homology in the assembly of interest.
 </p>
 
+<a name="blat10"></a>
+<h2>Blat ALL genomes: No matches found</h2>
+<h6>My Blat ALL results display assemblies with hits, but clicking into them reports 
+no matches</h6>
 
+<p>
+In the Blat All results page, the "Hits" column does not count alignments; it counts tile hits. 
+Tile hits are 11-base k-mer matches found in the target, and they do not necessarily 
+represent successful alignments. Clicking the 'Assembly' link runs a full BLAT alignment against 
+that genome, and any alignment that scores below 20 is not reported, so the page may 
+come back with no matches found.</p>
+
+<p>
+When you BLAT a sequence, the server reads the target (genome) and builds an in-memory index of 
+all the 11-mer locations. These 11-mers &quot;tile&quot; the sequence like this:</p>
+
+<pre>
+ACTGACTGACT
+ CTGACTGACTT
+  TGACTGACTTA
+</pre>
 
+<p>
+After the index is built, the first step of alignment is to read the query (search) sequence, 
+extract all the 11-mers, and look those up in the genome 11-mer index currently in memory. 
+Matches found there represent the initial &quot;hits&quot; you see in the Blat All results page. 
+The next step is to look for hits that overlap or fall within a certain distance of each other, 
+and attempt to align the sequences between the hit locations in target and query.</p>
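+
+<p>
+The tiling and lookup steps above can be sketched in a few lines of Python. This is an 
+illustration only, not the BLAT server code: the function names (<code>tiles</code>, 
+<code>build_index</code>, <code>tile_hits</code>) are hypothetical, and the step of 1 between 
+target tiles simply mirrors the tiling picture above (the real step size may differ).</p>

```python
# Illustrative sketch only; this is NOT the BLAT/gfServer implementation.
# All function names here are hypothetical.
from collections import defaultdict

TILE_SIZE = 11  # tile (k-mer) length described in the FAQ


def tiles(seq, step=1, k=TILE_SIZE):
    """Return (offset, k-mer) pairs taken every `step` bases along seq."""
    return [(i, seq[i:i + k]) for i in range(0, len(seq) - k + 1, step)]


def build_index(target, step=1, k=TILE_SIZE):
    """Map each k-mer in the target to the offsets where it starts."""
    index = defaultdict(list)
    for offset, kmer in tiles(target, step, k):
        index[kmer].append(offset)
    return index


def tile_hits(query, index, k=TILE_SIZE):
    """Return (query_offset, target_offset) for every exact k-mer match."""
    hits = []
    for q_off, kmer in tiles(query, 1, k):
        # .get() avoids inserting empty entries into the defaultdict
        for t_off in index.get(kmer, []):
            hits.append((q_off, t_off))
    return hits


if __name__ == "__main__":
    target = "ACTGACTGACTTA"           # toy "genome"
    query = "GGCTGACTGACTTGG"          # toy query sharing one 11-mer
    print([kmer for _, kmer in tiles(target)])
    print(tile_hits(query, build_index(target)))
```

+<p>
+Each pair returned by <code>tile_hits</code> is only a seed. It says that 11 bases match 
+exactly at one position, not that a full alignment passing the score threshold exists.</p>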
+
+<p>
+For example, if two 11-base tile hits align perfectly, they would produce a score of 22. This is 
+above the minimum required score of 20 (see <a href="#blat9">Blat ALL genomes</a>), and would be 
+reported as an alignment. However, there are penalties for gaps and mismatches, as well as potential 
+overlap (see stepsize in <a href="../goldenPath/help/blatSpec.html">BLAT specifications</a>), all 
+of which could bring the score below 20. In that case, BLAT All would report 2 &quot;hits&quot;, 
+but clicking into the assembly would report no matches. This most often occurs when there are 
+only a few (1-3) hits reported by BLAT All.</p>
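+
+<p>
+The arithmetic can be illustrated with a deliberately simplified score. The sketch below 
+assumes +1 per matching base, -1 per mismatch and -1 per gap, and counts overlapping tile 
+bases only once. These weights and the helper names are assumptions made for illustration; 
+the actual calculation is described in the <a href="#blat4">score</a> section.</p>

```python
# Simplified, hypothetical scoring used only to illustrate the FAQ's example.
# The weights below are assumptions, not the exact BLAT score formula.
MIN_SCORE = 20   # minimum score quoted in the FAQ
TILE_SIZE = 11   # tile (k-mer) length described in the FAQ


def covered_bases(tile_starts, k=TILE_SIZE):
    """Count distinct bases covered by tiles; overlapping bases count once."""
    covered = set()
    for start in tile_starts:
        covered.update(range(start, start + k))
    return len(covered)


def simple_score(matches, mismatches=0, gaps=0):
    """Assumed scoring: +1 per match, -1 per mismatch, -1 per gap."""
    return matches - mismatches - gaps


def verdict(score):
    """Apply the minimum-score cutoff to a score."""
    return "reported" if score >= MIN_SCORE else "no matches"


if __name__ == "__main__":
    cases = {
        "two adjacent tiles": simple_score(covered_bases([0, 11])),
        "two tiles overlapping by 6 bases": simple_score(covered_bases([0, 5])),
        "two tiles, 1 mismatch and 2 gaps": simple_score(
            covered_bases([0, 11]), mismatches=1, gaps=2),
    }
    for label, score in cases.items():
        print(label, score, verdict(score))
```

+<p>
+Under these assumptions, all three cases would show 2 &quot;hits&quot; in BLAT All, but only 
+the first would survive the cutoff of 20.</p>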
 
 <!--#include virtual="$ROOT/inc/gbPageEnd.html" -->