e0d1ba5cd58fcaa09e80572312745343a8f7aa1c
dschmelt
Wed Oct 16 17:35:38 2019 -0700
Correcting mistaken stepSize diagram No RM

diff --git src/hg/htdocs/FAQ/FAQblat.html src/hg/htdocs/FAQ/FAQblat.html
index 787a50b..a5991f2 100755
--- src/hg/htdocs/FAQ/FAQblat.html
+++ src/hg/htdocs/FAQ/FAQblat.html
@@ -361,37 +361,37 @@

Blat ALL genomes: No matches found

My Blat ALL results display assemblies with hits, but clicking into them reports no matches

In the Blat All results page, the "Hits" column does not represent alignments, instead it reports tile hits. Tile hits are 11 base kmer matches found in the target, which do not necessarily represent successful alignments. When one clicks the 'Assembly' link a full BLAT alignment for that genome will occur and any alignment scores representing less than a 20 bp result will come back as no matches found.

-When you BLAT a sequence, the server reads the target (genome) and builds an index in memory of
-all the 11-mer locations. These 11-mers "tile" the sequence as such:
+When you BLAT All a sequence, the server reads the target (genome) and builds an index in memory of
+all the 11-mer locations, with an 11bp default stepSize. These 11-mers "tile" the sequence as such:

-ACTGACTGACT
- CTGACTGACTT
-  TGACTGACTTA
+TGGACAACATG
+           GCAAGAATCAG
+                      TCTCTACAGAA
 

After the index is built, the first step of alignment is to read the query (search) sequence, extract all the 11-mers, and look those up in the genome 11-mer index currently in memory. Matches found there represent the initial "hits" you see in the Blat All results page. The next step is to look for hits that overlap or fall within a certain distance of each other, and attempt to align the sequences between the hit locations in target and query.
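
The tiling and lookup steps described above can be illustrated with a minimal Python sketch. This is not BLAT's actual implementation: the function names `build_tile_index` and `find_tile_hits` are hypothetical, the target is just the three tiles from the diagram joined together, and the query is a made-up example. It assumes the target is indexed in non-overlapping 11-mers (stepSize 11) while every 11-mer of the query, at every offset, is looked up in that index.

```python
from collections import defaultdict

TILE_SIZE = 11  # k-mer length described in the FAQ
STEP_SIZE = 11  # default stepSize from the FAQ, so target tiles do not overlap


def build_tile_index(target, tile_size=TILE_SIZE, step_size=STEP_SIZE):
    """Map each target tile to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(0, len(target) - tile_size + 1, step_size):
        index[target[pos:pos + tile_size]].append(pos)
    return index


def find_tile_hits(query, index, tile_size=TILE_SIZE):
    """Look up every 11-mer of the query; return (query_pos, target_pos) pairs."""
    hits = []
    for q_pos in range(len(query) - tile_size + 1):
        kmer = query[q_pos:q_pos + tile_size]
        for t_pos in index.get(kmer, ()):
            hits.append((q_pos, t_pos))
    return hits


# Toy target: the three tiles from the diagram, concatenated (illustrative only).
target = "TGGACAACATG" + "GCAAGAATCAG" + "TCTCTACAGAA"
index = build_tile_index(target)

# Hypothetical query covering the middle tile plus 5 flanking bases on each side.
query = "ACATG" + "GCAAGAATCAG" + "TCTCT"
hits = find_tile_hits(query, index)
print(hits)
```

In this toy case the 21-base query matches the target perfectly, yet only the 11-mer starting at query offset 5 lines up with a tile boundary, so a single tile hit is recorded. A tile hit is therefore only a seed for the later alignment step, not an alignment in itself.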

For example, if two 11-base tile hits align perfectly, it would result in a score of 22. This is above the minimum required score of 20 (see BLAT ALL genomes), and would be reported as an alignment. However, there are penalties for gaps and mismatches, as well as potential overlap (see stepsize in BLAT specifications), all of which could bring the score below 20. In that case, BLAT All would report 2 "hits",