5667676c9a55330c839f223b9c0cfefa691390d4
max
  Thu Dec 12 02:19:09 2019 -0800
adding a section to the BLAT FAQ, refs #24595

diff --git src/hg/htdocs/FAQ/FAQblat.html src/hg/htdocs/FAQ/FAQblat.html
index f93e405..21d0067 100755
--- src/hg/htdocs/FAQ/FAQblat.html
+++ src/hg/htdocs/FAQ/FAQblat.html
@@ -76,37 +76,43 @@
 
 <a name="blat1b"></a>
 <h2>Blat can't find a sequence or not all expected matches</h2>
 <h6>I can't find a sequence with Blat although I'm sure it is in the genome. Am I doing 
 something wrong?</h6>
 <p>
 First, check if you are using the correct version of the genome. For example, two versions of the 
 human genome are currently in wide use (hg19 and hg38) and your sequence may be only in one of
 them. Many published articles do not specify the assembly version so trying both may be necessary.</p>
 <p>
 Very short sequences that go over a splice site in a cDNA sequence can't be found, as they are not 
 in the genome. qPCR primers are a typical example. For these cases, try using 
 <a href="../cgi-bin/hgPcr">In-Silico PCR</a> and selecting a gene set as the target. In general, 
 the In-Silico PCR tool is more sensitive and should be preferred for pairs of primers.</p>
 <p>
-If you have verified that you are using the correct genome and that the sequence is indeed there, 
-for example by using the <a href="../cgi-bin/hgTrackUi?db=hg38&g=oligoMatch">"Short match" track
-</a>, the problem may be a result of BLAT's query-masking. 
-This happens if your input sequence is part of a repeat and present thousands of times in the genome.
+Sequences in repeats are another problematic case, as BLAT skips the most repetitive 
+parts of the query and also limits the number of best matches it finds.
 The online version of Blat masks 11mers from the query that occur more than 1024 times in the 
-genome. This is done to improve speed, but may result in missed hits when you are searching for 
-sequences in repeats.</p>
+genome and also stops searching once it has found a certain number of optimal matches on a chromosome.
+This is done to improve speed, but can result in missed hits when you are searching for 
+sequences in repeats. In these cases, BLAT reports only a subset of the optimal
+matches in the genome. Often, you can use the self-chain track to 
+find the other matches, but only if they are long enough. You can always work around
+this limitation by adding more flanking sequence to your query, to make the query unique enough.
+You can check whether any sequence is indeed present at a particular location 
+by using the <a href="../cgi-bin/hgTrackUi?db=hg38&g=oligoMatch">"Short match" track</a>. 
+</p>
+
 <p>
 If your input sequence is not one of the very repetitive sequences, but still
 present a few dozen times on a chromosome, note that Blat results are limited
 to 16 results per chromosome strand. This means that at most 32 locations
 per chromosome are returned.
 </p>
 <p>
 To find all matches for repetitive sequences with the online version of Blat, you can add more flanking sequence to your 
 query. If this is not possible, the only alternative is to download the executables of Blat and the 
 .2bit file of a genome to your own machine and use BLAT on the command line. See 
 <a href="#blat3">Downloading BLAT source and documentation</a> for more information. 
 When using the command line version of BLAT, you can set the repMatch option to a large value
 to try to improve finding matches in repetitive regions and do not
 use one of the default 11.ooc masking files.</p>