4af1d9b292696e0f7b5d84aaaed8fdd0c0ad3e87
Merge parents aecee4d a3597c1
kent
  Fri Oct 26 16:04:37 2012 -0700
Resolving conflicts in blat version bump.
diff --cc src/blat/version.doc
index c43df78,23dd5a1..12b95e9
--- src/blat/version.doc
+++ src/blat/version.doc
@@@ -1,230 -1,232 +1,232 @@@
 -
++36: 
+      o (in 35x1) added repMatch default values for tileSizes 16 to 18 in genoFind.c
  
  35:
       o (in 34x1) Made total query output reporting use a 64-bit number to avoid
         overflow when people use more than 4 gig of query sequence.
       o (in 34x2) Fixed -out=blast to use +/- instead of -/+ for non-translated.
       o (in 34x3) Fixed -minScore, filter was not working when over half query-size.
       o (in 34x4) Made it convert u's to t's for RNA sequence stuff.
       o (in 34x5) Made gfServer calculate repMatch based on stepSize/tileSize combination the way blat does
         rather than just being good for stepSize 11.
       o (in 34x6) Fixed negative strand pcr psl output
       o (in 34x7) Made it check and error out if the same name is reused in the target database.
       o (in 34x8) Truncate in output PCR primers that dangle off target chrom ends.
       o (in 34x9) If -fastMap is used, check that query sizes do not exceed MAXSINGLEPIECESIZE.
       o (in 34x10) Added IP-logging option to gfServer.
       o (in 34x11) Fixed segfault when maxIntron <= 11 and tileSize <= 11.
       o (in 34x12) Fixed segfault when -fine -minIdentity=0 -minScore=0.
       o (in 34x13) out=sim4: Now filters on -minIdentity, and has alignments output in the right order.
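
The 34x1 note above (64-bit total query reporting) can be illustrated with a small sketch. This is not blat's code: the helper names and query sizes are hypothetical, and the 32-bit wraparound is simulated with a mask because Python integers do not overflow on their own.

```python
MASK32 = 0xFFFFFFFF            # largest value an unsigned 32-bit counter can hold
MASK64 = 0xFFFFFFFFFFFFFFFF    # largest value an unsigned 64-bit counter can hold


def add_wrapping(total, n, mask):
    """Add n to total the way a fixed-width unsigned counter would."""
    return (total + n) & mask


def total_query_bases(sizes, mask):
    """Sum query sizes using a counter of the width implied by mask."""
    total = 0
    for size in sizes:
        total = add_wrapping(total, size, mask)
    return total


# Hypothetical query sizes adding up to 5 billion bases (more than 4 gig).
sizes = [3_000_000_000, 2_000_000_000]

total32 = total_query_bases(sizes, MASK32)   # wraps past 2**32
total64 = total_query_bases(sizes, MASK64)   # holds the true total

print(total32)  # 705032704
print(total64)  # 5000000000
```

With a 32-bit counter the reported total silently wraps to a much smaller number; a 64-bit counter reports the true sum.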
       
  
  34:
       o (in 33x1) Put in check for two bit files with more than 4 gig.
       o (in 33x2) Allowing -maxGap in gfServer.  (Forgot to add it to
         options table.)
       o (in 33x3) Making -maxIntron work better.
       o (in 33x4) Fixed very rare bug that manifested doing protein/translated
         DNA alignments in gfClient (not blat) in some frames when the
         alignment extended all the way to the end of the DNA.
       o (in 33x4) Support .nib and .2bit:seq[:start-end] file types for query.
       o (in 33x5) Making polyA trimming leave two bases so as not to mess up
         stop codon in rna's that just contain coding region.
       o (in 33x6) Improving usage message on repMatch.
       o (in 33x7) Making blat not abort with error on empty target database.
       o (in 33x8) Fixing bug in translated DNA/protein case where a missing x3
         in the ffAli to chain caused a disagreement between the chainer
         and the rest of the program on gap scores, sometimes causing 
         good alignments to be thrown out.
  
  33:
       o Fixed problem with out=blast8 and out=blast9 output using
         dnax or rnax options where coordinates on the minus strand
         were given relative to end of sequence rather than beginning.
       o Made translated blast output of protein vs. DNA give DNA
         (multiplied by three) coordinates for both start and end
         of DNA side of match.
       o Making translated blast output work better (Fan's work).
       o Improving splice sites handling a little in a case that
         affects about 1 in 500 alignments.
       o Fixed a divide by zero error that happened with minScore=0
         in a few rare cases involving sequences with rows of N's.
       o Added checking of command line so that misspelled options
         result in an error message instead of being silently
         ignored.
  
  32:
       o Dealing with out of order exons in a more sophisticated way
         so as to fix another problem introduced by last change
         in v29.
       o Fixed problem in gfClient where it was only looking at top
         three chains in a region as opposed to the top 16 chains that
         standalone blat finds.  Gunnar at Decode found a case,
         AK090418 on 8p23.1 where this caused two good alignments to
         be dropped.
       o The banded extension phase used when the -fine flag is set
         now uses affine rather than fixed gap penalties.
       o PolyA trimming now accommodates a small amount of sequencing
         error.
       o Fixing another way that you could get out of order exons.
         (This came up aligning zebrafish bacEnd sequences to the
         zebrafish genome.)
  
  31:
       o Fixed a pretty bad memory leak when doing blast type output.
         Also fixed a more subtle one that occurred in some cases usually
         involving repeat-rich query sequences.
       o Fixed bug where blat of protein vs. translated DNA
         -q=prot -t=dnax would drop exons in some situations.  This
         was a side effect of the last change in version 29.  
         Slippery stuff this!
  
  30:
       o Fixed bug in webBlat when using "Blat's Guess" and no
         sequence name.
  
  29:
       o Created webBlat, a CGI script that blats and nicely displays the
         results without requiring the whole UCSC browser and database to
         work.
       o Made -out=blast9 and -out=blast8 report e values in format
         that will print 1.3e-123 instead of 0.00000.
       o Made non-psl output for gfClient show the chromosome name
         rather than the chromosome nib/2bit file name and subsequence.
       o Adding -stepSize to gfServer and blat, so that tiles can
         overlap.
       o Adding PCR handling to gfServer.  (Still need a separate
         pcr client.)
       o Put in some code to remove out of order exons that were generated
         in some rare situations.  This was a side effect of the second change
         in version 26.
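
As a rough illustration of the e-value change noted for version 29: fixed-point formatting rounds a very small e-value to zero, while exponent formatting keeps it. The format strings below are assumptions chosen for the example, not the ones blat uses.

```python
def format_evalue_fixed(e):
    """Fixed-point formatting: tiny e-values collapse to zero."""
    return "%f" % e


def format_evalue_sci(e):
    """Exponent formatting: tiny e-values stay distinguishable."""
    return "%.1e" % e


e_value = 1.3e-123  # the example value from the release note

print(format_evalue_fixed(e_value))  # 0.000000
print(format_evalue_sci(e_value))    # 1.3e-123
```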
  
  28:
       o Blat, gfClient, and gfServer now work with .2bit files.
         Like .nib files, .2bit files are a compressed binary
         random access format for DNA.  There are two advantages
         to the .2bit files over .nibs.  Most importantly a single
         .2bit file can hold multiple sequence records.  Also .2bit
         files as the name implies, use just 2 bits per DNA base
         rather than the 4 bits/base found in a .nib file.  The .2bit
         files still do allow for soft-masking by lower case, and
         for N characters.  They do this by storing lists of masked
         areas and N's separately from the sequence.  There are also
         faToTwoBit and twoBitToFa utilities to convert in and out of
         this format.
       o Making minScore work with other alignment types than psl.
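
The packing idea described for .2bit files in version 28 can be sketched as follows. This is a minimal in-memory illustration, not the on-disk file format: the function names are hypothetical, and the base-to-bits mapping (T=0, C=1, A=2, G=3, first base in the high bits of each byte) is an assumption that follows the published .2bit format description. N runs and lower-case (soft-masked) runs are kept as separate (start, size) lists, as the note describes.

```python
BASE_TO_BITS = {"T": 0, "C": 1, "A": 2, "G": 3}  # assumed mapping, see lead-in
BITS_TO_BASE = "TCAG"


def runs(flags):
    """Return (start, size) for each run of True values in flags."""
    out, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            out.append((start, i - start))
            start = None
    if start is not None:
        out.append((start, len(flags) - start))
    return out


def pack(seq):
    """Pack DNA at 2 bits per base; return (bytes, n_blocks, mask_blocks)."""
    n_blocks = runs([c in "Nn" for c in seq])
    mask_blocks = runs([c.islower() for c in seq])
    packed = bytearray((len(seq) + 3) // 4)      # four bases per byte
    for i, c in enumerate(seq.upper()):
        code = BASE_TO_BITS.get(c, 0)            # N gets a placeholder code
        packed[i // 4] |= code << (6 - 2 * (i % 4))
    return bytes(packed), n_blocks, mask_blocks


def unpack(packed, size, n_blocks, mask_blocks):
    """Rebuild the sequence, restoring N runs and lower-case masking."""
    bases = [BITS_TO_BASE[(packed[i // 4] >> (6 - 2 * (i % 4))) & 3]
             for i in range(size)]
    for start, length in n_blocks:
        bases[start:start + length] = "N" * length
    for start, length in mask_blocks:
        for i in range(start, start + length):
            bases[i] = bases[i].lower()
    return "".join(bases)


seq = "ACGTnnacgtN"
packed, n_blocks, mask_blocks = pack(seq)
restored = unpack(packed, len(seq), n_blocks, mask_blocks)
```

Here 11 bases fit in 3 bytes, and the round trip restores both the N's and the soft-masking because those are tracked outside the packed bases.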
  
  27:
       o Made it so that gfClient does not abort when gfServer runs out
         of memory on a request.  Instead it just prints a warning
         message.
  
  26:
       o fixed pslPretty to allow '--' strands in axt output format
       o fixed -long flag in pslPretty which was swapping target and
         query sequence on long inserts
       o fixed bug that was causing blat to drop an exon in around
         1% of genes.
       o put in error message when faToNib is used on fa files with
         more than one record.
  
  25:
       o Made it so that gfServer does not abort when it runs out
         of memory.
       o Fixed bug in pslPretty in case where there is a small
         gap in both query and target sides of alignment.
       o Fixed bug in out=blast or out=wublast where in some
         cases negative strand alignments did not line up
         properly.
       o Fixed crash when aligning translated sequences of less than 3 bases.
       o Added extendThroughN flag.
  
  24:
       o Tweaked gap penalty in trimFlakyEnds, which should make it so
         that with the -fine flag you get fewer small terminal exons with bad
         splice sites way away from the rest of the alignment.
       o Fixing a memory leak.
       o Adding sim4 compatible output.
       o Adding tabular blast compatible output (blast -m 8 and -m 9) as
         -out=blast8 and -out=blast9.
  
  23:
       Made -maxIntron a command line option.  Default is 750000.
  
  22:
       Fixed bug in jiggle-small exons that had reverse strand
       consensus as ca/ct rather than ac/ct.  Thanks to Par
       Engstrom for finding this.
  
  21:
       Added -fine parameter.  This will result in improved alignments
       for high quality mRNA sequences.  When -fine is used a banded
       Smith-Waterman is used in the extension phase,  and the program
       looks harder in various ways for small initial or terminal
       exons.   Currently I don't recommend using -fine on EST
       sequences, as EST ends tend to be of low quality,  and the
       short additional matches it finds are as likely to be by
       chance as a real hit.  For fully sequenced mRNAs though the -fine
       option is an advantage.  It is only available currently in
       blat, not in gfClient.   
  
  20:
       o  Recompiled without -O3 in Linux.  This recent change to a makefile
          ended up causing problems.
  19: 
       o  Bumping max size of intron that gets stitched to 750kb from 500kb.
       o  Handle a rare case that could lead to zero sized exons.
       o  Handled a rare case that could lead to backtracking.
       o  Switched blat from cgiSpoof to optionHash so that it can
          be called from a web page more easily.
       o  Adding -fastMap option for very fast high similarity DNA
          searches (blat only, not gfServer).
       o  Making it able to take subsets of a nib file as input.
       o  Incorporating Arnaldur's improvement where clumps are
          sorted by amount of query sequence covered rather than
  	by hit count.  This will help gfServer/gfClient performance
  	in particular on sequences containing tandem repeats.
       o  added maxAaSize and maxNtSize options to gfServer command line.
  18: 
       o  Replaced super-stitcher dynamic program that generates 
          a table of size N*N times the number of blocks
          with a kdTree-based one that is much more space efficient
          and a bit faster, especially in the worst cases.  This eliminates
          the 'warning trimming xxxx blocks to 7000' messages and
          fixes a crash bug manifesting specifically on the alpha
          architecture as well.  It still uses the old routine when the
  	number of blocks is less than 10.
       o  Made it so that it will find perfect 20-mers with tileSize=10.
       o  Fixed bug where it wasn't closing genome files, resulting in
          it only being able to process 250 files on the genome side
  	on most systems.
       o  Fixed bug in sort gfHitSort2.  I'm amazed blat has worked
          as well as it has as long as it has with this bug!
       o  Put a 'Database: ' line at the end of the blast output to make
          bioperl happy.
       o  Fixed bug where you had to do -oneOff=1 rather than -oneOff
          in command line of blat.  
       o  Clarified makeOoc usage message.
       o  Made it handle fasta files with more than 64k of sequence
          in a single line.  
  17 - Added -minScore -minIdentity and -prot command line options
       to gfClient.  gfClient uses same protein default minScore as 
       blat does. Made -mask=lower work with nib files in blat, 
       and -mask do the same thing in gfServer (though gfClient
       still won't fill in repMatches separately from matches).
       Made 'stdin' and 'stdout' work properly as parameters in
       place of files in the blat command line.
  16 - Fixed case where program would abort with 'Negative gap size'
       message on some legitimate protein/translated alignments in modes
       using blast, wublast, axt, or maf output.   Fixed bug where
       nibFrag on minus strand would return usage message rather than running.
       Made blat store 'chr1' rather than '/long/path/to/chr1.nib' in
       name fields when input is a nib file.   Made pslPretty handle
       record separator lines of up to 1023 characters rather than 511.
  15 - Blast output seems to work.  Also wublast, axt and maf output.
  14 - Starting to hack in blast output format. Added -trimHardA flag.
  13 - Made things so that BLAT could find perfect 21mer matches
       fairly reliably.
  12 - Put in fix from Darren Platt that cured crashes on
       protein/translated DNA alignments on Solaris.
  11 - Fixed a bug that led to translated protein alignments coming
       back in multiple pieces.  This also caused small exons
       to sometimes be dropped.  Added -noTrimA flag