Commits for angie
v282_preview to v282_preview2 (2013-04-02 to 2013-04-09) v282
- Comparison of performance of several styles of MySQL query and sorting strategy on several table sizes and several region sizes, for annoGrator.
Both time to first result and total time are measured. See the comments
at the beginning of timeMysqlHandler.c for details. Upshot: we
don't need to use MySQL's HANDLER interface, but should use merge-sorting
of coarse-bin items into finest-bin items so we can start delivering
items sorted by start as soon as possible. refs #6152.
- src/hg/oneShot/timeMysqlHandler/makefile - lines changed 6
- src/hg/oneShot/timeMysqlHandler/timeMysqlHandler.c - lines changed 684
- Moving annoRowCmp from oneShot to annoRow.[ch].
- src/hg/oneShot/timeMysqlHandler/timeMysqlHandler.c - lines changed 8
- Performance improvement for annoStreamDb, based on hg/oneShot/timeMysqlHandler.c: instead of using 'SELECT ... ORDER BY' queries to ensure that output is sorted by (chrom, chromStart),
for tables with (chrom, bin) indices, use the fact that coarser bin numbers
are lower numerically than finer bin numbers (for example, with the standard
binRange scheme, the coarsest bin is 0 and the finest bins begin at 585).
For tables like snp137 with > 50M rows of tiny items, the results from a
query that uses the (chrom,bin) index typically include a very few items
from coarser bins, followed by small items from the finest bin levels that
do come out in chromStart order. So it works out well to save aside the
initial coarse-bin annoRows in a list, and then as soon as we see a row
from the finest bin level, sort the list of coarse-bin items and merge
them into the stream of finest-bin items. This way we produce coordinates
sorted by chromStart as required by annoGrator, but can start returning
rows very soon after starting the query.
For tables that don't have a (chrom,bin) index, if they have a (chrom, end)
index, tell mysql to ignore that index so it will use the (chrom, start) index
and produce correctly sorted output.
refs #6152
- src/hg/lib/annoStreamDb.c - lines changed 162
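
The coarse-bin merge described in the annoStreamDb commit above can be sketched roughly as follows. This is a minimal illustration, not the code in annoStreamDb.c: struct row, rowCmp and mergeCoarseIntoFine are hypothetical names, the input is an in-memory array rather than a MySQL result stream, and every bin below 585 is treated as coarse, assuming the standard binRange scheme and that finest-bin rows already arrive in chromStart order.

```c
#include <assert.h>
#include <stdlib.h>

/* Per the commit message: in the standard binRange scheme the finest bins begin at 585. */
#define FINEST_BIN_START 585

/* Hypothetical, simplified stand-in for an annoRow on a single chrom. */
struct row
    {
    int bin;    /* value of the table's bin column */
    int start;  /* chromStart */
    int end;    /* chromEnd */
    };

/* Compare two rows by start, then end (assumed ordering for this sketch). */
static int rowCmp(const void *a, const void *b)
{
const struct row *ra = a, *rb = b;
if (ra->start != rb->start)
    return (ra->start > rb->start) - (ra->start < rb->start);
return (ra->end > rb->end) - (ra->end < rb->end);
}

/* Take n rows in the order a (chrom,bin)-indexed query is assumed to return
 * them (coarse bins first, then finest bins sorted by start) and write them
 * to out[] sorted by start.  Returns the number of coarse-bin rows that were
 * set aside, or -1 if memory could not be allocated. */
static int mergeCoarseIntoFine(const struct row *in, int n, struct row *out)
{
assert(n >= 0 && (n == 0 || (in != NULL && out != NULL)));
struct row *coarse = malloc(((size_t)n + 1) * sizeof(*coarse));
if (coarse == NULL)
    return -1;
int coarseCount = 0, i = 0, c = 0, o = 0;

/* Phase 1: save aside the leading coarse-bin rows. */
while (i < n && in[i].bin < FINEST_BIN_START)
    coarse[coarseCount++] = in[i++];

/* First finest-bin row seen (or input exhausted): sort the saved list once. */
if (coarseCount > 1)
    qsort(coarse, (size_t)coarseCount, sizeof(*coarse), rowCmp);

/* Phase 2: pass finest-bin rows through, emitting saved rows that sort first. */
for (;  i < n;  i++)
    {
    while (c < coarseCount && rowCmp(&coarse[c], &in[i]) <= 0)
        out[o++] = coarse[c++];
    out[o++] = in[i];
    }

/* Flush any coarse rows that start after the last finest-bin row. */
while (c < coarseCount)
    out[o++] = coarse[c++];
free(coarse);
return coarseCount;
}
```

In a streaming setting the same two phases would run row by row, so the first sorted row can be returned as soon as the first finest-bin row arrives; only the usually short list of coarse-bin rows has to be held in memory.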