Commits for angie
v251_preview2 to v251_base (2011-05-03 to 2011-05-10) v251
- Feature #3711 (center-weighted alpha haplo sorting for vcfTabix): Performance improvement for hacTree: if caller passes in comparison
function, then pre-sort the items and pre-cluster adjacent identical
items before generating pairs. When many of the inputs are identical,
this greatly reduces the number of pairs that the main clustering
algorithm starts with. For example, a 1000 Genomes file has
genotypes for 1360 people (2720 haplotypes), and starting with
all pairs of 2720 haps was impossibly slow for hgTracks. However,
in regions of a few tens of thousands of bases and a few tens of
variants, in practice there are usually fewer than 100 distinct
haplotypes, which makes it possible to cluster in tenths of seconds
instead of timing out. The pre-clustering also makes nice balanced
trees; the main clustering step still seems prone to chaining to me,
so there's probably still more room for improvement there.
- src/hg/hgTracks/vcfTrack.c - lines changed 168, context: html, text, full: html, text
- src/lib/tests/expected/hacTreeTest.out - lines changed 63, context: html, text, full: html, text
- src/lib/tests/hacTreeTest.c - lines changed 11, context: html, text, full: html, text
- src/lib/tests/input/hacTreeTest.txt - lines changed 20, context: html, text, full: html, text
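
The pre-sort and pre-cluster step described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual hacTree or vcfTrack.c code: the names hapGroup, cmpHap, preClusterIdentical and pairCount are hypothetical, haplotypes are assumed to be plain C strings, and strcmp stands in for the caller-supplied comparison function. For scale, all pairs of 2720 haplotypes is 3,697,840 pairs, while all pairs of 100 distinct haplotypes is 4,950.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one distinct haplotype plus the number of
 * identical inputs that were merged into it. */
struct hapGroup
    {
    const char *hap;   /* representative item (pointer only, not copied) */
    int count;         /* how many inputs were identical to hap */
    };

static int cmpHap(const void *a, const void *b)
/* qsort comparator over an array of (const char *); stands in for the
 * caller-supplied comparison function. */
{
const char *const *pa = a;
const char *const *pb = b;
return strcmp(*pa, *pb);
}

int preClusterIdentical(const char **haps, int n, struct hapGroup *groups)
/* Sort haps in place so identical items become adjacent, then merge each
 * run of identical items into one group.  groups must have room for n
 * entries.  Returns the number of distinct groups. */
{
assert(haps != NULL && groups != NULL && n >= 0);
qsort(haps, n, sizeof(haps[0]), cmpHap);
int groupCount = 0;
for (int i = 0;  i < n;  i++)
    {
    if (groupCount > 0 && strcmp(groups[groupCount-1].hap, haps[i]) == 0)
        groups[groupCount-1].count++;    /* same as previous: extend the run */
    else
        {
        groups[groupCount].hap = haps[i];    /* new distinct item: start a run */
        groups[groupCount].count = 1;
        groupCount++;
        }
    }
return groupCount;
}

long pairCount(long n)
/* Number of unordered pairs among n items: n choose 2. */
{
return n * (n - 1) / 2;
}
```

Only the distinct groups would then be handed to the pairwise clustering step, so its starting pair count depends on the number of distinct haplotypes rather than the number of inputs.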
- Fix for warning message produced only when -O is used: compiler thinks a variable might be used uninitialized, although it is initialized in
all if/else cases. Thanks Tim for catching that!
- Code Review #3822: Added long explanatory comment for main clustering step based on Jim's suggestions. In the process, I realized that I'm
using a pool not a true heap, so I changed variable names and comments
accordingly.
- Feature #2823 (VCF track handler): removing some code that won't be used.
- src/hg/hgTracks/vcfTrack.c - lines changed 28, context: html, text, full: html, text
- Feature #3711 (vcfTabix haplotype clustering): added pgSnp-like mouseover text, but with genotype counts instead of allele counts.
- src/hg/hgTracks/vcfTrack.c - lines changed 42, context: html, text, full: html, text
- pgSnp's extra mapbox over the bases covered only the bottom one; tweaked it to cover both top and bottom.
- src/hg/hgTracks/simpleTracks.c - lines changed 1, context: html, text, full: html, text