b95ff3835509b242bd9007ba55c4f60c1022da47
markd
  Thu Dec 30 14:58:16 2010 -0800
moved programs to hg/utils, fixed build of distributed utilities
diff --git src/hg/overlapSelect/todo.txt src/hg/overlapSelect/todo.txt
deleted file mode 100644
index 2eec8fb..0000000
--- src/hg/overlapSelect/todo.txt
+++ /dev/null
@@ -1,45 +0,0 @@
-
-- add correlation coefficient as a criterion:
-    Adam Siepel <acs@soe.ucsc.edu> 2005/03/02
-    On another front, it occurred to me that the correlation coefficient 
-    sometimes used in gene prediction stats could be another useful thing 
-    to report.  For each inFile record, you could give a correlation 
-    coefficient based on all overlapping selectFile records.  This would 
-    give you one number saying something about both directions of coverage 
-    and about the degree of "consistency" we were talking about the other 
-    day.  For example, you could project the intronEsts into a bed of 
-    nonoverlapping features using featureBits, then run overlapSelect -cc 
-    (or similar), to get a cc number for each selectFile, which could then 
-    go in a database like the one I'm building.  I think, when computing 
-    the cc, you might want to limit yourself to the range of the inFile 
-    record.  That would make sense for my application at least, where my 
-    predictions are fragments.  In other cases, you might want to compute 
-    the cc for the smallest interval including both the inFile record and 
-    all overlapping selectFile records.
-
-    It looks to me like the number you'd compute for a given interval would 
-    be
-            cc =  (cN - ab) / sqrt(ab(N-a)(N-b))
-
-    where a is the number of "bits" (e.g., bases in exons) in the inFile 
-    record, b is the total number of bits in all overlapping selectFile 
-    records (within the interval), c is the number of bits in both the 
-    inFile and the selectFile records, and N is the length of the interval.
-
-    For example, if you had a 1000 base interval with 100 bases within 
-    predicted exons, 150 bases of supporting EST evidence, and an overlap 
-    of 90 bases, then N = 1000, a = 100, b = 150, c = 90, and cc = 0.70.
-
-    This number is defined as long as 0 < a,b < N.  It will always be true 
-    that a > 0 (otherwise you don't have an inFile record).  If a > 0 and b 
-    = 0, then you'd have 0/0 but you could just report 0.  If a = N and b 
-    <= N (also possible), then c = b and you'd also have 0/0.  You could 
-    report b/N in this case.  The symmetric thing could be done if b = N 
-    and a <= N.
-
-    I suppose an alternative would be to report a, b, c, and N for each 
-    inFile record.  Then the cc or some alternative could easily be 
-    computed with an awk script.
-
-- add featureBits type of feature specifications (e.g. :intron)
-
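
Note on the removed todo item: the correlation coefficient proposed above can be sketched in a few lines. This is a minimal illustration only, not code from overlapSelect. The function name overlap_cc and its argument order are hypothetical. The 0/0 fallbacks follow the suggestions in the note, and returning a/N when b = N is an assumed reading of "the symmetric thing".

```python
import math


def overlap_cc(a, b, c, n):
    """Correlation coefficient for one interval (hypothetical helper).

    a: bits in the inFile record
    b: bits in all overlapping selectFile records within the interval
    c: bits present in both
    n: length of the interval
    """
    if not (0 < a <= n and 0 <= b <= n and 0 <= c <= min(a, b)):
        raise ValueError("expected 0 < a <= n, 0 <= b <= n, 0 <= c <= min(a, b)")
    if b == 0:
        # 0/0 case: no selectFile coverage, report 0 as the note suggests
        return 0.0
    if a == n:
        # 0/0 case: inFile record fills the interval, so c = b; report b/N
        return b / n
    if b == n:
        # assumed symmetric case: selectFile fills the interval, report a/N
        return a / n
    return (c * n - a * b) / math.sqrt(a * b * (n - a) * (n - b))


# Worked example from the note: N = 1000, a = 100, b = 150, c = 90
print(f"{overlap_cc(100, 150, 90, 1000):.2f}")  # prints 0.70
```

Reporting a, b, c, and N per inFile record, as the note's last paragraph suggests, would let a function like this run as a post-processing step instead of inside the tool.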