1 Aug 2010 02:48
Off by one bug in Scipy.stats.hypergeom
Jacob Biesinger <jake.biesinger <at> gmail.com>
2010-08-01 00:48:28 GMT
2010-08-01 00:48:28 GMT
Hi!
Perhaps I'm using the module incorrectly, but it looks like the x parameter in scipy.stats.hypergeom is off by one. Specifically, I think it's one-too-high.
From the wikipedia article http://en.wikipedia.org/wiki/Hypergeometric_distribution#Application_and_example (I know they could be wrong-- just hear me out on this),
scipy.stats.hypergeom?
Hypergeometric distribution
Models drawing objects from a bin.
M is total number of objects, n is total number of Type I objects.
RV counts number of Type I objects in N drawn without replacement from
population.
So translating wikipedia's example...
Pr(x=4; M=50, n=5, N=10) = (choose(5,4) * choose(50-5, 10-4)) / choose(50,10) = .003964583
Pr(x=5; M=50, n=5, N=10) = (choose(5,5) * choose(50-5, 10-5)) / choose(50,10) = .0001189375
Which you can check with the python code:
from scipy import comb as chse # "combination" => choose
float((chse(5,4, exact=1) * chse(50-5,10-4, exact=1))) / chse(50,10,exact=1) # example one
0.0039645830580150654155 # okay!
float((chse(5,5, exact=1) * chse(50-5,10-5, exact=1))) / chse(50,10,exact=1) # example two
0.00011893749174045196247 # okay!
Try example one with scipy.stats.hypergeom:
# scipy.stats.hypergeom.sf(x, M, n, N)
scipy.stats.hypergeom.sf(4,50,5,10)
0.00011893749169422652 # correct value for x=5, not x=4
scipy.stats.hypergeom.sf(5,50,5,10)
-4.6185277824406512e-14 # wrong
It seems that changing the loc value from =0 (default) to =1 fixes the issue...
scipy.stats.hypergeom.sf(4,50,5,10, loc=1)
0.0040835205497095073 # close enough
scipy.stats.hypergeom.sf(5,50,5,10, loc=1)
0.00011893749169422652 # okay!
Am I using the package wrong?
--
Jake Biesinger
Graduate Student
Xie Lab, UC Irvine
(949) 231-7587
Jake Biesinger
Graduate Student
Xie Lab, UC Irvine
(949) 231-7587
_______________________________________________ SciPy-User mailing list SciPy-User <at> scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
RSS Feed