Enzo Michelangeli | 1 Oct 06:07

Re: Fast (O(n log(n)) ) implementation of Kendall Tau

From: "Sturla Molden" <sturla <at> molden.no>
Sent: Thursday, October 01, 2009 6:03 AM

> Enzo Michelangeli skrev:
>> Dear all,
>>
>> A few days ago I posted to http://projects.scipy.org/scipy/ticket/999 a
>> drop-in replacement for scipy.stats.kendalltau. My code implements the
>> algorithm with complexity O(n log(n)) described by William R. Knight in a
>> paper of 1966 archived at http://www.jstor.org/pss/2282833 , whereas the
>> function currently in SciPy has complexity O(n^2), which makes it unusable
>>
> There is also:
>
> http://projects.scipy.org/scipy/ticket/893
>
> It has a contingency table version that would be fast for large data
> sets, in theory O(n).

Yes, but that requires a native module.

Enzo 
Enzo Michelangeli | 1 Oct 06:07

Re: Fast (O(n log(n)) ) implementation of Kendall Tau

From: "Sturla Molden" <sturla <at> molden.no>
Sent: Thursday, October 01, 2009 6:09 AM

> Enzo Michelangeli skrev:
>> Dear all,
>>
>> A few days ago I posted to http://projects.scipy.org/scipy/ticket/999 a
>> drop-in replacement for scipy.stats.kendalltau.
> Why do you re-implement mergesort in pure Python?
>
> ndarrays have a sort method that can use mergesort.
>
> Python lists have the same (timsort is mergesort on steroids).

Because Knight's algorithm needs to count the number of swaps (or, to be
more precise, the number of swaps that would be performed by an equivalent
bubblesort). In the code, that's the purpose of the variable exchcnt.
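
As a rough illustration of that counting step (a minimal sketch, not the code attached to the ticket), a mergesort can return the number of exchanges an equivalent bubblesort would perform, i.e. the number of inversions. The function name merge_sort_count is hypothetical; exchcnt is reused here only to mirror the variable mentioned above.

```python
def merge_sort_count(values):
    """Return (sorted_list, exchcnt).

    exchcnt is the number of adjacent swaps a bubblesort would need to
    sort `values`, which equals the number of inversions in the input.
    """
    n = len(values)
    if n <= 1:
        return list(values), 0

    mid = n // 2
    left, left_cnt = merge_sort_count(values[:mid])
    right, right_cnt = merge_sort_count(values[mid:])

    merged = []
    exchcnt = left_cnt + right_cnt
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] moves ahead of every element still waiting in `left`,
            # which is what len(left) - i adjacent swaps would achieve.
            merged.append(right[j])
            j += 1
            exchcnt += len(left) - i

    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, exchcnt


sorted_values, exchcnt = merge_sort_count([3, 1, 2])
```

A library sort returns only the sorted data, so the count has to be accumulated inside the merge step itself.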

An alternative algorithm for the Kendall Tau developed by David Christensen,
and described at http://www.springerlink.com/content/p33qu44058984082/ (with
a Delphi implementation), replaces the mergesort step with one based on
balanced binary trees (AVL in his case, but I guess RBT would also work).
Unfortunately, neither the standard Python library nor NumPy/SciPy appears to
implement such useful data structures, and what is available either doesn't
allow O(log(n)) random access (heapq) or lacks an O(log(n)) insert (sorted
lists accessed through bisect).
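
The trade-off between the two standard-library options can be seen in a small sketch (illustrative only; the variable names are made up for this example):

```python
import bisect
import heapq

data = [5, 1, 4, 2, 3]

# Sorted list maintained with bisect: the position is found in O(log n),
# but list insertion shifts elements, so each insert costs O(n).
sorted_list = []
for x in data:
    bisect.insort(sorted_list, x)

# Rank queries on the sorted list are cheap: O(log n).
rank_of_4 = bisect.bisect_left(sorted_list, 4)

# heapq: each push is O(log n), but only heap[0] (the minimum) has a
# known rank; the rest of the list is not in sorted order in general.
heap = []
for x in data:
    heapq.heappush(heap, x)

smallest = heap[0]
```

A balanced tree would give both the logarithmic insert and the logarithmic rank query, which is what the tree-based algorithm relies on.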

Enzo
Charles R Harris | 1 Oct 08:22

Re: Fast (O(n log(n)) ) implementation of Kendall Tau



On Wed, Sep 30, 2009 at 10:07 PM, Enzo Michelangeli <enzomich <at> gmail.com> wrote:
From: "Sturla Molden" <sturla <at> molden.no>
Sent: Thursday, October 01, 2009 6:09 AM

> Enzo Michelangeli skrev:
>> Dear all,
>>
>> A few days ago I posted to http://projects.scipy.org/scipy/ticket/999 a
>> drop-in replacement for scipy.stats.kendalltau.
> Why do you re-implement mergesort in pure Python?
>
> ndarrays have a sort method that can use mergesort.
>
> Python lists have the same (timsort is mergesort on steroids).

Because Knight's algorithm needs to count the number of swaps (or, to be
more precise, the number of swaps that would be performed by an equivalent
bubblesort). In the code, that's the purpose of the variable exchcnt.

An alternative algorithm for the Kendall Tau developed by David Christensen,
and described at http://www.springerlink.com/content/p33qu44058984082/ (with
a Delphi implementation), replaces the mergesort step with one based on
balanced binary trees (AVL in his case, but I guess RBT would also work).
Unfortunately, neither the standard Python library nor NumPy/SciPy appears to
implement such useful data structures, and what is available either doesn't

Yep, we could use some more of those useful data structures. A "computer science" library would be useful. The scipy.spatial library is a step in that direction, but it would be nice if there were more of them.

Chuck

_______________________________________________
Scipy-dev mailing list
Scipy-dev <at> scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev
Scott Sinclair | 1 Oct 10:28

Statsmodels documentation buglet

Hi,

I'm having a look at the statsmodels scikit and have noticed a minor
bug in the docs at:

http://statsmodels.sourceforge.net/generated/scikits.statsmodels.regression.OLS.html

The definition Y = [1,3,4,5,2,3,4], in the example contains a trailing
comma that causes the example to fail when used with IPython's cpaste
mode.
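
For illustration (a small sketch, not taken from the statsmodels docs): outside cpaste the same line is still legal Python, but the trailing comma binds the name to a one-element tuple that wraps the list. The name Y_with_comma is made up for this example.

```python
# With the trailing comma, the right-hand side is a 1-tuple containing the list.
Y_with_comma = [1, 3, 4, 5, 2, 3, 4],

# Without it, the name is bound to the list itself.
Y = [1, 3, 4, 5, 2, 3, 4]

is_tuple = isinstance(Y_with_comma, tuple)
wrapped_list = Y_with_comma[0]
```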

Cheers,
Scott
Skipper Seabold | 1 Oct 17:01

Re: Statsmodels documentation buglet

On Thu, Oct 1, 2009 at 4:28 AM, Scott Sinclair
<scott.sinclair.za <at> gmail.com> wrote:
> Hi,
>
> I'm having a look at the statsmodels scikit and have noticed a minor
> bug in the docs at:
>
> http://statsmodels.sourceforge.net/generated/scikits.statsmodels.regression.OLS.html
>
> The definition Y = [1,3,4,5,2,3,4], in the example contains a trailing
> comma that causes the example to fail when used with IPython's cpaste
> mode.
>

Thanks, should be fixed shortly.

Skipper
josef.pktd | 1 Oct 17:59

Re: Statsmodels documentation buglet

On Thu, Oct 1, 2009 at 11:01 AM, Skipper Seabold <jsseabold <at> gmail.com> wrote:
> On Thu, Oct 1, 2009 at 4:28 AM, Scott Sinclair
> <scott.sinclair.za <at> gmail.com> wrote:
>> Hi,
>>
>> I'm having a look at the statsmodels scikit and have noticed a minor
>> bug in the docs at:
>>
>> http://statsmodels.sourceforge.net/generated/scikits.statsmodels.regression.OLS.html
>>
>> The definition Y = [1,3,4,5,2,3,4], in the example contains a trailing
>> comma that causes the example to fail when used with IPython's cpaste
>> mode.
>>
>
> Thanks, should be fixed shortly.

I did it already in trunk, but got stuck with some examples. I moved
the tutorial files from the sandbox to the examples folder and will
push to launchpad soon. WLS with one of the regressors squared as
weights still looks "weird" (example_wls.py).

We haven't run the doctests in a while. However, the scripts in the
examples folder are always checked to confirm that they run (though not
that they produce correct results).

Scott, any comments about the usage of statsmodels are very welcome.
(And I'm still waiting for missing bug reports.)

Josef


[ANN] SciPy India conference in Dec. 2009

Greetings,

The first "Scientific Computing with Python" conference in India
(http://scipy.in) will be held from December 12th to 17th, 2009 at the
Technopark in Trivandrum, Kerala, India (http://www.technopark.org/).

The theme of the conference will be "Scientific Python in Action" with
respect to application and teaching.  We are pleased to have Travis
Oliphant, the creator and lead developer of numpy
(http://numpy.scipy.org) as the keynote speaker.

Here is a rough schedule of the conference:

     Sat.    Dec. 12  (conference)
     Sun.    Dec. 13  (conference)
     Mon.    Dec. 14  (tutorials)
     Tues.   Dec. 15  (tutorials)
     Wed.    Dec. 16  (sprint)
     Thu.    Dec. 17  (sprint)

The tutorial sessions will have two tracks, one specifically for
teachers and one for the general public.

There are no registration fees.

Please register at:

         http://scipy.in

The call for papers will be announced soon.

This conference is organized by the FOSSEE project (http://fossee.in)
funded by the Ministry of Human Resources and Development's National
Mission on Education (NME) through Information and Communication
Technology (ICT) jointly with SPACE-Kerala
(http://www.space-kerala.org).

Regards,
Prabhu Ramachandran and Jarrod Millman
Scott Sinclair | 2 Oct 08:04

Re: Statsmodels documentation buglet

>2009/10/1  <josef.pktd <at> gmail.com>:
> Scott, any comments about the usage of statsmodels are very welcome.
> (And I'm still waiting for missing bug reports.)

Just poking around and finding it pretty useable so far. Happy to
report any bugs if I find them.

Cheers,
Scott
Arkapravo Bhaumik | 2 Oct 14:24

Volunteer for Scipy Project


Dear Sir/Ma'am

I was in e-mail contact with one of your colleagues, Travis. I am very interested in contributing to the SciPy project as a volunteer. I believe that I can contribute in:

(1) Documentation
(2) Suggesting and developing newer functionality
(3) Studying similar software such as Matlab, Mathematica, and Maple to look for inspiration for possible improvements in SciPy

Some of my dabblings in Python are discussed on my blog, http://programming-unlimited.blogspot.com/ and I am eager to be a part of SciPy: a revolution in the making.

I look forward to your reply.

Kind regards

Arkapravo

