Fabian Pedregosa | 1 Feb 10:28
Picon
Picon
Favicon
Gravatar

Re: Ball Tree code updated (ticket 1048)

Jake VanderPlas wrote:
> Hello,
> I have had comments from a few people over the last two months on the
> Ball Tree code that I submitted (ticket 1048).  I cleaned up the code
> a bit and posted the changes on the tracker.  Any other comments would
> be appreciated!
>    -Jake
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev <at> scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
> 

Hi! Your code works great, with impressive speed improvements in high 
dimensions. I could just find a minor thing: leaf_size in the docstring 
should read leafsize.

If this turns out to be too specific for scipy, would you mind that it 
get's included in scikit.learn [1] (BSD-license)?

Thanks,

fabian

[1] http://scikit-learn.sourceforge.net
David Cournapeau | 4 Feb 07:27
Picon

Comments on API for Matlab's eigs equivalent (computing a few eigenvalues only)

Hi,

I have played a bit with adding support for computing a few eigenvalues 
of full symmetric matrices. I would like some comments on the current API:

import numpy as np
from scipy.linalg import eigs
x = np.random.randn(10, 10)
x = np.dot(x.T, x)
# Retrieve the 3 biggest eigenvalues
eigs(x, 3)[0]
# Retrieve the 3 smallest eigenvalues
eigs(x, -3)[0]
# Retrieve the 3 biggest eigenvalues
eigs(x, [0, 3], mode="index")[0]
# Retrieve the 2nd and 3rd biggest
eigs(x, [1, 3], mode="index")[0]
# Retrieve all the eigenvalues in the range [1.5, 3.5[
eigs(x, [1.5, 3.5], mode="range")[0]

One thing which does not feel right is that that the range in the 
"index" mode is exactly inverted compared to the output (i.e. if you ask 
for the range [0, 3], you get the last three items from what you would 
get if you asked for the full range [0, 10]), but this is because I kept 
compatibility with Octave (always showing from biggest to smallest). It 
would be easy to always do the contrary (from smallest to biggest) - 
besides consistency, it has the advantages that it is the actual order 
you get back from the underlying LAPACK function.

Also, I needed to modify the f2py file for the related LAPACK functions, 
(Continue reading)

Charles R Harris | 4 Feb 07:37
Picon

Re: Comments on API for Matlab's eigs equivalent (computing a few eigenvalues only)



On Wed, Feb 3, 2010 at 11:27 PM, David Cournapeau <david <at> silveregg.co.jp> wrote:
Hi,

I have played a bit with adding support for computing a few eigenvalues
of full symmetric matrices. I would like some comments on the current API:

import numpy as np
from scipy.linalg import eigs
x = np.random.randn(10, 10)
x = np.dot(x.T, x)
# Retrieve the 3 biggest eigenvalues
eigs(x, 3)[0]
# Retrieve the 3 smallest eigenvalues
eigs(x, -3)[0]
# Retrieve the 3 biggest eigenvalues
eigs(x, [0, 3], mode="index")[0]
# Retrieve the 2nd and 3rd biggest
eigs(x, [1, 3], mode="index")[0]
# Retrieve all the eigenvalues in the range [1.5, 3.5[
eigs(x, [1.5, 3.5], mode="range")[0]

One thing which does not feel right is that that the range in the
"index" mode is exactly inverted compared to the output (i.e. if you ask

Why not use separate keywords for index and range. Changing the meaning of an argument using another keyword is just weird. I've seen it used elsewhere, but still... At some point you might even want the largest three in a range. So something like

eigs(x, index=3)
eigs(x, range=[1,4])

Why the index on the end? Is the return a list?
 
for the range [0, 3], you get the last three items from what you would
get if you asked for the full range [0, 10]), but this is because I kept
compatibility with Octave (always showing from biggest to smallest). It

Why follow Octave, isn't Octave like Matlab? Matlab functions are a mess.

Chuck

_______________________________________________
SciPy-Dev mailing list
SciPy-Dev <at> scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev
David Cournapeau | 4 Feb 07:49
Picon

Re: Comments on API for Matlab's eigs equivalent (computing a few eigenvalues only)

Charles R Harris wrote:
> 
> 
> On Wed, Feb 3, 2010 at 11:27 PM, David Cournapeau <david <at> silveregg.co.jp 
> <mailto:david <at> silveregg.co.jp>> wrote:
> 
>     Hi,
> 
>     I have played a bit with adding support for computing a few eigenvalues
>     of full symmetric matrices. I would like some comments on the
>     current API:
> 
>     import numpy as np
>     from scipy.linalg import eigs
>     x = np.random.randn(10, 10)
>     x = np.dot(x.T, x)
>     # Retrieve the 3 biggest eigenvalues
>     eigs(x, 3)[0]
>     # Retrieve the 3 smallest eigenvalues
>     eigs(x, -3)[0]
>     # Retrieve the 3 biggest eigenvalues
>     eigs(x, [0, 3], mode="index")[0]
>     # Retrieve the 2nd and 3rd biggest
>     eigs(x, [1, 3], mode="index")[0]
>     # Retrieve all the eigenvalues in the range [1.5, 3.5[
>     eigs(x, [1.5, 3.5], mode="range")[0]
> 
>     One thing which does not feel right is that that the range in the
>     "index" mode is exactly inverted compared to the output (i.e. if you ask
> 
> 
> Why not use separate keywords for index and range.

Then I am not sure how to handle the "give me the k biggest/smallest" 
case, which is the most common I think (that's the only one I care 
personally :) ).

Maybe this just warrants several functions ? But then there is the issue 
of naming (supporting non-symmetric matrices would be nice, but requires 
a totally different implementation, as LAPACK does not support it AFAIK 
- the easiest way would be to use ARPACK ATM).

> Changing the meaning 
> of an argument using another keyword is just weird. 

Agreed - this is just the best I could came up with one function to 
support all cases while keeping the common case simple. But I feel a bit 
like I outsmarted myself here.

> Why the index on the end? Is the return a list?

Yes, it also returns eigenvectors

> 
>     for the range [0, 3], you get the last three items from what you would
>     get if you asked for the full range [0, 10]), but this is because I kept
>     compatibility with Octave (always showing from biggest to smallest). It
> 
> 
> Why follow Octave, isn't Octave like Matlab? Matlab functions are a mess.

Actually, there is another reason I forgot to mention: that's how eigen 
in scipy.sparse does as well (in decreasing order). I think that the 
ARPACK wrappers are not that well thought out, though.

cheers,

David
Charles R Harris | 4 Feb 08:06
Picon

Re: Comments on API for Matlab's eigs equivalent (computing a few eigenvalues only)



On Wed, Feb 3, 2010 at 11:49 PM, David Cournapeau <david <at> silveregg.co.jp> wrote:
Charles R Harris wrote:
>
>
> On Wed, Feb 3, 2010 at 11:27 PM, David Cournapeau <david <at> silveregg.co.jp
> <mailto:david <at> silveregg.co.jp>> wrote:
>
>     Hi,
>
>     I have played a bit with adding support for computing a few eigenvalues
>     of full symmetric matrices. I would like some comments on the
>     current API:
>
>     import numpy as np
>     from scipy.linalg import eigs
>     x = np.random.randn(10, 10)
>     x = np.dot(x.T, x)
>     # Retrieve the 3 biggest eigenvalues
>     eigs(x, 3)[0]
>     # Retrieve the 3 smallest eigenvalues
>     eigs(x, -3)[0]
>     # Retrieve the 3 biggest eigenvalues
>     eigs(x, [0, 3], mode="index")[0]
>     # Retrieve the 2nd and 3rd biggest
>     eigs(x, [1, 3], mode="index")[0]
>     # Retrieve all the eigenvalues in the range [1.5, 3.5[
>     eigs(x, [1.5, 3.5], mode="range")[0]
>
>     One thing which does not feel right is that that the range in the
>     "index" mode is exactly inverted compared to the output (i.e. if you ask
>
>
> Why not use separate keywords for index and range.

Then I am not sure how to handle the "give me the k biggest/smallest"
case, which is the most common I think (that's the only one I care
personally :) ).


That would be eigs(x, index=3) or eigs(x, index=-3) respectively, the default value of both range and index would be None, which could possibly return all eigenvalues. I'm not sure that index is the best word but I can't think of a better at the moment.

<snip>

Chuck

_______________________________________________
SciPy-Dev mailing list
SciPy-Dev <at> scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev
Picon
Picon

Re: Comments on API for Matlab's eigs equivalent (computing a few eigenvalues only)

David Cournapeau wrote:
> Charles R Harris wrote:
>>
>> On Wed, Feb 3, 2010 at 11:27 PM, David Cournapeau <david <at> silveregg.co.jp 
>> <mailto:david <at> silveregg.co.jp>> wrote:
>>
>>     Hi,
>>
>>     I have played a bit with adding support for computing a few eigenvalues
>>     of full symmetric matrices. I would like some comments on the
>>     current API:
>>
>>     import numpy as np
>>     from scipy.linalg import eigs
>>     x = np.random.randn(10, 10)
>>     x = np.dot(x.T, x)
>>     # Retrieve the 3 biggest eigenvalues
>>     eigs(x, 3)[0]
>>     # Retrieve the 3 smallest eigenvalues
>>     eigs(x, -3)[0]
>>     # Retrieve the 3 biggest eigenvalues
>>     eigs(x, [0, 3], mode="index")[0]
>>     # Retrieve the 2nd and 3rd biggest
>>     eigs(x, [1, 3], mode="index")[0]
>>     # Retrieve all the eigenvalues in the range [1.5, 3.5[
>>     eigs(x, [1.5, 3.5], mode="range")[0]
>>
>>     One thing which does not feel right is that that the range in the
>>     "index" mode is exactly inverted compared to the output (i.e. if you ask
>>
>>
>> Why not use separate keywords for index and range.
> 
> Then I am not sure how to handle the "give me the k biggest/smallest" 
> case, which is the most common I think (that's the only one I care 
> personally :) ).
> 
> Maybe this just warrants several functions ? But then there is the issue 
> of naming (supporting non-symmetric matrices would be nice, but requires 
> a totally different implementation, as LAPACK does not support it AFAIK 
> - the easiest way would be to use ARPACK ATM).
> 
>> Changing the meaning 
>> of an argument using another keyword is just weird. 
> 
> Agreed - this is just the best I could came up with one function to 
> support all cases while keeping the common case simple. But I feel a bit 
> like I outsmarted myself here.
> 
>> Why the index on the end? Is the return a list?
> 
> Yes, it also returns eigenvectors
> 
>>     for the range [0, 3], you get the last three items from what you would
>>     get if you asked for the full range [0, 10]), but this is because I kept
>>     compatibility with Octave (always showing from biggest to smallest). It
>>
>>
>> Why follow Octave, isn't Octave like Matlab? Matlab functions are a mess.
> 
> Actually, there is another reason I forgot to mention: that's how eigen 
> in scipy.sparse does as well (in decreasing order). I think that the 
> ARPACK wrappers are not that well thought out, though.

I feel there's also a certain convention of listing the largest 
eigenvalues first; "let \lambda_i be the eigenvalues...assume \lambda_2 
< \lambda_1 ...".

I know I'd assume that the largest eigenvalue came out first, and I've 
barely used either MATLAB or Octave.

(Also with SVD one would typically put the largest singular values 
first, and so on.)

Just a data-point, I don't really care either way...

--

-- 
Dag Sverre
Sturla Molden | 4 Feb 11:40
Picon

Re: compilation with fort77


>>
>>
>>
>> the problem is that fort77 cannot deal with those variable sized
>> input arguments:
>>
>>      SUBROUTINE mvnun(d, n, lower, upper, means, covar, maxpts,
>>     &                   abseps, releps, value, inform)
>> ...
>>      integer n, d, infin(d), maxpts, inform, tmpinf
>>      double precision lower(d), upper(d), releps, abseps,
>>     &                 error, value, stdev(d), rho(d*(d-1)/2),
>>     &                 covar(d,d),
>>     &                 nlower(d), nupper(d), means(d,n), tmpval
>>      integer i, j
>>
>> Could some Fortran expert please help me to make it fort77  
>> compatible?

The code posted here (I haven't looked in SVN) looks like valid  
Fortran 77 to me. Dummy arguments can have variable size. f2c might  
not accept it because variable-size arrays are not supported in C89,  
and the Fortran to C conversion is rather primitive.

Sturla

>
> This file is using some F90/F95 - I am not sure what the policy is on
> this point, but don't think we want to guarantee that we will never  
> use
> fortran > F77.
>
> Can't you get gfortran running on your platform ? Cross-compiling
> gfortran  for your platform should not be difficult
>
> David
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev <at> scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
Pauli Virtanen | 4 Feb 12:30
Picon
Picon
Favicon

Re: compilation with fort77

Thu, 04 Feb 2010 11:40:52 +0100, Sturla Molden wrote:
>>> the problem is that fort77 cannot deal with those variable sized input
>>> arguments:
>>>
>>>      SUBROUTINE mvnun(d, n, lower, upper, means, covar, maxpts,
>>>     &                   abseps, releps, value, inform)
>>> ...
>>>      integer n, d, infin(d), maxpts, inform, tmpinf double precision
>>>      lower(d), upper(d), releps, abseps,
>>>     &                 error, value, stdev(d), rho(d*(d-1)/2), &       
>>>              covar(d,d),
>>>     &                 nlower(d), nupper(d), means(d,n), tmpval
>>>      integer i, j
>>>
>>> Could some Fortran expert please help me to make it fort77 compatible?
> 
> The code posted here (I haven't looked in SVN) looks like valid Fortran
> 77 to me. Dummy arguments can have variable size. f2c might not accept
> it because variable-size arrays are not supported in C89, and the
> Fortran to C conversion is rather primitive.

Note that infin, stdev, rho, nlower, and nupper are not dummy variables, 
so it's not valid F77.

--

-- 
Pauli Virtanen
David Warde-Farley | 4 Feb 20:31
Picon
Favicon
Gravatar

Re: Comments on API for Matlab's eigs equivalent (computing a few eigenvalues only)

On 4-Feb-10, at 3:12 AM, Dag Sverre Seljebotn wrote:

> I feel there's also a certain convention of listing the largest
> eigenvalues first; "let \lambda_i be the eigenvalues...assume  
> \lambda_2
> < \lambda_1 ...".

+1, though I've seen them listed from smallest to largest as well.

David
Rob Clewley | 4 Feb 21:01
Picon

Re: Comments on API for Matlab's eigs equivalent (computing a few eigenvalues only)

On Thu, Feb 4, 2010 at 2:31 PM, David Warde-Farley <dwf <at> cs.toronto.edu> wrote:
> On 4-Feb-10, at 3:12 AM, Dag Sverre Seljebotn wrote:
>
>> I feel there's also a certain convention of listing the largest
>> eigenvalues first; "let \lambda_i be the eigenvalues...assume
>> \lambda_2
>> < \lambda_1 ...".
>
> +1, though I've seen them listed from smallest to largest as well.
>
> David

I'm +1 too, but are these going to be ordered by norm of the
eigenvalue if they are complex, or by the real part only? I suggest by
norm, and the docstring needs to make the choice clear. This
functionality has to be consistent for all uses of eig()!

-Rob

Gmane