Johannes Kulick | 21 Jul 11:06 2014

Pull Request: Dirichlet Distribution


I sent a pull request that implements a Dirichlet distribution. Code review
would be appreciated!
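For anyone reviewing, here is a quick sketch of how the new distribution would typically be exercised, assuming the `scipy.stats.dirichlet` interface proposed in the pull request follows the usual `scipy.stats` conventions (`pdf`, `mean`, `rvs`):

```python
import numpy as np
from scipy.stats import dirichlet

alpha = [2.0, 3.0, 5.0]   # concentration parameters
x = [0.2, 0.3, 0.5]       # a point on the probability simplex

density = dirichlet.pdf(x, alpha)   # density at x
mean = dirichlet.mean(alpha)        # alpha / sum(alpha)
samples = dirichlet.rvs(alpha, size=4, random_state=0)  # 4 draws, shape (4, 3)
```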

Johannes Kulick


Question: What is the weird attachment to all my emails?
SciPy-Dev mailing list
SciPy-Dev <at>
Moritz Emanuel Beber | 21 Jul 10:09 2014

computing pairwise distance of vectors with missing (nan) values

Dear all,

My basic problem is that I would like to compute distances between vectors with missing values. You can find more detail in my question on SO. Since it seems this is not directly possible with scipy at the moment, I started to Cythonize my function. Currently, the function below is not much faster than my pure Python implementation, so I thought I'd ask the experts here. Note that even though I'm computing the Euclidean distance, I'd like to make use of different distance metrics.

So my current attempt at Cythonizing is:

import numpy
cimport numpy
cimport cython
from numpy.linalg import norm


@cython.boundscheck(False)
@cython.wraparound(False)
def masked_euclidean(numpy.ndarray[numpy.double_t, ndim=2] data):
    cdef Py_ssize_t m = data.shape[0]
    cdef Py_ssize_t i = 0
    cdef Py_ssize_t j = 0
    cdef Py_ssize_t k = 0
    cdef numpy.ndarray[numpy.double_t] dm = numpy.zeros(m * (m - 1) // 2, dtype=numpy.double)
    cdef numpy.ndarray[numpy.uint8_t, ndim=2, cast=True] mask = numpy.isfinite(data) # boolean
    for i in range(m - 1):
        for j in range(i + 1, m):
            curr = numpy.logical_and(mask[i], mask[j])
            u = data[i][curr]
            v = data[j][curr]
            dm[k] = norm(u - v)
            k += 1
    return dm

Maybe the lack of speed-up is due to the Python function 'norm'? So my question is: how can I improve the Cython implementation? Or is there a completely different way of approaching this problem?
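For comparison, a NumPy-only version that avoids the Python-level double loop entirely is sometimes fast enough on its own. This is a sketch (the function name is mine, not a scipy API) that computes the same condensed distance vector via broadcasting and matrix products:

```python
import numpy as np

def masked_euclidean_np(data):
    """Condensed pairwise Euclidean distances, ignoring NaN coordinates.

    For each pair (i, j), only coordinates finite in *both* rows
    contribute, matching the Cython version above.
    """
    data = np.asarray(data, dtype=float)
    filled = np.nan_to_num(data)               # NaNs -> 0, so they drop out of sums
    mask = np.isfinite(data).astype(float)
    sq = filled ** 2
    # sum over shared coords of (u_k - v_k)^2
    #   = sum(u_k^2 where v finite) + sum(v_k^2 where u finite) - 2 * u.v
    cross = filled @ filled.T
    u2 = sq @ mask.T
    d2 = u2 + u2.T - 2.0 * cross
    iu = np.triu_indices(data.shape[0], k=1)   # same (i < j) order as the loop
    return np.sqrt(np.maximum(d2[iu], 0.0))
```

The algebraic trick is that, with NaNs replaced by zeros, the masked squared distance expands into three matrix products, so the whole condensed matrix comes out of BLAS calls instead of a Python loop.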

Thanks in advance,
Alexander Behringer | 18 Jul 11:53 2014

Is Brent's method for minimizing the value of a function implemented twice in SciPy?


While studying the SciPy documentation, I noticed that the 'brent' and
the 'fminbound' functions in the 'scipy.optimize' package both seem to
implement Brent's method for function minimization.

Both functions have been implemented by Travis Oliphant (see commit
infos below).

One minor difference is that the 'brent' function _optionally_ allows
for auto-bracketing via the help of the 'bracket' function, when
supplied with only two bounds via the 'brack' parameter instead of the
triplet required by Brent's algorithm.

So is it possible, that Brent's method has been implemented twice?

'fminbound' was added in 2001:

'brent' was added approximately three-quarters of a year later in 2002:

The 'brent' code has later been moved into a separate internal class:
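For what it's worth, the two entry points are easy to compare directly; a small sketch (the test function is mine) showing both converging to the same minimizer:

```python
from scipy.optimize import brent, fminbound

def f(x):
    return (x - 2.0) ** 2 + 1.0

# 'brent': unconstrained Brent minimization; a two-element 'brack'
# triggers the automatic downhill bracketing mentioned above.
x_brent = brent(f, brack=(0.0, 3.0))

# 'fminbound': Brent-style minimization restricted to a fixed interval.
x_fminb = fminbound(f, 0.0, 4.0)
```

Both return x close to 2.0 here; the user-visible difference is only that fminbound enforces the bounds while brent brackets freely.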

Alexander Behringer
Yoshiki Vazquez Baeza | 16 Jul 20:06 2014

Adding Procrustes to SciPy


There seems to be some interest in adding Procrustes analysis to SciPy.
There is an existing implementation in scikit-bio (a package of which I
am a developer) that could probably be ported.

The thing that's not particularly clear is where this code should live:
the suggestion by Ralf Gommers is "linalg", whereas skbio puts the code
inside its "spatial" submodule.

This is the GitHub issue where this was initially discussed:
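For reference on what would be ported, the core of Procrustes analysis fits in a few lines of NumPy. A minimal sketch (the function name is mine, mirroring the usual translate/scale/rotate steps, not the skbio code itself):

```python
import numpy as np

def procrustes_disparity(X, Y):
    """Superimpose Y onto X (translate, scale, rotate) and return the
    residual disparity: the minimized sum of squared differences after
    both configurations are normalized."""
    X0 = np.asarray(X, float) - np.mean(X, axis=0)   # center at origin
    Y0 = np.asarray(Y, float) - np.mean(Y, axis=0)
    X0 /= np.linalg.norm(X0)                         # unit Frobenius norm
    Y0 /= np.linalg.norm(Y0)
    # Optimal rotation comes from the SVD of the cross-covariance matrix.
    U, s, Vt = np.linalg.svd(X0.T @ Y0)
    R = Vt.T @ U.T                                   # may include a reflection
    scale = s.sum()                                  # optimal scaling of Y0
    Z = scale * Y0 @ R                               # Y superimposed on X0
    return np.sum((X0 - Z) ** 2)
```

A configuration that is just a rotated, scaled, translated copy of X comes back with disparity ~0, which is the basic sanity check for any port.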


Julian Taylor | 15 Jul 20:06 2014

__numpy_ufunc__ and 1.9 release

As you may know, we want to release numpy 1.9 soon. We should have solved
most of the indexing regressions the first beta showed.

The remaining blocker is finishing the new __numpy_ufunc__ feature.
This feature should provide an alternative method for overriding the
behavior of ufuncs from subclasses.
It is described here:

The current blocker issues are:

I'm not too familiar with all the complications of subclassing, so I can't
really say how hard this is to solve.
My issue is that there still seems to be debate on how to handle
operator overriding correctly, and I am opposed to releasing a numpy with
yet another experimental feature that may or may not be finished
sometime later. Having datetime in an infinite experimental state is bad
enough.
I think nobody is served well if we release 1.9 with the feature
prematurely, based on a non-representative set of users, and then later,
after more users have shown up, see that we have to change its behavior.

So I'm wondering: should we delay the introduction of this feature to
1.10, or is it important enough to wait until there is a consensus on the
remaining issues?
Sai Rajeshwar | 16 Jul 11:55 2014

building scipy with umfpack and amd

I'm running code which uses scipy.signal.convolve and numpy.sum extensively. I ran the code on two machines; one took much less time than the other with the same configuration. I checked the scipy configuration on the faster machine and found that its scipy is built with UMFPACK and AMD.

Is this the reason behind it? In what way do UMFPACK and AMD aid scipy operations?


    libraries = ['blas']
    library_dirs = ['/usr/lib64']
    language = f77

    libraries = ['amd']
    library_dirs = ['/usr/lib64']
    define_macros = [('SCIPY_AMD_H', None)]
    swig_opts = ['-I/usr/include/suitesparse']
    include_dirs = ['/usr/include/suitesparse']

    libraries = ['lapack']
    library_dirs = ['/usr/lib64']
    language = f77


    libraries = ['blas']
    library_dirs = ['/usr/lib64']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]


    libraries = ['umfpack', 'amd']
    library_dirs = ['/usr/lib64']
    define_macros = [('SCIPY_UMFPACK_H', None), ('SCIPY_AMD_H', None)]
    swig_opts = ['-I/usr/include/suitesparse', '-I/usr/include/suitesparse']
    include_dirs = ['/usr/include/suitesparse']

thanks a lot for your replies in advance

with regards..

M. Sai Rajeswar
M-tech  Computer Technology
IIT Delhi
----------------------------------Cogito Ergo Sum---------
Sai Rajeshwar | 10 Jul 12:08 2014

scipy improve performance by parallelizing

hi all,

I'm trying to optimise Python code that spends a huge amount of time in scipy functions such as scipy.signal.convolve. Following are some of my queries regarding this; it would be great to hear from you. Thanks!
1) Can scipy take advantage of multiple cores? If so, how?
2) What are the ways we can improve the performance of scipy/numpy functions, e.g. using OpenMP, MPI, etc.?
3) If scipy internally uses BLAS/MKL libraries, can we enable parallelism through these?

Looks like I have to work on the internals of scipy. Thanks a lot!
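Not an authoritative answer, but regarding (3): scipy.signal.convolve and numpy's elementwise operations run single-threaded, while anything routed through BLAS can use several cores automatically if numpy is linked against a threaded BLAS (OpenBLAS, MKL, threaded ATLAS). A quick sketch of how to check what your build links against:

```python
import numpy as np

# Show which BLAS/LAPACK libraries this numpy build is linked against.
# With a threaded BLAS (OpenBLAS, MKL), operations routed through BLAS
# can use several cores automatically, with no changes to your code.
np.show_config()

a = np.random.rand(500, 500)
b = np.random.rand(500, 500)
c = np.dot(a, b)   # BLAS dgemm: may run multi-threaded
d = a * b          # elementwise ufunc: single-threaded in numpy
```

If show_config() reports only a plain reference BLAS, rebuilding against a threaded BLAS is usually the easiest first step; for convolve itself, coarse-grained parallelism (e.g. multiprocessing over independent inputs) is the common workaround.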

with regards..

M. Sai Rajeswar
M-tech  Computer Technology
IIT Delhi
----------------------------------Cogito Ergo Sum---------
Alexandre Gramfort | 4 Jul 13:44 2014

griddata equivalent of matlab "V4" option


scipy does not seem to implement the griddata 'v4' option from MATLAB.

The original reference is :

David T. Sandwell, Biharmonic spline interpolation of GEOS-3 and
SEASAT altimeter data, Geophysical Research Letters, 14(2), 139-142, 1987.

Is there a reason for this?

Should I look for it somewhere else?
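For reference, the 'v4' method is exactly the scheme from the Sandwell paper above: the surface is a weighted sum of biharmonic Green's functions g(r) = r^2 (ln r - 1) centered at the data points, with weights obtained from a dense linear solve. A minimal sketch (function names are mine; this is not a scipy API):

```python
import numpy as np

def greens(r):
    # Biharmonic Green's function g(r) = r^2 * (log(r) - 1); g(0) = 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        g = r ** 2 * (np.log(r) - 1.0)
    g[r == 0.0] = 0.0
    return g

def biharmonic_interp(x, y, v, xi, yi):
    """Sandwell (1987) biharmonic spline, as in MATLAB griddata(..., 'v4')."""
    x, y, v = (np.asarray(a, float) for a in (x, y, v))
    xi, yi = np.asarray(xi, float), np.asarray(yi, float)
    # Dense Green's matrix over all pairs of data points -> spline weights.
    d = np.hypot(x[:, None] - x[None, :], y[:, None] - y[None, :])
    w = np.linalg.solve(greens(d), v)
    # Evaluate the weighted sum of Green's functions at the query points.
    dq = np.hypot(xi[:, None] - x[None, :], yi[:, None] - y[None, :])
    return greens(dq) @ w
```

The dense solve makes it O(n^3) in the number of data points, which may be one reason scipy never picked it up; it shines mainly for small scattered data sets where smoothness matters.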

Dayvid Victor | 2 Jul 13:51 2014

scikit package naming/submission


I am creating a new package which, for now, is out of scope for sklearn, so I created scikit-protopy.

So far I have seen the following use:
  • from scikits.protopy import *
  • from skprotopy import *
But I'm currently using:
  • from protopy import *
Is this OK? Will it be included in scikits when submitted to PyPI?

Dayvid Victor R. de Oliveira
PhD Candidate in Computer Science at Federal University of Pernambuco (UFPE)
MSc in Computer Science at Federal University of Pernambuco (UFPE)
BSc in Computer Engineering - Federal University of Pernambuco (UFPE)
Matt Newville | 1 Jul 21:14 2014

Re: SciPy-Dev Digest, Vol 129, Issue 1

> Date: Tue, 1 Jul 2014 00:05:43 +0000 (UTC)
> From: Sturla Molden <sturla.molden <at>>
> Subject: Re: [SciPy-Dev] SciPy-Dev Digest, Vol 128, Issue 5
> To: scipy-dev <at>
> Message-ID:
>         < <at>>
> Content-Type: text/plain; charset=UTF-8
> Matt Newville <newville <at>> wrote:
> > I don't disagree that scipy could use more pure optimizers, but I also
> > think that striving for a more consistent and elegant interface to these
> > would be very helpful.  With the notable exception of the relatively recent
> > unification of the scaler minimizers with minimize(), it seems that many of
> > the existing methods are fairly bare-bones wrappings of underlying C or
> > Fortran code.   Of course, having such wrapping is critically important,
> > but I think there is a need for a higher level interface as well.
> The raison d'etre for SciPy is "nice to use". So clearly simple and
> intuitive high-level interfaces are needed. If we only cared about speed we
> should all be coding in Fortran 77. Personally I am willing to sacrifice a
> lot of speed for a nice high-level interface.

The solvers in scipy.optimize use interfaces that are clearly inherited directly from the Fortran, with very little change, even in the use of short argument names. In some ways that makes it easy for old-timers who see leastsq() as a shallow wrapping of MINPACK's lmdif/lmder. It's "nice" in that the objective function is written in Python, but the interfaces to these functions themselves are not very Pythonic.
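As a concrete illustration, here is roughly what calling leastsq looks like today; the residual-function-plus-x0 convention and the (x, cov_x, infodict, mesg, ier) output tuple come straight from lmdif (a sketch with made-up data):

```python
import numpy as np
from scipy.optimize import leastsq

# Synthetic data: y = a * exp(-b * t) plus a small perturbation.
t = np.linspace(0.0, 4.0, 50)
y = 2.5 * np.exp(-1.3 * t) + 0.01 * np.sin(17.0 * t)

def residual(p, t, y):
    a, b = p
    return y - a * np.exp(-b * t)

# The MINPACK heritage shows: full_output unpacks into five values,
# with the integer 'ier' status flag and 'mesg' string from lmdif.
popt, cov_x, infodict, mesg, ier = leastsq(
    residual, [1.0, 1.0], args=(t, y), full_output=True)
```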

> Currently my main interest in SciPy's LM is the underlying solver, though.
> It's a very old Fortran code that even supplies its own linear algebra
> solvers because it was written before LAPACK. It's not very nice on modern
> computers, for various reasons.

Could you elaborate? What do you see as the main problems, and why is this not very nice on modern computers?

--Matt Newville
Fernando Perez | 29 Jun 04:31 2014

BoF at Scipy'14 about Numpy's future.

Hi folks,

[ Please excuse the repost, this is just to try to get as many good ideas as possible in the agenda]

I've just created a page on the numpy wiki:

I'd appreciate it if people could put down on that page the titles of the topics they'd like to see discussed, plus a brief summary for each (with links to this or other relevant threads).

BoFs can be very useful but they can also easily devolve into random digressions. I'll do my best to fulfill my role as discussion facilitator by pushing for an agenda for the discussion that can fit realistically in the time we have, and trying to keep us on track.

It would help a lot if people put their ideas in advance, so we can sort/refine/prioritize a little bit. Hopefully this will lead to a more productive conversation.



Fernando Perez ( <at> fperez_org; mailing lists only (I ignore this when swamped!)
fernando.perez-at-berkeley: contact me here for any direct mail