Andrea Gavana | 19 Aug 16:23 2014

Numpy/Fortran puzzle (?)

Hi All,

    I have the following (very ugly) line of code:

all_results = np.asarray([transm_hist[date_idx, :, idx_main_set[date_idx]] *
                          main_flow[date_idx, 0:n_fluids]
                          for date_idx in xrange(n_dates)])

where transm_hist.shape = (n_dates, n_fluids, n_nodes), main_flow.shape = (n_dates, n_fluids), and idx_main_set is an array of integer indices with idx_main_set.shape = (n_dates,). The resulting variable has all_results.shape = (n_dates, n_fluids).
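For reference, the loop can also be written as a single fancy-indexing expression (a sketch with made-up data, assuming `idx_main_set` holds 0-based node indices as in the original code):

```python
import numpy as np

n_dates, n_fluids, n_nodes = 5, 3, 7
rng = np.random.default_rng(0)
transm_hist = rng.random((n_dates, n_fluids, n_nodes))
main_flow = rng.random((n_dates, n_fluids))
idx_main_set = rng.integers(0, n_nodes, size=n_dates)

# Loop version, as in the list comprehension above:
loop = np.asarray([transm_hist[d, :, idx_main_set[d]] * main_flow[d, 0:n_fluids]
                   for d in range(n_dates)])

# Fully vectorized equivalent: the advanced indices on axes 0 and 2
# broadcast together, and the slice keeps the fluid axis untouched,
# giving a (n_dates, n_fluids) result directly.
vec = transm_hist[np.arange(n_dates), :, idx_main_set] * main_flow
```

Both give the same (n_dates, n_fluids) array, without the Python-level loop.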

Since that line of code is relatively slow when executed repeatedly, I thought it would be smart to rewrite it in Fortran and then use f2py to wrap the subroutine. So I wrote this:

subroutine matmul(transm_hist, idx_main_set, main_flow, all_results, &
                  n_dates, n_fluids, n_nodes)

  implicit none

  integer ( kind = 4 ), intent(in) :: n_dates, n_fluids, n_nodes

  real    ( kind = 4 ), intent(in) :: transm_hist(n_dates, n_fluids, n_nodes)
  real    ( kind = 4 ), intent(in) :: main_flow(n_dates, n_fluids)
  integer ( kind = 4 ), intent(in) :: idx_main_set(n_dates)
  real    ( kind = 4 ), intent(out):: all_results(n_dates, n_fluids)

  integer (kind = 4) i, node
  do i = 1, n_dates
      node = int(idx_main_set(i))
      all_results(i, :) = transm_hist(i, 1:n_fluids, node)*main_flow(i, 1:n_fluids)
  end do

end subroutine matmul

Unfortunately, it appears that I am not getting quite the same results... I know it's a bit of a stretch with so little information, but does anyone have a suggestion as to where the culprit might be? Maybe the elementwise multiplication is done differently in Numpy and Fortran, or I am misunderstanding what np.asarray is doing with the list comprehension above?

I'd appreciate any suggestion, including ones about improving the code. Thank you in advance.


"Imagination Is The Only Weapon In The War Against Reality."

NumPy-Discussion mailing list
NumPy-Discussion <at>
Charles R Harris | 15 Aug 19:03 2014

Question about OpenBLAS affinity setting.

Hi All,

Seeking some discussion about setting the OpenBLAS affinity. Currently this is set at import time:

    try:
        # disables openblas affinity setting of the main thread that limits
        # python threads or processes to one core
        if 'OPENBLAS_MAIN_FREE' not in os.environ:
            os.environ['OPENBLAS_MAIN_FREE'] = '1'
        if 'GOTOBLAS_MAIN_FREE' not in os.environ:
            os.environ['GOTOBLAS_MAIN_FREE'] = '1'
        from ._dotblas import dot, vdot, inner
    except ImportError:
        ...

Note that the affinity is set whether or not the import of _dotblas fails, which it will if cblas was not found at build time. This all seems a bit hinky to me. If we are always going to set the affinity, it should be in the core/ file. If the setting is moved after the import, it will still happen if, say, ATLAS cblas is present.

It seems to me that there should be a better, more transparent way to do this, especially as I'm in the process of moving some of the cblas down into multiarray. One option is to make this part of the multiarray module import, but without a bit of work, that would still depend only on cblas being detected during the build process rather than on OpenBLAS specifically. Also, is GotoBLAS still a viable option?
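As an illustration of the current interaction: because numpy only sets the variable when it is absent, a user who wants OpenBLAS's main-thread pinning back can override it before the first numpy import (a sketch of the intended behavior, not an endorsement of the mechanism):

```python
import os

# numpy's check is "if 'OPENBLAS_MAIN_FREE' not in os.environ", so a
# value set explicitly before the first "import numpy" is left alone.
os.environ['OPENBLAS_MAIN_FREE'] = '0'

import numpy as np  # does not overwrite the user's choice
```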


Sebastian Berg | 15 Aug 15:46 2014

Weighted Covariance/correlation

Hi all,

Tom Poole has opened a pull request to implement weights in
np.cov (correlation could be added as well), somewhat picking up the effort
started by Noel Dawe.

The pull request would currently implement an accuracy-type `weights`
keyword argument as the default, but have a switch `repeat_weights` to use
repeat-type weights instead (frequency-type weights are a special case of
these, I believe).

As far as I can see, the code is in a state that it can be tested. But
since it is a new feature, the names/defaults are up for discussion, so
maybe someone who might use such a feature has a preference. I know we
had a short discussion about this before, but it was a while ago. For
example another option would be to have the two weights as two keyword
arguments, instead of a boolean switch.
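As a point of reference for the discussion, repeat-type (frequency) weights have a simple operational definition: the result must agree with literally repeating each observation. A minimal sketch of that identity (the function name `cov_repeat_weights` is hypothetical, not the proposed API):

```python
import numpy as np

def cov_repeat_weights(x, w):
    # Covariance with repeat-type weights: column i of x counts w[i] times.
    # Denominator w.sum() - 1 matches np.cov's default ddof=1.
    w = np.asarray(w, dtype=float)
    mean = (x * w).sum(axis=1) / w.sum()
    d = x - mean[:, None]
    return (d * w) @ d.T / (w.sum() - 1)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 5))
w = np.array([1, 2, 1, 3, 1])
# This should equal np.cov(np.repeat(x, w, axis=1)), i.e. plain np.cov
# on the physically repeated observations.
```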


Pierre-Andre Noel | 14 Aug 21:07 2014

Proposed new feature for numpy.einsum: repeated output subscripts as diagonal

(I created issue 4965 earlier today on this topic, and I have been
advised to email this mailing list to discuss whether it is a good
idea or not. I include my original post as-is, followed by additional
comments.)

I think that the following new feature would make `numpy.einsum` even 
more powerful/useful/awesome than it already is. Moreover, the change 
should not interfere with existing code, it would preserve the 
"minimalistic" spirit of `numpy.einsum`, and the new functionality would 
integrate in a seamless/intuitive manner for the users.

In short, the new feature would allow for repeated subscripts to appear 
in the "output" part of the `subscripts` parameter (i.e., on the 
right-hand side of `->`). The corresponding dimensions in the resulting 
`ndarray` would only be filled along their diagonal, leaving the off 
diagonal entries to the default value for this `dtype` (typically zero). 
Note that the current behavior is to raise an exception when repeated 
output subscripts are being used.

This is simplest to describe with an example involving the dual behavior 
of `numpy.diag`.

# Assumes: from numpy import arange, diag, einsum, eye
# Extracting the diagonal of a 2-D array.
A = arange(16).reshape(4,4)
print(diag(A)) # Output: [ 0 5 10 15 ]
print(einsum('ii->i', A)) # Same as previous line (current behavior).

# Constructing a diagonal 2-D array.
v = arange(4)
print(diag(v)) # Output: [[0 0 0 0] [0 1 0 0] [0 0 2 0] [0 0 0 3]]
print(einsum('i->ii', v)) # New behavior would be same as previous line.
# The current behavior of the previous line is to raise an exception.

In contrast to `numpy.diag`, the approach generalizes to higher
dimensions: `einsum('iii->i', A)` extracts the diagonal of a 3-D array, 
and `einsum('i->iii', v)` would build a diagonal 3-D array.
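In the meantime, the construction direction can be emulated by exploiting the fact that `einsum('ii->i', out)` applied to an existing array returns a writable view of its diagonal (NumPy >= 1.10); `np.fill_diagonal` works just as well:

```python
import numpy as np

v = np.arange(4)
out = np.zeros((4, 4), dtype=v.dtype)
# 'ii->i' on an existing array gives a writable view of the diagonal,
# so assigning through it emulates the proposed 'i->ii'.
np.einsum('ii->i', out)[:] = v
```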

The proposed behavior really starts to shine in more intricate cases.

# Dummy values, these should be probabilities to make sense below.
P_w_ab = arange(24).reshape(3,2,4)
P_y_wxab = arange(144).reshape(3,3,2,2,4)

# With the proposed behavior, the following two lines should be equivalent.
P_xyz_ab = einsum('wab,xa,ywxab,zy->xyzab', P_w_ab, eye(2), P_y_wxab, eye(3))
also_P_xyz_ab = einsum('wab,ywaab->ayyab', P_w_ab, P_y_wxab)

If this is not convincing enough, replace `eye(2)` by 
`eye(P_w_ab.shape[1])` and replace `eye(3)` by `eye(P_y_wxab.shape[0])`, 
then imagine more dimensions and repeated indices... The new notation 
would allow for crisper code and reduce the opportunities for dumb 
mistakes.

For those who wonder, the above computation amounts to 
$P(X=x,Y=y,Z=z|A=a,B=b) = \sum_w P(W=w|A=a,B=b) P(X=x|A=a) 
P(Y=y|W=w,X=x,A=a,B=b) P(Z=z|Y=y)$ with $P(X=x|A=a)=\delta_{xa}$ and 
$P(Z=z|Y=y)=\delta_{zy}$ (using LaTeX notation, where $\delta_{ij}$ is 
Kronecker's delta).

(End of original post.)

I have been told by @jaimefrio that "The best way of getting a new 
feature into numpy is putting it in yourself." Hence, if discussions 
here do reveal that this is a good idea, then I may give a try at coding 
it myself. However, I currently know nothing of the inner workings of 
numpy/ndarray/einsum, and I have higher priorities right now. This means 
that it could take a long while before I contribute any code, if I ever 
do. Hence, if anyone feels like doing it, feel free to do so!

Also, I am aware that storing a lot of zeros in an `ndarray` may not, a 
priori, be a desirable avenue. However, there are times where you have 
to do it: think of `numpy.eye` as an example. In my case of application, 
I use such diagonal structures in the initialization of an `ndarray` 
which is later updated through an iterative process. After these 
iterations, most of the zeros will be gone. Do other people see a use 
for such capabilities?

Thank you for your time and have a nice day.


Pierre-André Noël
Matthew Brett | 12 Aug 21:24 2014

Fwd: We need help working on code coverage for Cython code


Sorry for those of you also on the scikit-image mailing list - but
here again I'm asking for help to get coverage working for Cython code.

Over on another mailing list, we've hit a big problem trying to work
out coverage on a large amount of Cython code.

As y'all probably know, there's no automated way of checking code
coverage on Cython code at the moment.

The Cython developers have done some work on this [1] but it is
currently stalled for lack of developer time to work on it.

We'd really like to get this working, and the Cython developers have
offered to help, to get this started.

Can anyone help us out by a) joining an interactive discussion with the
Cython developers for 15 minutes or so to get us started, and b) helping
with the short burst of coding that will follow (we estimate a few days)?

I think this is something many of us need, and it would also be a
thank you to the Cython team for their work, which we all use so much.



Warren Weckesser | 12 Aug 17:35 2014

New function `count_unique` to generate contingency tables.

I created a pull request that defines the function `count_unique`.  `count_unique` generates a contingency table from a collection of sequences.  For example,

In [7]: x = [1, 1, 1, 1, 2, 2, 2, 2, 2]

In [8]: y = [3, 4, 3, 3, 3, 4, 5, 5, 5]

In [9]: (xvals, yvals), counts = count_unique(x, y)

In [10]: xvals
Out[10]: array([1, 2])

In [11]: yvals
Out[11]: array([3, 4, 5])

In [12]: counts
Out[12]:
array([[3, 1, 0],
       [1, 1, 3]])

It can be interpreted as a multi-argument generalization of `np.unique(x, return_counts=True)`.
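For readers who want to experiment before the pull request lands, here is a minimal sketch of the 2-D case (an illustration only, not the PR's actual implementation):

```python
import numpy as np

def count_unique(x, y):
    # Map each value to its index among the sorted unique values,
    # then count the (x, y) pairs with an unbuffered scatter-add.
    xvals, xinv = np.unique(x, return_inverse=True)
    yvals, yinv = np.unique(y, return_inverse=True)
    counts = np.zeros((len(xvals), len(yvals)), dtype=int)
    np.add.at(counts, (xinv, yinv), 1)
    return (xvals, yvals), counts
```

With the `x` and `y` from the example above, this reproduces the `[[3, 1, 0], [1, 1, 3]]` table.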

It overlaps with Pandas' `crosstab`, but I think this is a pretty fundamental counting operation that fits in numpy.

Matlab's `crosstab` and R's `table` perform the same calculation (with a few more bells and whistles).

For comparison, here's Pandas' `crosstab` (same `x` and `y` as above):

In [28]: import pandas as pd

In [29]: xs = pd.Series(x)

In [30]: ys = pd.Series(y)

In [31]: pd.crosstab(xs, ys)
Out[31]:
col_0  3  4  5
row_0
1      3  1  0
2      1  1  3

And here is R's `table`:

> x <- c(1,1,1,1,2,2,2,2,2)
> y <- c(3,4,3,3,3,4,5,5,5)
> table(x, y)
   y
x   3 4 5
  1 3 1 0
  2 1 1 3

Is there any interest in adding this (or some variation of it) to numpy?


Matti Picus | 11 Aug 23:46 2014

Re: NumPy-Discussion OpenBLAS and dotblas

Hi Nathaniel.
Thanks for your prompt reply. I think numpy is a wonderful project, and 
you all do a great job moving it forward.
If you ask what would my vision for maturing numpy, I would like to see 
a grouping of linalg matrix-operation functionality into a python level 
package, exactly the opposite of more tightly tying linalg into the core 
of numpy. The orthogonality would allow groups like PyOpenCL to reuse the 
matrix operations on data located off the CPU's RAM, just to give one 
example; and make it easier for non-numpy developers to create a 
complete replacement of lapack with other implementations. Much of the 
linalg package would of course be implemented in c or fortran, but the 
interface to ndarray would use the well-established idea of contiguous 
matrices with shapes, strides, and a single memory store, supporting 
only numeric number types.
I suggested cffi since it provides a convenient and efficient interface 
to ndarray. Thus python could remain as a thin wrapper over the calls 
out to c-based libraries much like lapack_lite does today, but at the 
python level rather than the capi level.
Yes, a python-based interface would slow the code down a bit, but I 
would argue that
1. the current state of lapack_litemodule.c and umath_linalg.c.src, with 
their myriad of compile-time macros and complex code paths, scares people 
away from contributing to the ongoing maintenance of the library while 
tying the code very closely to the lapack routines, and
2. matrices larger than 3x3 or so should be spending most of the 
computation time in the underlying lapack/blas library regardless of 
whether the interface is python-based or capi-based.

On 10/08/2014 8:00 PM, numpy-discussion-request <at> wrote:
> Date: Sat, 9 Aug 2014 21:11:19 +0100
> From: Nathaniel Smith <njs <at>>
> Subject: Re: [Numpy-discussion] NumPy-Discussion OpenBLAS and dotblas
> To: Discussion of Numerical Python <numpy-discussion <at>>
> On Sat, Aug 9, 2014 at 8:35 PM, Matti Picus <matti.picus <at>> wrote:
>> Hi. I am working on numpy in pypy. It would be much more challenging for
>> me if you merged more code into the core of numpy,
> Hi Matti,
> I can definitely see how numpy changes cause trouble for you, and
> sympathize. But, can you elaborate on what kind of changes would make
> your life easier *that also* help make numpy proper better in their
> own right? Because unfortunately, I don't see how we can reasonably
> pass up on improvements to numpy if the only justification is to make
> numpypy's life easier. (I'd also love to see pypy become usable for
> general numerical work, but not only is it not there now, I don't see
> how numpypy will ultimately get us there even if we do help it along
> -- almost none of the ecosystem can get by numpy's python-level APIs
> alone.) But obviously if there are changes that are mutually
> beneficial, well then, that's a lot easier to justify :-)
> -n
Kurt Smith | 11 Aug 23:09 2014

ANN: DistArray 0.5 release

DistArray 0.5 release

**License:** Three-clause BSD

**Python versions:** 2.7, 3.3, and 3.4

**OS support:** \*nix and Mac OS X

What is DistArray?

DistArray aims to bring the ease-of-use of NumPy to data-parallel
high-performance computing.  It provides distributed multi-dimensional NumPy
arrays, distributed ufuncs, and distributed IO capabilities.  It can
efficiently interoperate with external distributed libraries like Trilinos.
DistArray works with NumPy and builds on top of it in a flexible and natural way.

0.5 Release

Noteworthy improvements in this release include:

* closer alignment with NumPy's API,
* support for Python 3.4 (existing support for Python 2.7 and 3.3),
* a performance-oriented MPI-only mode for deployment on clusters,
* a way to register user-defined functions to be callable locally on worker
  processes,
* more consistent naming of sub-packages,
* testing with MPICH2 (already tested against OpenMPI),
* improved and expanded examples,
* installed version testable via ``distarray.test()``, and
* performance and scaling improvements.

With this release, DistArray is ready for real-world testing and deployment.  The
project is still evolving rapidly and we appreciate the continued input from
the larger scientific-Python community.

Existing features


* supports NumPy-like slicing, reductions, and ufuncs on distributed
  multidimensional arrays;
* has a client-engine process design -- data resides on the worker processes,
  commands are initiated from master;
* allows full control over what is executed on the worker processes and
  integrates transparently with the master process;
* allows direct communication between workers, bypassing the master process
  for scalability;
* integrates with IPython.parallel for interactive creation and exploration of
  distributed data;
* supports distributed ufuncs (currently without broadcasting);
* builds on and leverages MPI via MPI4Py in a transparent and user-friendly
  way;
* has basic support for unstructured arrays;
* supports user-controllable array distributions across workers (block,
  cyclic, block-cyclic, and unstructured) on a per-axis basis;
* has a straightforward API to control how an array is distributed;
* has basic plotting support for visualization of array distributions;
* separates the array’s distribution from the array’s data -- useful for
  slicing, reductions, redistribution, broadcasting, and other operations;
* implements distributed random arrays;
* supports ``.npy``-like flat-file IO and hdf5 parallel IO (via ``h5py``);
  leverages MPI-based IO parallelism in an easy-to-use and transparent way;
* supports the distributed array protocol [protocol]_, which allows
  independently developed parallel libraries to share distributed arrays
  without copying, analogous to the PEP-3118 new buffer protocol.

Planned features and roadmap

Near-term features and improvements include:

* array re-distribution capabilities;
* lazy evaluation and deferred computation for latency hiding;
* interoperation with Trilinos [Trilinos]_; and
* distributed broadcasting support.

The longer-term roadmap includes:

* Integration with other packages [petsc]_ that subscribe to the distributed
  array protocol [protocol]_;
* Distributed fancy indexing;
* Out-of-core computations;
* Support for distributed sorting and other non-trivial distributed
  algorithms; and
* End-user control over communication and temporary array creation, and other
  performance aspects of distributed computations.

History and funding

Brian Granger started DistArray as a NASA-funded SBIR project in 2008.
Enthought picked it up as part of a DOE Phase II SBIR [SBIR]_ to provide a
generally useful distributed array package.  It builds on NumPy, MPI, MPI4Py,
IPython, IPython.parallel, and interfaces with the Trilinos suite of
distributed HPC solvers (via PyTrilinos [Trilinos]_).

This material is based upon work supported by the Department of Energy under
Award Number DE-SC0007699.

This report was prepared as an account of work sponsored by an agency of the
United States Government.  Neither the United States Government nor any agency
thereof, nor any of their employees, makes any warranty, express or implied,
or assumes any legal liability or responsibility for the accuracy,
completeness, or usefulness of any information, apparatus, product, or process
disclosed, or represents that its use would not infringe privately owned
rights.  Reference herein to any specific commercial product, process, or
service by trade name, trademark, manufacturer, or otherwise does not
necessarily constitute or imply its endorsement, recommendation, or favoring
by the United States Government or any agency thereof.  The views and opinions
of authors expressed herein do not necessarily state or reflect those of the
United States Government or any agency thereof.

.. [Trilinos]
Matti Picus | 9 Aug 21:35 2014

Re: NumPy-Discussion OpenBLAS and dotblas

Hi. I am working on numpy in pypy. It would be much more challenging for 
me if you merged more code into the core of numpy; that means even more 
must be duplicated when using a different underlying ndarray implementation.
If you are thinking of touching linalg/lapack_lite/blas_lite, I would 
prefer a simpler model that uses ndarray as a simple storage container, 
does not use CPython-specific C-API extensions with reference counting, 
and interfaces with python via cffi[0].


On 8/08/2014 8:00 PM, numpy-discussion-request <at> wrote:
> Message: 1
> Date: Thu, 7 Aug 2014 20:32:21 -0600
> From: Charles R Harris <charlesr.harris <at>>
> Subject: [Numpy-discussion] OpenBLAS and dotblas
> To: numpy-discussion <numpy-discussion <at>>
> Message-ID:
> 	<CAB6mnxJOON2tn2-YP8wsBwyqRhNOQEe715Tek6rGGOVTuazyFw <at>>
> Content-Type: text/plain; charset="utf-8"
> Hi All,
> It looks like numpy dot only uses BLAS if ATLAS is present, see
> numpy/core/ Has anyone done the mods needed to use OpenBLAS? What
> is the current status of using OpenBLAS with numpy?
> Also, I'm thinking of moving linalg/lapack_lite/blas_lite.c down into the
> core. Thoughts?
> Chuck
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
Julian Taylor | 9 Aug 14:38 2014

ANN: NumPy 1.8.2 bugfix release


I am pleased to announce the release of NumPy 1.8.2, a
pure bugfix release for the 1.8.x series.
The upgrade is recommended for all users of the 1.8.x series.

The following issues have been fixed:
* gh-4836: partition produces wrong results for multiple selections in
equal ranges
* gh-4656: Make fftpack._raw_fft threadsafe
* gh-4628: incorrect argument order to _copyto in np.nanmax, np.nanmin
* gh-4642: Hold GIL for converting dtypes types with fields
* gh-4733: fix np.linalg.svd(b, compute_uv=False)
* gh-4853: avoid unaligned simd load on reductions on i386
* gh-4722: Fix seg fault converting empty string to object
* gh-4613: Fix lack of NULL check in array_richcompare
* gh-4774: avoid unaligned access for strided byteswap
* gh-650: Prevent division by zero when creating arrays from some buffers
* gh-4602: ifort has issues with optimization flag O2, use O1

The source distributions have been uploaded to PyPI. The Windows
installers, documentation and release notes can be found at:

Julian Taylor

Matthew Brett | 9 Aug 03:41 2014

Help - numpy / scipy binary compatibility


I would be very happy of some help trying to work out a numpy package
binary incompatibility.

I'm trying to work out what's happening for this ticket:

which I summarized at the end:

but basically, we're getting these errors:

RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility

I now realize I am lost in the world of numpy / scipy etc. binary
compatibility, and I'd really like some advice.  In this case:

numpy == 1.8.1
scipy == 0.14.0 - compiled against numpy 1.5.1
scikit-learn == 0.15.1 compiled against numpy 1.6.0

Can y'all see any potential problem with those dependencies in binary builds?

The relevant scipy Cython C files seem to guard against raising this
error by doing non-strict checks of, e.g., the numpy dtype, so I am
confused how these errors come about.  Can anyone give any pointers?