Allan Haldane | 15 Jan 00:08 2015

proposed change to recarray access

Hello all,

I've submitted a pull request on github which changes how string values
in recarrays are returned, which may break old code.

https://github.com/numpy/numpy/pull/5454
See also: https://github.com/numpy/numpy/issues/3993

Previously, recarray fields of type 'S' or 'U' (i.e., strings) would be
returned as chararrays when accessed by attribute, but as ndarrays when
accessed by indexing:

    >>> arr = np.array([('abc ', 1), ('abc', 2)],
                       dtype=[('str', 'S4'), ('id', int)])
    >>> arr = arr.view(np.recarray)
    >>> type(arr.str)
        numpy.core.defchararray.chararray
    >>> type(arr['str'])
        numpy.ndarray

Chararray is deprecated, and furthermore this led to bugs in my code,
since chararrays trim trailing whitespace but ndarrays do not (and I
was not aware of the conversion to chararray). For example:

    >>> arr.str[0] == arr.str[1]
    True
    >>> arr['str'][0] == arr['str'][1]
    False

In the pull request I have changed recarray attribute access so that
ndarrays are returned instead of chararrays. […]

Jaime Fernández del Río | 13 Jan 16:15 2015

linspace handling of extra return

While working on something else, I realized that linspace is not handling requests for returning the sampling spacing consistently:

>>> np.linspace(0, 1, 3, retstep=True)
(array([ 0. ,  0.5,  1. ]), 0.5)
>>> np.linspace(0, 1, 1, retstep=True)
array([ 0.])
>>> np.linspace(0, 1, 0, retstep=True)
array([], dtype=float64)

Basically, retstep is ignored if the number of samples is 0 or 1. One could argue that it makes sense, because those sequences do not have a spacing defined. But at the very least it should be documented as doing so, and the following inconsistency removed:

>>> np.linspace(0, 1, 1, endpoint=True, retstep=True)
array([ 0.])
>>> np.linspace(0, 1, 1, endpoint=False, retstep=True)
(array([ 0.]), 1.0)

I am personally inclined to think that if a step is requested, then a step should be returned, and if it cannot be calculated in a reasonable manner, then a placeholder such as None, nan, 0 or stop - start should be returned.
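A hypothetical wrapper sketching that option (the name `linspace_with_step` and the `nan` placeholder are illustrative choices on my part, not an agreed API):

```python
import math
import numpy as np

def linspace_with_step(start, stop, num, endpoint=True):
    # Sketch of the proposal: always return (samples, step), with a
    # placeholder (here nan) when the spacing is not defined.
    samples = np.linspace(start, stop, num, endpoint=endpoint)
    if num > 1:
        div = (num - 1) if endpoint else num
        step = (stop - start) / div
    else:
        step = math.nan  # 0 or 1 samples: no spacing defined
    return samples, step
```

With this, `linspace_with_step(0, 1, 3)` gives the usual `(array, 0.5)`, while the 0- and 1-sample cases still return a two-element result instead of silently dropping the step.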

What does the collective wisdom think is the best approach for this?

Jaime

--
(\__/)
( O.o)
( > <) This is Rabbit. Copy Rabbit into your signature and help him with his plans for world domination.
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion <at> scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Alexander Belopolsky | 12 Jan 19:33 2015

Equality of dtypes does not imply equality of type kinds

Consider this (on a 64-bit platform):

>>> numpy.dtype('q') == numpy.dtype('l')
True

but

>>> numpy.dtype('q').char == numpy.dtype('l').char
False

Is that intended?  Shouldn't the dtype constructor "normalize" 'l' to 'q' (or 'i')?
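For context, whether 'q' (C long long) and 'l' (C long) compare equal depends on the platform's long size, so any check has to be phrased relative to itemsize rather than a fixed outcome; a small sketch:

```python
import numpy as np

dq = np.dtype('q')  # C long long
dl = np.dtype('l')  # C long

# dtype equality treats same-kind, same-size integer types as equivalent,
# so on platforms where long is 8 bytes these compare equal even though
# their .char attributes still differ.
equal = (dq == dl)
same_size = (dq.itemsize == dl.itemsize)
```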
Pauli Virtanen | 11 Jan 18:50 2015

ANN: Scipy 0.15.0 release


Dear all,

We are pleased to announce the Scipy 0.15.0 release.

The 0.15.0 release contains bugfixes and new features, most important
of which are mentioned in the excerpt from the release notes below.

Source tarballs, binaries, and full release notes are available at
https://sourceforge.net/projects/scipy/files/scipy/0.15.0/

Best regards,
Pauli Virtanen

==========================
SciPy 0.15.0 Release Notes
==========================

SciPy 0.15.0 is the culmination of 6 months of hard work. It contains
several new features, numerous bug-fixes, improved test coverage and
better documentation.  There have been a number of deprecations and
API changes in this release, which are documented below.  All users
are encouraged to upgrade to this release, as there are a large number
of bug-fixes and optimizations.  Moreover, our development attention
will now shift to bug-fix releases on the 0.16.x branch, and on adding
new features on the master branch.

This release requires Python 2.6, 2.7 or 3.2-3.4 and NumPy 1.5.1 or
greater.

New features
============

Linear Programming Interface
----------------------------

The new function `scipy.optimize.linprog` provides a generic
linear programming interface, similar to the way `scipy.optimize.minimize`
provides a generic interface to nonlinear programming optimizers.
Currently the only supported method is *simplex*, a two-phase,
dense-matrix-based simplex algorithm. Callback functions are
supported, allowing the user to monitor the progress of the algorithm.

Differential evolution, a global optimizer
------------------------------------------

A new `scipy.optimize.differential_evolution` function has been added to
the ``optimize`` module.  Differential Evolution is an algorithm used for
finding the global minimum of multivariate functions. It is stochastic
in nature (does not use gradient methods), and can search large areas of
candidate space, but often requires larger numbers of function
evaluations than conventional gradient-based techniques.

``scipy.signal`` improvements
-----------------------------

The function `scipy.signal.max_len_seq` was added, which computes a
Maximum Length Sequence (MLS) signal.

``scipy.integrate`` improvements
--------------------------------

It is now possible to use `scipy.integrate` routines to integrate
multivariate ctypes functions, thus avoiding callbacks to Python and
providing better performance.

``scipy.linalg`` improvements
-----------------------------

The function `scipy.linalg.orthogonal_procrustes` for solving the
orthogonal Procrustes linear algebra problem was added.

BLAS level 2 functions ``her``, ``syr``, ``her2`` and ``syr2`` are now
wrapped in ``scipy.linalg``.

``scipy.sparse`` improvements
-----------------------------

`scipy.sparse.linalg.svds` can now take a ``LinearOperator`` as its
main input.

``scipy.special`` improvements
------------------------------

Values of ellipsoidal harmonic (i.e. Lamé) functions and associated
normalization constants can now be computed using ``ellip_harm``,
``ellip_harm_2``, and ``ellip_normal``.

New convenience functions ``entr``, ``rel_entr``, ``kl_div``,
``huber``, and ``pseudo_huber`` were added.

``scipy.sparse.csgraph`` improvements
-------------------------------------

Routines ``reverse_cuthill_mckee`` and ``maximum_bipartite_matching``
for computing reorderings of sparse graphs were added.

``scipy.stats`` improvements
----------------------------

Added a Dirichlet multivariate distribution, `scipy.stats.dirichlet`.

The new function `scipy.stats.median_test` computes Mood's median test.

The new function `scipy.stats.combine_pvalues` implements Fisher's
and Stouffer's methods for combining p-values.

`scipy.stats.describe` returns a namedtuple rather than a tuple, allowing
users to access results by index or by name.

Deprecated features
===================

The `scipy.weave` module is deprecated.  It was the only module never
ported to Python 3.x, and is not recommended for new code - use Cython
instead.  In order to support existing code, ``scipy.weave`` has been
packaged separately: https://github.com/scipy/weave.  It is a pure
Python package, and can easily be installed with ``pip install weave``.

`scipy.special.bessel_diff_formula` is deprecated.  It is a private
function, and therefore will be removed from the public API in a
following release.

``scipy.stats.nanmean``, ``nanmedian`` and ``nanstd`` functions are
deprecated in favor of their numpy equivalents.

Backwards incompatible changes
==============================

scipy.ndimage
-------------

The functions `scipy.ndimage.minimum_positions`,
`scipy.ndimage.maximum_positions` and `scipy.ndimage.extrema` now
return positions as ints instead of floats.

scipy.integrate
---------------

The format of banded Jacobians in `scipy.integrate.ode` solvers is
changed. Note that the previous documentation of this feature was
erroneous.
Nikolay Mayorov | 8 Jan 19:31 2015

Build doesn't pass tests

Hi all! 

I'm trying to build numpy on Windows 64 bit, Python 3.4.2 64 bit.  

I do environment setup by the following command:

CMD /K "SET MSSdk=1 && SET DISTUTILS_USE_SDK=1 && "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" x86_amd64"

Then I cd to the newly cloned numpy folder and do: python setup.py build_ext --inplace 

It looks like the build process finishes correctly. 

But then python -c "import numpy; numpy.test()" crashes the interpreter (some tests pass before the crash). I found out that the crash is caused by a call to numpy.fromfile.

What might be the reason for that? Am I using the wrong MSVC compiler?
Colin J. Williams | 5 Jan 19:40 2015

Characteristic of a Matrix.

One of the essential characteristics of a matrix is that it be rectangular.

This is neither spelled out nor checked currently.

The Doc description refers to a class:
  • Returns a matrix from an array-like object, or from a string of data. A matrix is a specialized 2-D array that retains its 2-D nature through operations. It has certain special operators, such as * (matrix multiplication) and ** (matrix power).
  • This illustrates a failure, which is reported later in the calculation:

    A2= np.matrix([[1, 2, -2], [-3, -1, 4], [4, 2 -6]])

    Here 2 - 6 is treated as an expression. 
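A minimal, numpy-free illustration of how the missing comma silently changes the row: `2 -6` is parsed as the expression `2 - 6`, so the third row ends up one element short rather than failing at the point of the typo.

```python
# The intended rows, and the rows as Python actually parses them with
# the missing comma: '2 -6' becomes the single value 2 - 6 == -4.
intended = [[1, 2, -2], [-3, -1, 4], [4, 2, -6]]
as_typed = [[1, 2, -2], [-3, -1, 4], [4, 2 - 6]]

row_lengths = [len(row) for row in as_typed]
# The ragged shape [3, 3, 2] is what an up-front rectangularity check
# could detect, instead of the failure surfacing later in the calculation.
```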

    Wikipedia offers:

    In mathematics, a matrix (plural matrices) is a rectangular array[1] of numbers, symbols, or expressions, arranged in rows and columns.[2][3] The individual items in a matrix are called its elements or entries. An example of a matrix with 2 rows and 3 columns is

    In the Numpy context, the symbols or expressions need to be evaluable.

    Colin W.




    Antony Lee | 5 Jan 08:34 2015

    edge-cases of ellipsis indexing

    While trying to reproduce various fancy indexings for astropy's FITS sections (a loaded-on-demand array), I found the following interesting behavior:

    >>> np.array([1])[..., 0]
    array(1)
    >>> np.array([1])[0]
    1
    >>> np.array([1])[(0,)]
    1

    The docs say "Ellipsis expand to the number of : objects needed to make a selection tuple of the same length as x.ndim.", so it's not totally clear to me how to explain that difference in the results.
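For what it's worth, the difference can be pinned down by type: when the index contains an Ellipsis the result stays a (0-d) array, while the plain integer index returns a scalar. A minimal check of the observed behavior (not an explanation of the intent):

```python
import numpy as np

a = np.array([1])

with_ellipsis = a[..., 0]  # index containing an Ellipsis: 0-d array
plain = a[0]               # "full" integer index: numpy scalar
tupled = a[(0,)]           # equivalent to a[0]
```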

    Antony
    Valentin Haenel | 4 Jan 21:59 2015

    [ANN] bcolz 0.7.3

    
    ======================
    Announcing bcolz 0.7.3
    ======================
    
    What's new
    ==========
    
    This release includes support for pickling persistent carray/ctable
    objects, contributed by Matthew Rocklin. Also, the included version of
    Blosc is updated to ``v1.5.2``. Lastly, several minor issues and typos
    have been fixed; please see the release notes for details.
    
    ``bcolz`` is a renaming of the ``carray`` project.  The new goals for
    the project are to create simple, yet flexible compressed containers,
    that can live either on-disk or in-memory, and with some
    high-performance iterators (like `iter()`, `where()`) for querying them.
    
    Together, bcolz and the Blosc compressor are finally fulfilling the
    promise of accelerating memory I/O, at least for some real scenarios:
    
    http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots
    
    For more detailed info, see the release notes in:
    https://github.com/Blosc/bcolz/wiki/Release-Notes
    
    What it is
    ==========
    
    bcolz provides columnar and compressed data containers.  Column storage
    allows for efficiently querying tables with a large number of columns.
    It also allows for cheap addition and removal of columns.  In addition,
    bcolz objects are compressed by default for reducing memory/disk I/O
    needs.  The compression process is carried out internally by Blosc, a
    high-performance compressor that is optimized for binary data.
    
    bcolz can use numexpr internally so as to accelerate many vector and
    query operations (although it can use pure NumPy for doing so too).
    numexpr optimizes the memory usage and uses several cores for doing the
    computations, so it is blazing fast.  Moreover, the carray/ctable
    containers can be disk-based, and it is possible to use them for
    seamlessly performing out-of-core computations.
    
    bcolz has minimal dependencies (NumPy), comes with an exhaustive test
    suite and fully supports both 32-bit and 64-bit platforms.  Also, it is
    typically tested on both UNIX and Windows operating systems.
    
    Installing
    ==========
    
    bcolz is in the PyPI repository, so installing it is easy::
    
        $ pip install -U bcolz
    
    Resources
    =========
    
    Visit the main bcolz site repository at:
    http://github.com/Blosc/bcolz
    
    Manual:
    http://bcolz.blosc.org
    
    Home of Blosc compressor:
    http://blosc.org
    
    User's mail list:
    bcolz <at> googlegroups.com
    http://groups.google.com/group/bcolz
    
    License is the new BSD:
    https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt
    
    ----
    
      **Enjoy data!**
    
    Maniteja Nandana | 3 Jan 22:44 2015

    Regarding np.ma.masked_equal behavior

    Hello friends,

    This is an issue related to the behavior of the masked_equal function. I was wondering whether anyone familiar with an old ticket, #1851, regarding the effect of masked_equal on fill_value, could clarify the situation, since right now the documentation and the implementation conflict. An issue has been raised about this: #5408.

    Cheers,
    N.Maniteja
    Sturla Molden | 3 Jan 19:15 2015

    Correct C string handling in the NumPy C API?

    Here is an example:
    
    NPY_NO_EXPORT NpyIter_IterNextFunc *
    NpyIter_GetIterNext(NpyIter *iter, char **errmsg)
    {
         npy_uint32 itflags = NIT_ITFLAGS(iter);
         int ndim = NIT_NDIM(iter);
         int nop = NIT_NOP(iter);
    
         if (NIT_ITERSIZE(iter) < 0) {
             if (errmsg == NULL) {
                 PyErr_SetString(PyExc_ValueError, "iterator is too large");
             }
             else {
                 *errmsg = "iterator is too large";
             }
             return NULL;
         }
    
    After NpyIter_GetIterNext returns, *errmsg points to a local variable of
    a function that has already returned.
    
    Either I am wrong about C, or this code has undefined behavior...
    
    My gut feeling is that
    
        *errmsg = "iterator is too large";
    
    puts the string "iterator is too large" on the stack and points *errmsg 
    to the string.
    
    Shouldn't this really be
    
        strcpy(*errmsg, "iterator is too large");
    
    and then *errmsg should point to a char buffer allocated before 
    NpyIter_GetIterNext is called?
    
    Or will the statement
    
        *errmsg = "iterator is too large";
    
    put the string on the stack in the calling C function?
    
    Before I open an issue I will ask if my understanding of C is correct or 
    not.
    
    I am a bit confused here...
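For reference, a minimal standalone sketch of the pattern in question (the function name `fail_with_msg` is made up; this just reproduces the assignment of a string literal through a `char **` out-parameter so the lifetime question can be exercised in isolation):

```c
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Hypothetical reduction of the error path above: report failure and
 * hand back an error message through a char ** out-parameter. */
static int fail_with_msg(char **errmsg)
{
    if (errmsg != NULL) {
        /* This stores a pointer to the string literal itself. String
         * literals in C have static storage duration, so the pointer
         * written here remains valid after this function returns. */
        *errmsg = "iterator is too large";
    }
    return -1;
}
```

(The strcpy variant discussed above would instead require the caller to own a writable buffer of sufficient size.)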
    
    Regards,
    Sturla
    
    Daniel Smith | 3 Jan 03:45 2015

    Optimizing multi-tensor contractions in numpy

    Hello everyone,

    I have been working on a chunk of code that sets out to provide a single function which can take an arbitrary einsum expression and compute it in the most efficient way. While np.einsum can evaluate arbitrary expressions, pure einsum has two drawbacks: it does not consider building intermediate arrays for possible reductions in overall rank, and it is not currently capable of using a vendor BLAS. I have been working on a project that aims to solve both issues simultaneously:


    This program first builds the optimal way to contract the tensors together or, using my own nomenclature, a “path.” This path is then iterated over, using tensordot when possible and einsum for everything else. In test cases the worst-case scenario adds a 20 microsecond overhead penalty and, in the best-case scenario, it can reduce the overall rank of the tensor. The primary (if somewhat exaggerated) example is a 5-term N^8 index transformation that can be reduced to N^5; even when N is very small (N=10) there is a 2,000-fold speed increase over pure einsum or, if using tensordot, a 2,400-fold speed increase. This is somewhat similar to the new np.linalg.multi_dot function.
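The idea can be sketched in a few lines with a hand-built two-step "path" for one hypothetical contraction, checked against a single einsum call (the real code chooses the contraction order automatically):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c = (rng.random((10, 10)) for _ in range(3))

# One-shot evaluation of the whole expression with einsum.
direct = np.einsum('ij,jk,kl->il', a, b, c)

# "Path"-style evaluation: contract pairwise, so each step is a plain
# matrix product that tensordot can hand off to BLAS.
ab = np.tensordot(a, b, axes=([1], [0]))        # 'ij,jk->ik'
stepped = np.tensordot(ab, c, axes=([1], [0]))  # 'ik,kl->il'
```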

    If you are interested in this function please head over to the github repo and check it out. I believe the README is starting to become self-explanatory, but feel free to email me with any questions. 

    This originally started because I was looking into using numpy to rapidly prototype quantum chemistry codes; the results can be found here:

    As such, I am very interested in implementing this into numpy. While I think opt_einsum is in a pretty good place, there is still quite a bit to do (see outstanding issues in the README). Even if this is not something that would fit into numpy I would still be very interested in your comments.

    Thank you for your time,
    -Daniel Smith