Ralf Gommers | 1 Jun 02:19
Gravatar

resetting set_string_function

Hi,

There seems to be a bug in set_string_function when resetting the formatting function to the default. After doing that the dtype of the array that is printed is the character string, not the numpy type. Example:

In [1]: a=arange(10, dtype=uint16)

In [2]: a
Out[2]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint16)

In [3]: set_string_function(lambda x: str(x*2))

In [4]: a
Out[4]: [ 0  2  4  6  8 10 12 14 16 18]

In [5]: set_string_function(None)   # reset to default

In [6]: a
Out[6]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 'H')


The functionality to reset to default was introduced here: http://projects.scipy.org/numpy/ticket/351
Should I open a trac ticket, or am I missing something here?

Cheers,
Ralf

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion <at> scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Thomas Robitaille | 1 Jun 02:39
Picon
Gravatar

Rasterizing points onto an array

Hi,

I have a set of n points with real coordinates between 0 and 1, given  
by two numpy arrays x and y, with a value at each point represented by  
a third array z. I am trying to then rasterize the points onto a grid  
of size npix*npix. So I can start by converting x and y to integer  
pixel coordinates ix and iy. But my question is, is there an efficient  
way to add z[i] to the pixel given by (xi[i],yi[i])? Below is what I  
am doing at the moment, but the for loop becomes very inefficient for  
large n. I would imagine that there is a way to do this without using  
a loop?

---

import numpy as np

n = 10000000
x = np.random.random(n)
y = np.random.random(n)
z = np.random.random(n)

npix = 100
ix = np.array(x*float(npix),int)
iy = np.array(y*float(npix),int)

image = np.zeros((npix,npix))
for i in range(len(ix)):
     image[ix[i],iy[i]] = image[ix[i],iy[i]] + z[i]

---

Thanks for any advice,

Thomas
David Cournapeau | 1 Jun 03:18
Picon
Picon

Re: Problem with correlate

Charles R Harris wrote:
>
>
> On Sun, May 31, 2009 at 11:54 AM, rob steed <rjsteed <at> talk21.com
> <mailto:rjsteed <at> talk21.com>> wrote:
>
>
>     Hi,
>     After my previous email, I have opened a ticket #1117 (correlate
>     not order dependent)
>
>     I have found that the correlate function is defined in
>     multiarraymodule.c and
>     that inputs are being swapped using the following code
>
>        n1 = ap1->dimensions[0];
>        n2 = ap2->dimensions[0];
>        if (n1 < n2) {
>            ret = ap1;
>            ap1 = ap2;
>            ap2 = ret;
>            ret = NULL;
>            i = n1;
>            n1 = n2;
>            n2 = i;
>        }
>
>     I do not know the code well enough to see whether this could just
>     be removed (I don't know c either).
>     Maybe the algorithmn requires the inputs to be length ordered? I
>     will try to work it out.
>
>
> If the correlation algorithm doesn't use an fft and is done
> explicitly, then the maximum overlap for any shift is the length of
> the shortest input.  Swapping the arrays makes that logic easier to
> implement, but it isn't necessary.

But this logic is also wrong if the swapping is not taken into account -
as the OP mentioned, correlate(a, b) is not equal to correlate(b, a) in
the general case. The output is reversed in the second case compared to
the first case.

cheers,

David
Charles R Harris | 1 Jun 03:45
Picon

Re: Problem with correlate



On Sun, May 31, 2009 at 7:18 PM, David Cournapeau <david <at> ar.media.kyoto-u.ac.jp> wrote:
Charles R Harris wrote:
>
>
> On Sun, May 31, 2009 at 11:54 AM, rob steed <rjsteed <at> talk21.com
> <mailto:rjsteed <at> talk21.com>> wrote:
>
>
>     Hi,
>     After my previous email, I have opened a ticket #1117 (correlate
>     not order dependent)
>
>     I have found that the correlate function is defined in
>     multiarraymodule.c and
>     that inputs are being swapped using the following code
>
>        n1 = ap1->dimensions[0];
>        n2 = ap2->dimensions[0];
>        if (n1 < n2) {
>            ret = ap1;
>            ap1 = ap2;
>            ap2 = ret;
>            ret = NULL;
>            i = n1;
>            n1 = n2;
>            n2 = i;
>        }
>
>     I do not know the code well enough to see whether this could just
>     be removed (I don't know c either).
>     Maybe the algorithmn requires the inputs to be length ordered? I
>     will try to work it out.
>
>
> If the correlation algorithm doesn't use an fft and is done
> explicitly, then the maximum overlap for any shift is the length of
> the shortest input.  Swapping the arrays makes that logic easier to
> implement, but it isn't necessary.

But this logic is also wrong if the swapping is not taken into account -
as the OP mentioned, correlate(a, b) is not equal to correlate(b, a) in
the general case. The output is reversed in the second case compared to
the first case.

I didn't say it was *correctly* implemented ;)

Chuck


_______________________________________________
Numpy-discussion mailing list
Numpy-discussion <at> scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
David Cournapeau | 1 Jun 05:08
Picon
Picon

Re: Problem with correlate

Charles R Harris wrote:
>
>
> On Sun, May 31, 2009 at 7:18 PM, David Cournapeau
> <david <at> ar.media.kyoto-u.ac.jp <mailto:david <at> ar.media.kyoto-u.ac.jp>>
> wrote:
>
>     Charles R Harris wrote:
>     >
>     >
>     > On Sun, May 31, 2009 at 11:54 AM, rob steed <rjsteed <at> talk21.com
>     <mailto:rjsteed <at> talk21.com>
>     > <mailto:rjsteed <at> talk21.com <mailto:rjsteed <at> talk21.com>>> wrote:
>     >
>     >
>     >     Hi,
>     >     After my previous email, I have opened a ticket #1117 (correlate
>     >     not order dependent)
>     >
>     >     I have found that the correlate function is defined in
>     >     multiarraymodule.c and
>     >     that inputs are being swapped using the following code
>     >
>     >        n1 = ap1->dimensions[0];
>     >        n2 = ap2->dimensions[0];
>     >        if (n1 < n2) {
>     >            ret = ap1;
>     >            ap1 = ap2;
>     >            ap2 = ret;
>     >            ret = NULL;
>     >            i = n1;
>     >            n1 = n2;
>     >            n2 = i;
>     >        }
>     >
>     >     I do not know the code well enough to see whether this could
>     just
>     >     be removed (I don't know c either).
>     >     Maybe the algorithmn requires the inputs to be length ordered? I
>     >     will try to work it out.
>     >
>     >
>     > If the correlation algorithm doesn't use an fft and is done
>     > explicitly, then the maximum overlap for any shift is the length of
>     > the shortest input.  Swapping the arrays makes that logic easier to
>     > implement, but it isn't necessary.
>
>     But this logic is also wrong if the swapping is not taken into
>     account -
>     as the OP mentioned, correlate(a, b) is not equal to correlate(b,
>     a) in
>     the general case. The output is reversed in the second case
>     compared to
>     the first case.
>
>
> I didn't say it was *correctly* implemented ;)

:) So I gave it a shot

http://github.com/cournape/numpy/commits/fix_correlate

(It took me a while to realize that PyArray_ISFLEXIBLE returns false for
array object. Is this expected ? The documentation concerning copyswap
says that it is necessary for flexible arrays, but I think it is
necessary  for object arrays as well).

It still bothers me that correlate does not conjugate the second
argument for complex arrays...

cheers,

David
Nathan Bell | 1 Jun 06:26
Picon
Gravatar

Re: Rasterizing points onto an array

On Sun, May 31, 2009 at 8:39 PM, Thomas Robitaille
<thomas.robitaille <at> gmail.com> wrote:
> Hi,
>
> I have a set of n points with real coordinates between 0 and 1, given
> by two numpy arrays x and y, with a value at each point represented by
> a third array z. I am trying to then rasterize the points onto a grid
> of size npix*npix. So I can start by converting x and y to integer
> pixel coordinates ix and iy. But my question is, is there an efficient
> way to add z[i] to the pixel given by (xi[i],yi[i])? Below is what I
> am doing at the moment, but the for loop becomes very inefficient for
> large n. I would imagine that there is a way to do this without using
> a loop?
>
> ---
>
> import numpy as np
>
> n = 10000000
> x = np.random.random(n)
> y = np.random.random(n)
> z = np.random.random(n)
>
> npix = 100
> ix = np.array(x*float(npix),int)
> iy = np.array(y*float(npix),int)
>
> image = np.zeros((npix,npix))
> for i in range(len(ix)):
>     image[ix[i],iy[i]] = image[ix[i],iy[i]] + z[i]
>

There might be a better way, but histogram2d() seems like a good fit:

import numpy as np

n = 1000000
x = np.random.random(n)
y = np.random.random(n)
z = np.random.random(n)

npix = 100

bins = np.linspace(0, 1.0, npix + 1)
image = np.histogram2d(x, y, bins=bins, weights=z)[0]

--

-- 
Nathan Bell wnbell <at> gmail.com
http://www.wnbell.com/
Charles R Harris | 1 Jun 06:48
Picon

Re: Problem with correlate



On Sun, May 31, 2009 at 9:08 PM, David Cournapeau <david <at> ar.media.kyoto-u.ac.jp> wrote:
Charles R Harris wrote:
>
>
> On Sun, May 31, 2009 at 7:18 PM, David Cournapeau
> <david <at> ar.media.kyoto-u.ac.jp <mailto:david <at> ar.media.kyoto-u.ac.jp>>
> wrote:
>
>     Charles R Harris wrote:
>     >
>     >
>     > On Sun, May 31, 2009 at 11:54 AM, rob steed <rjsteed <at> talk21.com
>     <mailto:rjsteed <at> talk21.com>
>     > <mailto:rjsteed <at> talk21.com <mailto:rjsteed <at> talk21.com>>> wrote:
>     >
>     >
>     >     Hi,
>     >     After my previous email, I have opened a ticket #1117 (correlate
>     >     not order dependent)
>     >
>     >     I have found that the correlate function is defined in
>     >     multiarraymodule.c and
>     >     that inputs are being swapped using the following code
>     >
>     >        n1 = ap1->dimensions[0];
>     >        n2 = ap2->dimensions[0];
>     >        if (n1 < n2) {
>     >            ret = ap1;
>     >            ap1 = ap2;
>     >            ap2 = ret;
>     >            ret = NULL;
>     >            i = n1;
>     >            n1 = n2;
>     >            n2 = i;
>     >        }
>     >
>     >     I do not know the code well enough to see whether this could
>     just
>     >     be removed (I don't know c either).
>     >     Maybe the algorithmn requires the inputs to be length ordered? I
>     >     will try to work it out.
>     >
>     >
>     > If the correlation algorithm doesn't use an fft and is done
>     > explicitly, then the maximum overlap for any shift is the length of
>     > the shortest input.  Swapping the arrays makes that logic easier to
>     > implement, but it isn't necessary.
>
>     But this logic is also wrong if the swapping is not taken into
>     account -
>     as the OP mentioned, correlate(a, b) is not equal to correlate(b,
>     a) in
>     the general case. The output is reversed in the second case
>     compared to
>     the first case.
>
>
> I didn't say it was *correctly* implemented ;)

:) So I gave it a shot

http://github.com/cournape/numpy/commits/fix_correlate

(It took me a while to realize that PyArray_ISFLEXIBLE returns false for
array object. Is this expected ? The documentation concerning copyswap
says that it is necessary for flexible arrays, but I think it is
necessary  for object arrays as well).

Don't know. PyArray_ISFLEXIBLE looks like a macro... 

#define PyArray_ISFLEXIBLE(obj) PyTypeNum_ISFLEXIBLE(PyArray_TYPE(obj))

#define PyTypeNum_ISFLEXIBLE(type) (((type) >=NPY_STRING) &&  \
                                    ((type) <=NPY_VOID))

And the typecodes are '?bhilqpBHILQPfdgFDGSUVO'. So 'SUV' are flexible and O is not. I'm not clear on how correlate should apply to any of 'SUV' but it might be worth having it work for objects.


It still bothers me that correlate does not conjugate the second
argument for complex arrays...

It bothers me also... Chuck


_______________________________________________
Numpy-discussion mailing list
Numpy-discussion <at> scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
David Cournapeau | 1 Jun 07:05
Picon
Picon

Re: Problem with correlate

Charles R Harris wrote:
>
>
> On Sun, May 31, 2009 at 9:08 PM, David Cournapeau
> <david <at> ar.media.kyoto-u.ac.jp <mailto:david <at> ar.media.kyoto-u.ac.jp>>
> wrote:
>
>     Charles R Harris wrote:
>     >
>     >
>     > On Sun, May 31, 2009 at 7:18 PM, David Cournapeau
>     > <david <at> ar.media.kyoto-u.ac.jp
>     <mailto:david <at> ar.media.kyoto-u.ac.jp>
>     <mailto:david <at> ar.media.kyoto-u.ac.jp
>     <mailto:david <at> ar.media.kyoto-u.ac.jp>>>
>     > wrote:
>     >
>     >     Charles R Harris wrote:
>     >     >
>     >     >
>     >     > On Sun, May 31, 2009 at 11:54 AM, rob steed
>     <rjsteed <at> talk21.com <mailto:rjsteed <at> talk21.com>
>     >     <mailto:rjsteed <at> talk21.com <mailto:rjsteed <at> talk21.com>>
>     >     > <mailto:rjsteed <at> talk21.com <mailto:rjsteed <at> talk21.com>
>     <mailto:rjsteed <at> talk21.com <mailto:rjsteed <at> talk21.com>>>> wrote:
>     >     >
>     >     >
>     >     >     Hi,
>     >     >     After my previous email, I have opened a ticket #1117
>     (correlate
>     >     >     not order dependent)
>     >     >
>     >     >     I have found that the correlate function is defined in
>     >     >     multiarraymodule.c and
>     >     >     that inputs are being swapped using the following code
>     >     >
>     >     >        n1 = ap1->dimensions[0];
>     >     >        n2 = ap2->dimensions[0];
>     >     >        if (n1 < n2) {
>     >     >            ret = ap1;
>     >     >            ap1 = ap2;
>     >     >            ap2 = ret;
>     >     >            ret = NULL;
>     >     >            i = n1;
>     >     >            n1 = n2;
>     >     >            n2 = i;
>     >     >        }
>     >     >
>     >     >     I do not know the code well enough to see whether this
>     could
>     >     just
>     >     >     be removed (I don't know c either).
>     >     >     Maybe the algorithmn requires the inputs to be length
>     ordered? I
>     >     >     will try to work it out.
>     >     >
>     >     >
>     >     > If the correlation algorithm doesn't use an fft and is done
>     >     > explicitly, then the maximum overlap for any shift is the
>     length of
>     >     > the shortest input.  Swapping the arrays makes that logic
>     easier to
>     >     > implement, but it isn't necessary.
>     >
>     >     But this logic is also wrong if the swapping is not taken into
>     >     account -
>     >     as the OP mentioned, correlate(a, b) is not equal to
>     correlate(b,
>     >     a) in
>     >     the general case. The output is reversed in the second case
>     >     compared to
>     >     the first case.
>     >
>     >
>     > I didn't say it was *correctly* implemented ;)
>
>     :) So I gave it a shot
>
>     http://github.com/cournape/numpy/commits/fix_correlate
>
>     (It took me a while to realize that PyArray_ISFLEXIBLE returns
>     false for
>     array object. Is this expected ? The documentation concerning copyswap
>     says that it is necessary for flexible arrays, but I think it is
>     necessary  for object arrays as well).
>
>
> Don't know. PyArray_ISFLEXIBLE looks like a macro... 
>
> #define PyArray_ISFLEXIBLE(obj) PyTypeNum_ISFLEXIBLE(PyArray_TYPE(obj))
>
> #define PyTypeNum_ISFLEXIBLE(type) (((type) >=NPY_STRING) &&  \
>                                     ((type) <=NPY_VOID))
>
> And the typecodes are '?bhilqpBHILQPfdgFDGSUVO'. So 'SUV' are flexible
> and O is not.

I re-read the copyswap documentation, and realized I did not read it
correctly. Now, I am not sure when to use copyswap vs memcpy (memcpy
should be much faster on basic types, as memcpy should be inlined
generally, whereas I doubt copyswap can).

> I'm not clear on how correlate should apply to any of 'SUV' but it
> might be worth having it work for objects.

It already does (I added a couple of unit tests in the branch, since
there were no test for correlate, and one is for Decimal object arrays).

>
>
>     It still bothers me that correlate does not conjugate the second
>     argument for complex arrays...
>
>
> It bothers me also...

I think we should just fix it to use conjugate - I will do this in the
branch, and I will integrate it in the trunk later unless someone stands
up vehemently against the change. I opened up a ticket to track this,
though,

cheers,

David
Francesc Alted | 1 Jun 18:22

Single precision equivalents of missing C99 functions

Hi,

In the process of adding single precision support to Numexpr, I'm 
experimenting a divergence between Numexpr and NumPy computations.  It all 
boils down to the fact that my implementation defined single precision 
functions completely.  As for one, consider my version of expm1f:

inline static float expm1f(float x)
{
    float u = expf(x);
    if (u == 1.0) {
        return x;
    } else if (u-1.0 == -1.0) {
        return -1;
    } else {
        return (u-1.0) * x/logf(u);
    }
}

while NumPy seems to declare expm1f as:

static float expm1f(float x)
{
    return (float) expm1((double)x);
}

This leads to different results on Windows when computing expm1(x) for large 
values of x (like 99.), where my approach returns a 'nan', while NumPy returns 
an 'inf'.  Curiously, on Linux both approaches returns 'inf'.

I suppose that the NumPy crew already experimented this divergence and finally 
used the cast approach for computing the single precision functions.  However, 
this is effectively preventing the use of optimized functions for single 
precision (i.e. double precision 'exp' and 'log' are used instead of single 
precision specific 'expf' and 'logf'), which could perform potentially better.  
So, I'm wondering if it would not be better to use a native implementation 
instead.  Thoughts?

Thanks,

--

-- 
Francesc Alted
Robert Kern | 1 Jun 18:35
Picon
Gravatar

Re: Problem with correlate

On Mon, Jun 1, 2009 at 00:05, David Cournapeau
<david <at> ar.media.kyoto-u.ac.jp> wrote:

> I think we should just fix it to use conjugate - I will do this in the
> branch, and I will integrate it in the trunk later unless someone stands
> up vehemently against the change. I opened up a ticket to track this,
> though,

It breaks everyone's code that works around the current behavior.

--

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

Gmane