Robert Kern | 1 Jun 02:31 2011

Re: what does "in" do with numpy arrays?

On Tue, May 31, 2011 at 11:25, Christopher Barker <Chris.Barker <at> noaa.gov> wrote:
> Hi folks,
>
> I've re-titled this thread, as it's about a new question, now:
>
> What does:
>
> something in a_numpy_array
>
> mean? i.e. how has __contains__ been defined?
>
> A couple of us have played with it, and can't make sense of it:
>
>> In [24]: a
>> Out[24]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
>>
>> In [25]: 3 in a
>> Out[25]: True
>>
>> So the simple case works just like a list. But what if I look for an array in another array?
>
>> In [26]: b
>> Out[26]: array([3, 6, 4])
>>
>> In [27]: b in a
>> Out[27]: False
>>
>> OK, so the full b array is not in a, and it doesn't "vectorize" it,
>> either. But:
>>
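
A minimal sketch of the behaviour being asked about, assuming that `ndarray.__contains__` evaluates `(a == x).any()`: it reports whether any element compares equal after broadcasting, not whether `x` occurs as a row or subsequence. The helper `contains_row` is hypothetical and only illustrates what a row-wise test could look like.

```python
import numpy as np

a = np.arange(12)
m = np.array([[1, 5],
              [7, 2]])

# Scalar membership: true if any element of a equals 3.
scalar_hit = 3 in a

# [1, 2] is not a row of m, yet `in` is true, because the broadcast
# comparison m == [1, 2] matches in at least one position.
elementwise_hit = [1, 2] in m
same_as_any = bool((m == [1, 2]).any())


def contains_row(arr, row):
    """Hypothetical helper: true only if `row` equals a whole row of 2-D `arr`."""
    return bool((arr == np.asarray(row)).all(axis=1).any())
```

With these definitions, `contains_row(m, [1, 2])` is false while `[1, 2] in m` is true, which is the distinction the question turns on.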

Charles R Harris | 1 Jun 03:08 2011

New functions.

Hi All,

I've been contemplating new functions that could be added to numpy and thought I'd run them by folks to see if there is any interest.

1) Modified sort/argsort functions that return the maximum k values.
    This is easy to do with heapsort and almost as easy with mergesort.

2) Ufunc fadd (nanadd?) Treats nan as zero in addition. Should make a faster version of nansum possible.

3) Fast medians.


Chuck
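
A rough sketch of proposals 1) and 2), written against functions that exist in later NumPy releases rather than the heapsort/mergesort or ufunc implementations suggested in the post. It assumes `np.argpartition` is available; the names `top_k` and `nan_add` are hypothetical stand-ins for the proposed functions.

```python
import numpy as np


def top_k(values, k):
    """Return the k largest values, largest first (hypothetical helper)."""
    values = np.asarray(values)
    # argpartition moves the k largest entries into the last k slots,
    # in no particular order, without sorting the whole array.
    idx = np.argpartition(values, -k)[-k:]
    # Sort only those k entries, then reverse for descending order.
    order = np.argsort(values[idx])[::-1]
    return values[idx[order]]


def nan_add(x, y):
    """Add two arrays, treating NaN as zero (stand-in for the proposed 'fadd')."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.where(np.isnan(x), 0.0, x) + np.where(np.isnan(y), 0.0, y)
```

`top_k` requires `1 <= k <= len(values)`. `nan_add` builds temporaries, so it only shows the intended semantics and not the speed a dedicated ufunc might offer.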

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion <at> scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Robert Kern | 1 Jun 03:15 2011

Re: New functions.

On Tue, May 31, 2011 at 20:08, Charles R Harris
<charlesr.harris <at> gmail.com> wrote:
> Hi All,
>
> I've been contemplating new functions that could be added to numpy and
> thought I'd run them by folks to see if there is any interest.
>
> 1) Modified sort/argsort functions that return the maximum k values.
>     This is easy to do with heapsort and almost as easy with mergesort.
>
> 2) Ufunc fadd (nanadd?) Treats nan as zero in addition. Should make a faster
> version of nansum possible.
>
> 3) Fast medians.

+3

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
Warren Weckesser | 1 Jun 03:18 2011

Re: New functions.



On Tue, May 31, 2011 at 8:08 PM, Charles R Harris <charlesr.harris <at> gmail.com> wrote:
> Hi All,
>
> I've been contemplating new functions that could be added to numpy and thought I'd run them by folks to see if there is any interest.
>
> 1) Modified sort/argsort functions that return the maximum k values.
>     This is easy to do with heapsort and almost as easy with mergesort.

While you're at it, how about a function that finds both the max and min in one pass?  (Mentioned previously in this thread: http://mail.scipy.org/pipermail/numpy-discussion/2010-June/051072.html)

> 2) Ufunc fadd (nanadd?) Treats nan as zero in addition. Should make a faster version of nansum possible.
>
> 3) Fast medians.
>
> Chuck

Charles R Harris | 1 Jun 03:26 2011

Re: New functions.



On Tue, May 31, 2011 at 7:18 PM, Warren Weckesser <warren.weckesser <at> enthought.com> wrote:
> On Tue, May 31, 2011 at 8:08 PM, Charles R Harris <charlesr.harris <at> gmail.com> wrote:
>> Hi All,
>>
>> I've been contemplating new functions that could be added to numpy and thought I'd run them by folks to see if there is any interest.
>>
>> 1) Modified sort/argsort functions that return the maximum k values.
>>     This is easy to do with heapsort and almost as easy with mergesort.
>
> While you're at it, how about a function that finds both the max and min in one pass?  (Mentioned previously in this thread: http://mail.scipy.org/pipermail/numpy-discussion/2010-June/051072.html)

What should it be called? minmax?

This also suggests a function that returns both mean and standard deviation, something that could also be implemented using a more stable algorithm than the current one.

>> 2) Ufunc fadd (nanadd?) Treats nan as zero in addition. Should make a faster version of nansum possible.
>>
>> 3) Fast medians.

Other suggestions are welcome. Most of these are of the low-hanging-fruit variety and shouldn't be too much work.

Chuck
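
The single-pass min/max and the combined mean/standard deviation ideas can be sketched in plain Python as follows. This is an illustration only: the names `minmax` and `mean_std` are hypothetical, and Welford's online update is assumed here as one well-known numerically stable choice, not necessarily the algorithm the post has in mind.

```python
import math


def minmax(values):
    """Return (min, max) of an iterable in a single pass."""
    it = iter(values)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError("minmax() needs at least one value")
    for v in it:
        if v < lo:
            lo = v
        elif v > hi:
            hi = v
    return lo, hi


def mean_std(values, ddof=0):
    """Return (mean, standard deviation) in one pass using Welford's update."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for v in values:
        n += 1
        delta = v - mean
        mean += delta / n
        m2 += delta * (v - mean)
    if n - ddof <= 0:
        raise ValueError("not enough values for the requested ddof")
    return mean, math.sqrt(m2 / (n - ddof))
```

A compiled version would loop over the array buffer directly; the point of the sketch is only that each result needs one traversal of the data.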

Benjamin Root | 1 Jun 03:31 2011

Re: New functions.



On Tue, May 31, 2011 at 8:18 PM, Warren Weckesser <warren.weckesser <at> enthought.com> wrote:
> On Tue, May 31, 2011 at 8:08 PM, Charles R Harris <charlesr.harris <at> gmail.com> wrote:
>> Hi All,
>>
>> I've been contemplating new functions that could be added to numpy and thought I'd run them by folks to see if there is any interest.
>>
>> 1) Modified sort/argsort functions that return the maximum k values.
>>     This is easy to do with heapsort and almost as easy with mergesort.
>
> While you're at it, how about a function that finds both the max and min in one pass?  (Mentioned previously in this thread: http://mail.scipy.org/pipermail/numpy-discussion/2010-June/051072.html)



+1 from myself and probably just about anybody in matplotlib.  If both the maxs and mins are searched during the same run through an array, I would imagine that would result in a noticeable speedup with automatic range finding.

Ben Root
David | 1 Jun 03:33 2011

Re: New functions.

On 06/01/2011 10:08 AM, Charles R Harris wrote:
> Hi All,
>
> I've been contemplating new functions that could be added to numpy and
> thought I'd run them by folks to see if there is any interest.
>
> 1) Modified sort/argsort functions that return the maximum k values.
>      This is easy to do with heapsort and almost as easy with mergesort.
>
> 2) Ufunc fadd (nanadd?) Treats nan as zero in addition. Should make a
> faster version of nansum possible.
>
> 3) Fast medians.

+1 for fast median as well, and more generally fast "linear" (O(kN)) 
order statistics would be nice.

cheers,

David
Charles R Harris | 1 Jun 03:34 2011

Re: New functions.



On Tue, May 31, 2011 at 7:33 PM, David <david <at> silveregg.co.jp> wrote:
> On 06/01/2011 10:08 AM, Charles R Harris wrote:
>> Hi All,
>>
>> I've been contemplating new functions that could be added to numpy and
>> thought I'd run them by folks to see if there is any interest.
>>
>> 1) Modified sort/argsort functions that return the maximum k values.
>>      This is easy to do with heapsort and almost as easy with mergesort.
>>
>> 2) Ufunc fadd (nanadd?) Treats nan as zero in addition. Should make a
>> faster version of nansum possible.
>>
>> 3) Fast medians.
>
> +1 for fast median as well, and more generally fast "linear" (O(kN))
> order statistics would be nice.


OK, noob question. What are order statistics?

Chuck

Skipper Seabold | 1 Jun 03:36 2011

Re: New functions.

On Tue, May 31, 2011 at 9:31 PM, Benjamin Root <ben.root <at> ou.edu> wrote:
>
>
> On Tue, May 31, 2011 at 8:18 PM, Warren Weckesser
> <warren.weckesser <at> enthought.com> wrote:
>>
>>
>> On Tue, May 31, 2011 at 8:08 PM, Charles R Harris
>> <charlesr.harris <at> gmail.com> wrote:
>>>
>>> Hi All,
>>>
>>> I've been contemplating new functions that could be added to numpy and
>>> thought I'd run them by folks to see if there is any interest.
>>>
>>> 1) Modified sort/argsort functions that return the maximum k values.
>>>     This is easy to do with heapsort and almost as easy with mergesort.
>>>
>>
>>
>> While you're at it, how about a function that finds both the max and min in
>> one pass?  (Mentioned previously in this thread:
>> http://mail.scipy.org/pipermail/numpy-discussion/2010-June/051072.html)
>>
>>
>
> +1 from myself and probably just about anybody in matplotlib.  If both the
> maxs and mins are searched during the same run through an array, I would
> imagine that would result in a noticeable speedup with automatic range
> finding.
>

I don't know if it's one pass off the top of my head, but I've used
percentile for interpercentile ranges.

[1]: X = np.random.random(1000)

[2]: np.percentile(X,[0,100])
[2]: [0.00016535235312509222, 0.99961513543316571]

[3]: X.min(),X.max()
[3]: (0.00016535235312509222, 0.99961513543316571)

Skipper
David | 1 Jun 03:39 2011

Re: New functions.

On 06/01/2011 10:34 AM, Charles R Harris wrote:
>
>
> On Tue, May 31, 2011 at 7:33 PM, David <david <at> silveregg.co.jp> wrote:
>
>     On 06/01/2011 10:08 AM, Charles R Harris wrote:
>      > Hi All,
>      >
>      > I've been contemplating new functions that could be added to
>     numpy and
>      > thought I'd run them by folks to see if there is any interest.
>      >
>      > 1) Modified sort/argsort functions that return the maximum k values.
>      >      This is easy to do with heapsort and almost as easy with
>     mergesort.
>      >
>      > 2) Ufunc fadd (nanadd?) Treats nan as zero in addition. Should make a
>      > faster version of nansum possible.
>      >
>      > 3) Fast medians.
>
>     +1 for fast median as well, and more generally fast "linear" (O(kN))
>     order statistics would be nice.
>
>
> OK, noob question. What are order statistics?

In statistics, order statistics are statistics based on sorted samples, 
median, min and max being the most common:

http://en.wikipedia.org/wiki/Order_statistic

Concretely here, I meant a fast way to compute the element of any given
rank in a data set, e.g. with the select algorithm. I have wanted to do
that for some time, but never took the time for it.

David
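
A small sketch of the selection idea mentioned above, using randomized quickselect to find the element of a given rank in expected linear time. It is one common form of the select algorithm and an assumption about what is meant; the names `select` and `median` are hypothetical, and the input is assumed to contain no NaN values.

```python
import random


def select(values, k):
    """Return the k-th smallest element (0-based) in expected O(N) time."""
    items = list(values)
    if not 0 <= k < len(items):
        raise IndexError("rank k is out of range")
    while True:
        pivot = random.choice(items)
        lower = [v for v in items if v < pivot]
        n_equal = sum(1 for v in items if v == pivot)
        if k < len(lower):
            # The wanted element is strictly below the pivot.
            items = lower
        elif k < len(lower) + n_equal:
            # The wanted rank falls among the copies of the pivot.
            return pivot
        else:
            # Discard everything <= pivot and shift the rank accordingly.
            k -= len(lower) + n_equal
            items = [v for v in items if v > pivot]


def median(values):
    """Median via selection instead of a full sort."""
    n = len(values)
    if n % 2:
        return select(values, n // 2)
    return 0.5 * (select(values, n // 2 - 1) + select(values, n // 2))
```

Min, max and median are the special cases `k = 0`, `k = N - 1` and `k = N // 2`; none of them requires sorting the whole data set.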
