1 Sep 2009 06:06

### A faster median (Wirth's method)

```
We recently has a discussion regarding an optimization of NumPy's median
to average O(n) complexity. After some searching, I found out there is a
selection algorithm competitive in speed with Hoare's quick select. It
has the advantage of being a lot simpler to implement. In plain Python:

import numpy as np

def wirthselect(array, k):

""" Niklaus Wirth's selection algortithm """

a = np.ascontiguousarray(array)
if (a is array): a = a.copy()

l = 0
m = a.shape[0] - 1
while l < m:
x = a[k]
i = l
j = m
while 1:
while a[i] < x: i += 1
while x < a[j]: j -= 1
if i <= j:
tmp = a[i]
a[i] = a[j]
a[j] = tmp
i += 1
j -= 1
```

1 Sep 2009 06:50

### Re: A faster median (Wirth's method)

```Sturla Molden skrev:
> We recently has a discussion regarding an optimization of NumPy's median
> to average O(n) complexity. After some searching, I found out there is a
> selection algorithm competitive in speed with Hoare's quick select.
>
> Reference:
> http://ndevilla.free.fr/median/median.pdf

After som more googling, I found this  text by Wirth himself:

The method is described on page 61 (57 in the PDF) as Hoare's quick
select. So it seems it's just a less optimized version than that of
Numerical Receipes, and the first reference (Devillard) was confused.
Anyhow, it still has the advantage of looking nice in Cython and being
very different from the Numerical Receipes code. We can rename
wirthselect to quickselect then. Sorry for the confusion. I should have
checked the source better.

Sturla Molden
```
1 Sep 2009 10:08

### snow leopard and Numeric

```Hello everybody,

I have just upgraded my Mac laptop to snow leopard.
However, I can no more compile Numeric 24.2.

Here is my output:

[MacBook-Pro-di-Stefano:~/Pacchetti/Numeric-24.2] covino% python
setup.py build
running build
running build_py
running build_ext
building 'RNG.RNG' extension
gcc-4.2 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -arch i386 -
arch ppc -arch x86_64 -pipe -IInclude -IPackages/FFT/Include -
IPackages/RNG/Include -I/System/Library/Frameworks/Python.framework/
Versions/2.6/include/python2.6 -c Packages/RNG/Src/ranf.c -o build/
temp.macosx-10.6-universal-2.6/Packages/RNG/Src/ranf.o
Packages/RNG/Src/ranf.c: In function ‘Mixranf’:
Packages/RNG/Src/ranf.c:153: error: conflicting types for ‘gettimeofday’
/usr/include/sys/time.h:210: error: previous declaration of
‘gettimeofday’ was here
Packages/RNG/Src/ranf.c: In function ‘Mixranf’:
Packages/RNG/Src/ranf.c:153: error: conflicting types for ‘gettimeofday’
/usr/include/sys/time.h:210: error: previous declaration of
‘gettimeofday’ was here
Packages/RNG/Src/ranf.c: In function ‘Mixranf’:
Packages/RNG/Src/ranf.c:153: error: conflicting types for ‘gettimeofday’
/usr/include/sys/time.h:210: error: previous declaration of
‘gettimeofday’ was here
```

1 Sep 2009 10:27

### Re: snow leopard and Numeric

```Use Numpy instead of Numeric (no longer supported I think)?

Matthieu

2009/9/1 Stefano Covino <stefano_covino <at> yahoo.it>:
> Hello everybody,
>
> I have just upgraded my Mac laptop to snow leopard.
> However, I can no more compile Numeric 24.2.
>
> Here is my output:
>
> [MacBook-Pro-di-Stefano:~/Pacchetti/Numeric-24.2] covino% python
> setup.py build
> running build
> running build_py
> running build_ext
> building 'RNG.RNG' extension
> gcc-4.2 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -arch i386 -
> arch ppc -arch x86_64 -pipe -IInclude -IPackages/FFT/Include -
> IPackages/RNG/Include -I/System/Library/Frameworks/Python.framework/
> Versions/2.6/include/python2.6 -c Packages/RNG/Src/ranf.c -o build/
> temp.macosx-10.6-universal-2.6/Packages/RNG/Src/ranf.o
> Packages/RNG/Src/ranf.c: In function ‘Mixranf’:
> Packages/RNG/Src/ranf.c:153: error: conflicting types for ‘gettimeofday’
> /usr/include/sys/time.h:210: error: previous declaration of
> ‘gettimeofday’ was here
> Packages/RNG/Src/ranf.c: In function ‘Mixranf’:
> Packages/RNG/Src/ranf.c:153: error: conflicting types for ‘gettimeofday’
> /usr/include/sys/time.h:210: error: previous declaration of
```

1 Sep 2009 12:08

```Hello,
I tried to load a ASCII table into a string array. Unfortunately, this table has
some empty chells

Here it is:
http://www.ncdc.noaa.gov/oa/climate/rcsg/cdrom/ismcs/alphanum.html

After having converted this into a text file I tried this:

\$ np.genfromtxt('alphanum_to-text.txt', dtype=np.str_, delimiter='|',
skiprows=1, missing='')

But I got an error:

1043                         dtype = np.dtype(ttype)
1044             #
-> 1045             output = np.array(data, dtype)
1047                 if dtype.names:

ValueError: setting an array element with a sequence

I sometimes experience this error message with text read in. Could this message
be made more helpful like telling in which line of the input file this occurs?
This could really reduce the trial and error.

What would be the correct way to read this data in?

Thanks,
Timmie
```

1 Sep 2009 12:37

### Re: genfromtext advice

```
On Sep 1, 2009, at 6:08 AM, Tim Michelsen wrote:

> Hello,
> I tried to load a ASCII table into a string array. Unfortunately,
> this table has
> some empty chells
>
> Here it is:
> http://www.ncdc.noaa.gov/oa/climate/rcsg/cdrom/ismcs/alphanum.html
>
> After having converted this into a text file I tried this:
>
> \$ np.genfromtxt('alphanum_to-text.txt', dtype=np.str_, delimiter='|',
> skiprows=1, missing='')
>
> But I got an error:
>
>   1043                         dtype = np.dtype(ttype)
>   1044             #
> -> 1045             output = np.array(data, dtype)
>   1046             if usemask:
>   1047                 if dtype.names:
>
> ValueError: setting an array element with a sequence
>
> I sometimes experience this error message with text read in. Could
> this message
> be made more helpful like telling in which line of the input file
> this occurs?
```

1 Sep 2009 13:18

### Re: genfromtext advice

```> Mmh, perhaps.
Thanks for the quick reply.

> I'll try to see what I can do. Usually, this message
> shows up when one of the lines you have read doesn't have the same
> number of columns as the others.
Could we add this error to the docstring?
As I suggested, It would be helpful to get the line number where this occurs.

>You mentioned that you're skipping
> one line : try skipping a few more and check.
Here is an example:
http://pastebin.com/m508d1d00

If I skip the first 5512 lines everything goes well.

> You're 100% sure that you didn't miss
> something in the conversion ?
Yes. And just did it again.
Just edited the html source and then saved to text. no magic.

> Can you post the result  (or send a link
> to?)
May I add a improvement request and append the file there?

Thanks,
Timmie
```
1 Sep 2009 13:17

### Re: genfromtext advice

```I have tried
\$ awk -F '|' '{if(NF != 12) print NR;}' /tmp/pp.txt
and besides the first 23 lines and the last 3 lines of the file,
also the following have a number of '|' different from 11:
1635
2851
5538
i.e. BIKIN, BENGUERIR and TERESINA AIRPORT.
But I completely agree with you, genfromtxt could print out
the line number and the actual line giving problems.
```
1 Sep 2009 14:07

### Question about np.savez

```Hi,
I know the documentation states that np.savez saves numpy arrays, so my question relates to misusing it.
Before reading the doc in detail, and after reading about pickle and other options to make data
persistent, I passed np.savez a list of ndarrays. It didn't complain, but when I loaded the data back, the
list had been turned into an ndarray. Is this behaviour expected? It did surprise me. Below  there is an example:

In [18]: l
Out[18]:
[array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18]),
array([ 0,  3,  6,  9, 12, 15, 18, 21, 24, 27])]

In [19]: np.savez('test.npz', l=l)

In [20]: data = np.load('test.npz')

In [21]: l1 = data['l']

In [22]: l1
Out[22]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
[ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18],
[ 0,  3,  6,  9, 12, 15, 18, 21, 24, 27]])

jorge
```
1 Sep 2009 14:54

### Re: genfromtext advice

```> \$ awk -F '|' '{if(NF != 12) print NR;}' /tmp/pp.txt
> and besides the first 23 lines and the last 3 lines of the file,
> also the following have a number of '|' different from 11:
> 1635
> 2851
> 5538
> i.e. BIKIN, BENGUERIR and TERESINA AIRPORT.
Looks lika some bash magic.

I will try to translate this into as python script...
```

Gmane