James Philbin | 1 Feb 09:17

Re: searchsorted bug

> Try out latest SVN.  It should have this problem fixed.
Thanks for this. I've realized that for my case, using object arrays
is probably best. I still think that in the long term it would be good
to allow comparison functions to take different types, so that one
could compare, say, integer arrays with floating-point arrays without
doing an upcast.

Lars Friedrich | 1 Feb 10:57

histogramdd memory needs

Hello,

I use numpy.histogramdd to compute three-dimensional histograms with a
total number of bins on the order of 1e7. It is clear to me that such a
histogram will take a lot of memory: for dtype=N.float64, roughly 80
megabytes. However, I have the feeling that much more memory is needed
during the histogram calculation. For example, when I have
data.shape = (8e6, 3) and call numpy.histogramdd(d, 280), I expect a
histogram size of (280**3)*8 bytes, about 176 megabytes, but during the
calculation the memory use of pythonw.exe in the Windows Task Manager
rises to 687 megabytes above the level before the calculation. When the
calculation is done, the memory usage drops back to the expected value.
I assume this is due to the way numpy.histogramdd works internally.
However, I cannot calculate even bigger histograms this way. So I have
the following questions:

1) How can I tell histogramdd to use a dtype other than float64? My bins
will be sparsely populated, so an int16 should be sufficient. Without
normalization, an integer dtype makes more sense to me.

2) Is there a way to use another algorithm (at the cost of performance) 
that uses less memory during calculation so that I can generate bigger 
histograms?
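
The memory figures in this message can be checked with a few lines of
arithmetic. The sketch below assumes the input samples are stored as
float64 (the post does not say) and counts one megabyte as 10**6 bytes;
the variable names are illustrative only.

```python
import numpy as np

bins_per_axis = 280
n_bins = bins_per_axis ** 3  # total number of bins in the 3-D histogram

# Size of the finished histogram for two candidate dtypes, in MB (10**6 bytes).
hist_float64_mb = n_bins * np.dtype(np.float64).itemsize / 1e6
hist_int16_mb = n_bins * np.dtype(np.int16).itemsize / 1e6

# Size of the input samples, ASSUMING they are float64 with shape (8e6, 3).
n_samples = 8_000_000
data_mb = n_samples * 3 * np.dtype(np.float64).itemsize / 1e6

print(f"float64 histogram: {hist_float64_mb:.1f} MB")  # float64 histogram: 175.6 MB
print(f"int16 histogram:   {hist_int16_mb:.1f} MB")    # int16 histogram:   43.9 MB
print(f"input data:        {data_mb:.1f} MB")          # input data:        192.0 MB
```

Under these assumptions an int16 histogram would need a quarter of the
memory of the float64 one, and the observed 687 megabyte peak is several
times larger than the finished histogram.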

My numpy version is '1.0.4.dev3937'

Thanks,
Lars


Andrea Gavana | 1 Feb 12:28

[F2PY]: Allocatable Arrays

Hi All,

    I sent a couple of messages to f2py mailing list, but it seems
like my problem has no simple solution so I thought to ask for some
suggestions here.

Basically, I read some huge unformatted binary files which contain
time-step data from a reservoir simulation. I don't know the
dimensions (i.e., lengths) of the vectors I am going to read, and I
find out this information only when I start reading the file. So, I
thought it would be nice to do something like:

1) Declare outputVector as allocatable;
2) Start reading the file;
3) Find the outputVector dimension and allocate it;
4) Read the data in the outputVector;
5) Return this outputVector.

It works when I compile it and build it in Fortran as an executable
(defining a "main" program in my f90 module), but it bombs when I try
to use it from Python with the error:

C:\Documents and Settings\gavana\Desktop\ECLIPSEReader>prova.py
Traceback (most recent call last):
  File "C:\Documents and Settings\gavana\Desktop\ECLIPSEReader\prova.py", line 3, in <module>
    inteHead, propertyNames, propertyTypes, propertyNumbers = ECLIPSEReader.init.readinspec("OPT_INJ.INSPEC")
ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (-1,)

Lisandro Dalcin | 1 Feb 14:59

Re: [F2PY]: Allocatable Arrays

Sorry if I'm making noise, my knowledge of Fortran is very limited,
but in your routine AllocateDummy you are first allocating and then
deallocating the arrays. Are you sure you can still access the contents
of your arrays after deallocating them?

How complicated is your binary format? For simple formats, you can
just use numpy to read the binary data. I do this sometimes, but
again, only for simple formats.
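
As a rough illustration of reading a simple binary file with numpy, the
sketch below writes and then reads back a made-up layout: one
little-endian int32 element count followed by that many float64 values.
The layout and the names read_vector and vector.bin are assumptions for
this example only. They are not the format discussed in this thread, and
Fortran unformatted files normally carry compiler-specific record
markers that would need separate handling.

```python
import os
import tempfile

import numpy as np

# Hypothetical layout: int32 count, then `count` float64 values (little-endian).
values = np.array([1.5, 2.5, 3.5, 4.5], dtype="<f8")
path = os.path.join(tempfile.mkdtemp(), "vector.bin")

# Write a small sample file so the example is self-contained.
with open(path, "wb") as fh:
    np.array([values.size], dtype="<i4").tofile(fh)
    values.tofile(fh)


def read_vector(filename):
    """Read the length header first, then exactly that many values."""
    with open(filename, "rb") as fh:
        count = int(np.fromfile(fh, dtype="<i4", count=1)[0])
        return np.fromfile(fh, dtype="<f8", count=count)


output_vector = read_vector(path)
print(output_vector)
```

The length is only known after the header has been read, which is the
same situation as the allocatable-array case: numpy simply allocates the
result array once the count is available.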

On 2/1/08, Andrea Gavana <andrea.gavana <at> gmail.com> wrote:
> Hi All,
>
>     I sent a couple of messages to f2py mailing list, but it seems
> like my problem has no simple solution so I thought to ask for some
> suggestions here.
>
> Basically, I read some huge unformatted binary files which contain
> time-step data from a reservoir simulation. I don't know the
> dimensions (i.e., lengths) of the vectors I am going to read, and I
> find out this information only when I start reading the file. So, I
> thought it would be nice to do something like:
>
> 1) Declare outputVector as allocatable;
> 2) Start reading the file;
> 3) Find the outputVector dimension and allocate it;
> 4) Read the data in the outputVector;
> 5) Return this outputVector.
>
> It works when I compile it and build it in Fortran as an executable
> (defining a "main" program in my f90 module), but it bombs when I try

Andrea Gavana | 1 Feb 15:18

Re: [F2PY]: Allocatable Arrays

Hi Lisandro,

On Feb 1, 2008 1:59 PM, Lisandro Dalcin wrote:
> Sorry if I'm making noise, my knowledge of fortran is really little,
> but in your routine AllocateDummy you are first allocating and next
> deallocating the arrays. Are you sure you can then access the contents
> of your arrays after deallocating them?

Thank you for your answer.

Unfortunately it seems that it doesn't matter whether I deallocate
them or not: I still get the compilation warning and I can't access
those variables in any case. It seems like f2py (or python or whatever)
does not like having more than 1 allocatable array inside a MODULE
declaration.

> How complicated is your binary format?

*Very* complex. The fact is, I already know how to read those files in
Fortran; it is the linking with Python via f2py that is driving me mad.
I can't believe no one has used allocatable arrays as outputs before
(whether from a subroutine or from a module).

> On 2/1/08, Andrea Gavana <andrea.gavana <at> gmail.com> wrote:
> > Hi All,
> >
> >     I sent a couple of messages to f2py mailing list, but it seems
> > like my problem has no simple solution so I thought to ask for some
> > suggestions here.
> >

Pearu Peterson | 1 Feb 15:45

Re: [F2PY]: Allocatable Arrays

On Fri, February 1, 2008 1:28 pm, Andrea Gavana wrote:
> Hi All,
>
>     I sent a couple of messages to f2py mailing list, but it seems
> like my problem has no simple solution so I thought to ask for some
> suggestions here.

Sorry, I haven't been around there for a long time.

> Basically, I read some huge unformatted binary files which contain
> time-step data from a reservoir simulation. I don't know the
> dimensions (i.e., lengths) of the vectors I am going to read, and I
> find out this information only when I start reading the file. So, I
> thought it would be nice to do something like:
>
> 1) Declare outputVector as allocatable;
> 2) Start reading the file;
> 3) Find the outputVector dimension and allocate it;
> 4) Read the data in the outputVector;

looks ok.

> 5) Return this outputVector.

What do you mean by "return"? You cannot return allocatable arrays
when using f2py to generate wrappers. However, you can access the
allocatable array outputVector if it is module data, as you do below.

> It works when I compile it and build it in Fortran as an executable

Pearu Peterson | 1 Feb 15:49

Re: [F2PY]: Allocatable Arrays

On Fri, February 1, 2008 4:18 pm, Andrea Gavana wrote:
> Hi Lisandro,
>
> On Feb 1, 2008 1:59 PM, Lisandro Dalcin wrote:
>> Sorry if I'm making noise, my knowledge of fortran is really little,
>> but in your routine AllocateDummy you are first allocating and next
>> deallocating the arrays. Are you sure you can then access the contents
>> of your arrays after deallocating them?
>
> Thank you for your answer.
>
> Unfortunately it seems that it doesn't matter whether I deallocate
> them or not, I still get the compilation warning and I can't access
> those variables in any case.

You cannot access those because they are deallocated. Try disabling
the deallocate statements in your Fortran code.

> It seems like f2py (or python or whatever)
> does not like having more than 1 allocatable array inside a MODULE
> declaration.

This is not true.

>> How complicated is your binary format?
>
> *Very* complex. The fact is, I already know how to read those files in
> Fortran, is the linking with Python via f2py that is driving me mad. I
> can't believe no one has used before allocatable arrays as outputs
> (whether from a subroutine or from a module).

David Huard | 1 Feb 16:08

Re: histogramdd memory needs

Hi Lars,

[...]

2008/2/1, Lars Friedrich <lfriedri <at> imtek.de>:

> 1) How can I tell histogramdd to use another dtype than float64? My bins
> will be very little populated so an int16 should be sufficient. Without
> normalization, an integer dtype makes more sense to me.

There is no way to do that without tweaking the histogramdd function yourself. The relevant bit of code is the instantiation of hist:

hist = zeros(nbin.prod(), float)
 

> 2) Is there a way to use another algorithm (at the cost of performance)
> that uses less memory during calculation so that I can generate bigger
> histograms?

You could work through your array block by block. Simply fix the range, generate a histogram for each slice of 100k data points, and sum them up at the end.

The current histogram and histogramdd implementation has the advantage of being general, that is, you can work with uniform or non-uniform bins, but it is not particularly efficient, at least for a large number of bins (>30).
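
The block-by-block approach might look like the following sketch. It
assumes a current NumPy, uniform bins over a fixed range, and counts
accumulated in an integer array; chunked_histogramdd and its parameters
are hypothetical names, not part of numpy. Each call to histogramdd
still builds a float64 histogram for its block, so only the temporaries
that scale with the number of samples shrink.

```python
import numpy as np


def chunked_histogramdd(data, bins, ranges, chunk_size=100_000, dtype=np.int64):
    """Accumulate an N-d histogram over row blocks of `data` (hypothetical helper)."""
    data = np.asarray(data)
    ndim = data.shape[1]
    # Fix the bin edges once so that every block is binned identically.
    edges = [np.linspace(lo, hi, bins + 1) for lo, hi in ranges]
    total = np.zeros((bins,) * ndim, dtype=dtype)
    for start in range(0, data.shape[0], chunk_size):
        block = data[start:start + chunk_size]
        counts, _ = np.histogramdd(block, bins=edges)
        # Counts are whole numbers, so the cast to an integer dtype is exact
        # as long as they fit into `dtype`.
        total += counts.astype(dtype)
    return total, edges


rng = np.random.default_rng(0)
samples = rng.random((50_000, 3))  # small stand-in for the real data
hist, edges = chunked_histogramdd(
    samples, bins=20, ranges=[(0.0, 1.0)] * 3, chunk_size=10_000
)
print(hist.dtype, hist.sum())
```

Passing dtype=np.int16 would shrink the accumulator further, but only if
no bin ever exceeds 32767 counts.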

Cheers,

David

> My numpy version is '1.0.4.dev3937'
>
> Thanks,
> Lars
>
>
> --
> Dipl.-Ing. Lars Friedrich
>
> Photonic Measurement Technology
> Department of Microsystems Engineering -- IMTEK
> University of Freiburg
> Georges-Köhler-Allee 102
> D-79110 Freiburg
> Germany
>
> phone: +49-761-203-7531
> fax:   +49-761-203-7537
> room:  01 088
> email: lars.friedrich <at> imtek.de
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion <at> scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Francesc Altet | 1 Feb 19:14

Re: Can not update a submatrix

On Thursday 31 January 2008, Francesc Altet wrote:
> On Wednesday 30 January 2008, Timothy Hochberg wrote:
> > [...a fine explanation by Anne and Timothy...]
>
> Ok. Since this subject seems to be of enough interest, I went ahead
> and created a small document about views vs copies at:
>
> http://www.scipy.org/Cookbook/ViewsVsCopies

Oops, I think I missed the NumPy tutorial:

http://www.scipy.org/Tentative_NumPy_Tutorial

which already talked about copies vs views :-/.  Well, I think my small 
document can complement some parts of the tutorial.  I'll do that as 
soon as I can and remove the recipe from the cookbook.  Sorry for the 
noise.

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

Robert Kern | 1 Feb 19:39

Re: [F2PY]: Allocatable Arrays

Pearu Peterson wrote:
> On Fri, February 1, 2008 1:28 pm, Andrea Gavana wrote:
>> Hi All,
>>
>>     I sent a couple of messages to f2py mailing list, but it seems
>> like my problem has no simple solution so I thought to ask for some
>> suggestions here.
> 
> Sorry, I haven't been around there for a long time.

Are you going to continue not reading the f2py list? If so, you should point 
everyone there to this list and close the list.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco
