Robert VERGNES | 1 Sep 07:31

Re: How to free unused memory by Python

All my examples were run in Python 2.5.1, so it seems the problem is NOT solved, unfortunately. Is there any workaround?


Gael Varoquaux <gael.varoquaux <at> normalesup.org> wrote:

On Fri, Aug 31, 2007 at 01:49:10PM +0200, Robert VERGNES wrote:
> Memory usage is displayed by the Task Manager on Windows (Ctrl+Alt+Del)
> or by the system memory manager on Linux (depending on your Linux
> version, I think), so you can see how much of your physical memory is
> used while programs are running.

> So apparently gc cannot return memory to the OS, and there seems to be no
> solution for the moment, apart from moving the task that uses too much
> memory into a separate process and killing it each time it has done its
> work, so that the memory is given back to the OS.

> Any other ideas?

Use python 2.5, where this problem is solved ?

Gaël
_______________________________________________
SciPy-user mailing list
SciPy-user <at> scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

Robert VERGNES | 1 Sep 07:43

Re: How to free unused memory by Python

Anne,

Yes, the issue involves many numpy arrays (not especially small ones: 2 to 7 million items per array), and the crash, a MemoryError, usually happens while creating a new array. To investigate, I made a small test to understand how memory works in Python and saw that even with a 'mylist=arange()' the memory is not given back to the OS when 'mylist' is deleted, which triggered my original question, 'How to free unused memory ..'. From what you and the others have written, the only way to avoid the crash (probably due to malloc()) is to free memory beforehand. For that I need to move my recurring calculation, which is temporarily memory-heavy, into a separate process and kill that process after its work is done so that the memory is released.
I did notice that if I use a huge list, and only standard Python lists, then the OS pages the memory normally. But when lists and numpy arrays are mixed, I get a crash when I run near the limit of my physical memory: no more paging is possible and a MemoryError is raised. This is probably due to the way malloc() requests the memory for the numpy array.
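
The out-of-process workaround described above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not code from this thread: `CHILD_CODE` and `heavy_sum_in_child` are hypothetical names, and the list-building step merely stands in for the real memory-heavy calculation.

```python
import subprocess
import sys

# Hypothetical stand-in for the memory-heavy calculation: build a large
# temporary list, reduce it to a single number, and print that number.
CHILD_CODE = """
import sys
n = int(sys.argv[1])
data = [float(i) for i in range(n)]   # large temporary allocation
print(sum(data))
"""


def heavy_sum_in_child(n):
    """Run the heavy step in a separate interpreter and return its result."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(n)],
        capture_output=True,
        text=True,
        check=True,
    )
    # The child process has exited at this point, so every byte it
    # allocated has been returned to the operating system.
    return float(completed.stdout.strip())


result = heavy_sum_in_child(1_000_000)
print(result)
```

Only the small result travels back to the parent, so the parent's own memory footprint does not grow with the size of the temporary data.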

Thanks for the help.

Robert

Anne Archibald <peridot.faceted <at> gmail.com> wrote:

On 31/08/2007, Robert VERGNES wrote:
> Memory usage is displayed by the Task Manager on Windows (Ctrl+Alt+Del)
> or by the system memory manager on Linux (depending on your Linux
> version, I think), so you can see how much of your physical memory is
> used while programs are running.
>
> So apparently gc cannot return memory to the OS, and there seems to be no
> solution for the moment, apart from moving the task that uses too much
> memory into a separate process and killing it each time it has done its
> work, so that the memory is given back to the OS.
>
> Any other ideas?

Make sure you have lots of swap space. If python has freed some
memory, python will reuse that before requesting more from the OS, so
there's no problem of memory use growing without bound. If you don't
reuse the memory, it will just sit there unused. If you run into
memory pressure from other applications, the OS (well, most OSes) will
page it out to disk until you actually use it again. So a python
process that has a gigabyte allocated but is only using a hundred
megabytes of that will, if something else wants to use some of the
physical RAM in your machine, simply occupy nine hundred megabytes in
your swap file. Who cares?

Also worth knowing is that even on old versions of python, on some
OSes (probably all) numpy arrays suffer from this problem to a much
lesser degree. When you allocate a numpy array, there's a relatively
small python object describing it, and a chunk of memory to contain
the values. This chunk of memory is allocated with malloc(). The
malloc() implementation on Linux (and probably on other systems)
provides big chunks by requesting them directly from the operating
system, so that they can be returned to the OS when done.

Even if you're using many small arrays, you should be aware that the
memory needed by numpy array data is allocated by malloc() and not
python's allocators, so whether it is freed back to the system is a
separate question from whether the memory needed by python objects
goes back to the system.

Anne
Gael Varoquaux | 1 Sep 10:04

Re: How to free unused memory by Python

Can you send us a simplified version of your code, reflecting the way
you use both numpy arrays and lists, that triggers the crash? We can have
a look at the problem, that way.

Gaël

On Sat, Sep 01, 2007 at 07:43:01AM +0200, Robert VERGNES wrote:
>    Yes, the issue involves many numpy arrays (not especially small ones: 2
>    to 7 million items per array), and the crash, a MemoryError, usually
>    happens while creating a new array. To investigate, I made a small test
>    to understand how memory works in Python and saw that even with a
>    'mylist=arange()' the memory is not given back to the OS when 'mylist'
>    is deleted, which triggered my original question, 'How to free unused
>    memory ..'. From what you and the others have written, the only way to
>    avoid the crash (probably due to malloc()) is to free memory beforehand.
>    For that I need to move my recurring calculation, which is temporarily
>    memory-heavy, into a separate process and kill that process after its
>    work is done so that the memory is released.
>    I did notice that if I use a huge list, and only standard Python lists,
>    then the OS pages the memory normally. But when lists and numpy arrays
>    are mixed, I get a crash when I run near the limit of my physical
>    memory: no more paging is possible and a MemoryError is raised. This is
>    probably due to the way malloc() requests the memory for the numpy
>    array.
Lorenzo Isella | 1 Sep 14:58

Again on Double Precision

Dear All,
I know this is related to a thread which had been going on for a while, 
but I am about to publish some results of a simulation making use of 
integrate.odeint and I would like to be sure I have not misunderstood 
anything fundamental.
I have been passing all my arrays and functions to integrate.odeint 
without ever bothering too much about the details, i.e. I never 
explicitly specified the "type" of the arrays I was using.
I assumed that integrate.odeint was a thin layer over some Fortran 
routine and that it would automatically convert all the relevant 
quantities to Fortran double precision.
Is this really what is happening? I actually have no reason to think 
that my results are somehow inaccurate, but you never know.
I was getting worried after looking at:
http://www.scipy.org/Cookbook/BuildingArrays

Apologies if this is too basic for the forum, but in Fortran I always 
used double precision as a standard, and in R all numbers/arrays are 
stored as double-precision objects, so you do not have to worry (these 
are practically the only languages I use apart from Python). At the end 
of the day, double precision is a specific case of floating-point 
numbers, and I wonder whether, when working with the default 
floating-point arrays in SciPy, I attain the same accuracy I would get 
with double-precision Fortran arrays.
Many thanks for any enlightening comment.

Lorenzo
Robert Kern | 1 Sep 21:16

Re: Again on Double Precision

Lorenzo Isella wrote:
> Dear All,
> I know this is related to a thread which had been going on for a while, 
> but I am about to publish some results of a simulation making use of 
> integrate.odeint and I would like to be sure I have not misunderstood 
> anything fundamental.
> I was using all my arrays and functions to be dealt with by 
> integrate.odeint without ever bothering too much about the details, i.e. 
> I never specified explicitly the  "type"  of arrays I was using..
> I assumed that  integrate.odeint  was a thin layer to some Fortran 
> routine and it would automatically convert to Fortran double-precision 
> all the due quantities.
> Is this what is happening really? I actually have no reason to think 
> that my results are somehow inaccurate, but you never know.
> I was getting worried after looking at:
> http://www.scipy.org/Cookbook/BuildingArrays
> 
> Apologies if this is too basic for the forum, but in Fortran I always  
> used double precision as a standard and  in R all the  numbers/arrays 
> are  stored as double precision  objects and you do not have to worry 
> (practically the only languages I use apart from Python). In the end of 
> the day, double precision is a specific case of floating point numbers 
> and I wonder if, when working with the default floating arrays in SciPy, 
> I attain the same accuracy  I would get  with double-precision  Fortran 
> arrays.

The default floating point type in Python, numpy, and scipy is double-precision.
Unless you have explicitly constructed arrays using float32, your
calculations will be done in double-precision.
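
The statement above is easy to check interactively. The following is a small sketch assuming a current NumPy installation; the variable names are illustrative only.

```python
import numpy as np

x = np.array([0.1, 0.2, 0.3])    # Python floats in, float64 out
z = np.zeros(3)                  # float64 by default
t = np.linspace(0.0, 1.0, 5)     # float64 by default

# Single precision appears only when it is requested explicitly.
s = np.array([0.1, 0.2, 0.3], dtype=np.float32)

# Machine epsilon: the gap between 1.0 and the next representable number.
eps_double = np.finfo(x.dtype).eps
eps_single = np.finfo(s.dtype).eps

print(x.dtype, z.dtype, t.dtype, s.dtype)
print(eps_double, eps_single)
```

NumPy's float64 is the same 64-bit IEEE 754 format that Fortran compilers normally use for DOUBLE PRECISION, so the default arrays carry about 16 significant decimal digits.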

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
Anne Archibald | 1 Sep 23:25

Re: How to free unused memory by Python

On 01/09/07, Robert VERGNES <robert.vergnes <at> yahoo.fr> wrote:

> Yes, the issue involves many numpy arrays (not especially small ones: 2 to
> 7 million items per array), and the crash, a MemoryError, usually happens
> while creating a new array. To investigate, I made a small test to
> understand how memory works in Python and saw that even with a
> 'mylist=arange()' the memory is not given back to the OS when 'mylist' is
> deleted, which triggered my original question, 'How to free unused memory
> ..'. From what you and the others have written, the only way to avoid the
> crash (probably due to malloc()) is to free memory beforehand. For that I
> need to move my recurring calculation, which is temporarily memory-heavy,
> into a separate process and kill that process after its work is done so
> that the memory is released.
> I did notice that if I use a huge list, and only standard Python lists,
> then the OS pages the memory normally. But when lists and numpy arrays are
> mixed, I get a crash when I run near the limit of my physical memory: no
> more paging is possible and a MemoryError is raised. This is probably due
> to the way malloc() requests the memory for the numpy array.

There are two problems here:
* python or numpy not shrinking its virtual memory use
* python crashing during allocation

Why do you think they are related?

The first is a known limitation of most dynamic memory allocation
schemes. Modern malloc()s are generally pretty good about avoiding
memory fragmentation, but the ways in which you can release memory
back to the operating system are often extremely limited. This
problem, that the virtual memory size of a process may remain large
even when most of that memory is unused, arises in raw C programs as
well.

That said, I know that for arrays that large, when they are freed the
glibc malloc() that is used under Linux will definitely release that
memory back to the OS. Are you sure the arrays are actually being
freed? Remember that numpy often creates views of arrays that avoid
copying the data but keep the original array alive:
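    from numpy import ones, copy
    A = ones(1000000)   # about 8 MB of float64 data
    B = A[2:4]          # B is a view into A's buffer
    del A               # removes the name only; B keeps the buffer alive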
Here the memory for A cannot be deallocated because B still points to
it, even though B only needs a few bytes of the many megabytes in A.
To cure this there are various choices, for example:
    B = copy(B)         # B now owns a fresh two-element buffer
This duplicates the memory and forgets the reference to A (as it is no
longer needed).
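
Whether an array is a view that keeps a larger buffer alive can be checked through its `base` attribute. The sketch below assumes a current NumPy; the variable names are illustrative only.

```python
import numpy as np

A = np.ones(1_000_000)        # 8,000,000 bytes of float64 data
B = A[2:4]                    # a view: shares A's data buffer

view_base_is_A = B.base is A  # the view holds a reference to A
del A                         # deletes the name, not the buffer

# The full buffer is still reachable through the view.
kept_alive_bytes = B.base.nbytes

B = B.copy()                  # independent array with its own tiny buffer
owns_data = B.base is None    # no reference to the big array remains
copy_bytes = B.nbytes

print(view_base_is_A, kept_alive_bytes, owns_data, copy_bytes)
```

After the copy nothing refers to the original million-element array any more, so its buffer can be freed.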

As for the crashing, what sort of crash is it? What exception gets
raised (or is it a segfault)?

If it is memory exhaustion, all this business about "not freeing
memory back to the OS" is a red herring. No matter how old your
version of python and how little memory it ever releases back to the
OS, new objects will be allocated from the memory the python process
already has. If your process keeps growing indefinitely, that's not
malloc, that's your code keeping references to more and more data so
that it cannot be free()d. Perhaps look into tools for debugging
memory leaks in python?
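
One way to find code that keeps references to more and more data is the `tracemalloc` module, which is in the standard library of current Python versions (it did not exist when this thread was written). The sketch below is an illustration, not a diagnosis of the code discussed here: `leaky_cache` and `process` are hypothetical and simulate a leak on purpose.

```python
import tracemalloc

leaky_cache = []   # hypothetical module-level list that is never cleared


def process(chunk_id):
    data = [0.0] * 100_000      # temporary working buffer
    leaky_cache.append(data)    # the bug: a reference is kept forever
    return chunk_id


tracemalloc.start()
before = tracemalloc.take_snapshot()
for i in range(20):
    process(i)
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Source lines ranked by how much more memory they hold now than before.
top = after.compare_to(before, "lineno")
for stat in top[:3]:
    print(stat)

biggest = top[0]
```

The first entry points at the line that allocated the buffers that were never released, which is usually the right place to start looking for the forgotten reference.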

Anne
David Goldsmith | 2 Sep 04:35

Re: Again on Double Precision

Robert Kern wrote:
> Lorenzo Isella wrote:
>   
>> Dear All,
>> I know this is related to a thread which had been going on for a while, 
>> but I am about to publish some results of a simulation making use of 
>> integrate.odeint and I would like to be sure I have not misunderstood 
>> anything fundamental.
>> I was using all my arrays and functions to be dealt with by 
>> integrate.odeint without ever bothering too much about the details, i.e. 
>> I never specified explicitly the  "type"  of arrays I was using..
>> I assumed that  integrate.odeint  was a thin layer to some Fortran 
>> routine and it would automatically convert to Fortran double-precision 
>> all the due quantities.
>> Is this what is happening really? I actually have no reason to think 
>> that my results are somehow inaccurate, but you never know.
>> I was getting worried after looking at:
>> http://www.scipy.org/Cookbook/BuildingArrays
>>
>> Apologies if this is too basic for the forum, but in Fortran I always  
>> used double precision as a standard and  in R all the  numbers/arrays 
>> are  stored as double precision  objects and you do not have to worry 
>> (practically the only languages I use apart from Python). In the end of 
>> the day, double precision is a specific case of floating point numbers 
>> and I wonder if, when working with the default floating arrays in SciPy, 
>> I attain the same accuracy  I would get  with double-precision  Fortran 
>> arrays.
>>     
>
> The default floating point type in Python, numpy, and scipy is double-precision.
> Unless you have explicitly constructed arrays using float32, your
> calculations will be done in double-precision.
>   
But if _all_ the array elements are integers (numerically speaking), 
then he has to specify in some concrete way that the array elements are 
floats (be it with an otherwise superfluous decimal point, a 
dtype=double, or whatever), correct?
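
The point about all-integer input can be illustrated as follows. This is a sketch assuming a current NumPy, with illustrative variable names; it shows only how array dtypes are chosen and says nothing about what integrate.odeint does with its arguments internally.

```python
import numpy as np

ints = np.array([1, 2, 3])             # all-integer input gives an integer dtype
with_point = np.array([1., 2, 3])      # one decimal point is enough for float64
with_dtype = np.array([1, 2, 3], dtype=np.float64)
converted = ints.astype(np.float64)    # convert an existing integer array

# Why it matters: values stored into an integer array are truncated silently.
ints[0] = 2.7
truncated = int(ints[0])

print(ints.dtype, with_point.dtype, with_dtype.dtype, converted.dtype)
print(truncated)
```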

DG
stefano borini | 2 Sep 17:57

dstevx not implemented

Good morning to all,

I just started tinkering with SciPy, and it looks very interesting. 
However, I have a question and the searches I performed on the mailing 
list or the web returned nothing valuable.

I want to find some of the eigenvalues and eigenvectors of a fairly large 
tridiagonal matrix (around 20000x20000, maybe more). I used to do this 
in Fortran using dstevx, but I noticed that this function is not 
exported to Python, according to a comment in flapack_esv.pyf.src.

I would like to ask:
- Is this due to technical or human-resources issues?
- If any of you has had the same need, how did you solve the problem 
with efficient code?

Thanks in advance

-- 
Kind regards,

Stefano Borini
Robert Kern | 2 Sep 21:52

Re: dstevx not implemented

stefano borini wrote:
> Good morning to all,
> 
> I just started tinkering with SciPy, and it looks very interesting. 
> However, I have a question and the searches I performed on the mailing 
> list or the web returned nothing valuable.
> 
> I want to find some of the eigenvalues and eigenvectors of a quite large 
> tridiagonal matrix (around 20000x20000, maybe more). I used to do this 
> task in fortran using dstevx, but I noted that this function is not 
> exported to python, according to a comment in flapack_esv.pyf.src .
> 
> I would like to ask:
> - Is this due to technical or human-resources issues?

Human resources.
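
For readers with a recent SciPy, selected eigenpairs of a symmetric tridiagonal matrix can be computed with `scipy.linalg.eigh_tridiagonal`, which was added to SciPy long after this thread. The sketch below rests on that assumption and is not a dstevx wrapper from this discussion; the test matrix (2 on the diagonal, -1 off it) is chosen only because its eigenvalues are known in closed form.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

n = 20000
d = 2.0 * np.ones(n)         # main diagonal
e = -1.0 * np.ones(n - 1)    # off-diagonal

# Ask only for the three smallest eigenpairs (indices 0..2, inclusive).
w, v = eigh_tridiagonal(d, e, select="i", select_range=(0, 2))

# Closed-form eigenvalues of this particular matrix, for comparison.
k = np.arange(1, 4)
expected = 2.0 - 2.0 * np.cos(k * np.pi / (n + 1))

print(w)
print(v.shape)   # one column per requested eigenvector
```

Only the two diagonals are stored, so the memory cost stays proportional to n rather than to n squared.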

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
massimo sandal | 3 Sep 11:22

Re: How to free unused memory by Python

Anne Archibald wrote:
> If it is memory exhaustion, all this business about "not freeing
> memory back to the OS" is a red herring. No matter how old your
> version of python and how little memory it ever releases back to the
> OS, new objects will be allocated from the memory the python process
> already has. If your process keeps growing indefinitely, that's not
> malloc, that's your code keeping references to more and more data so
> that it cannot be free()d. Perhaps look into tools for debugging
> memory leaks in python?

I'd love to find one. I have memory leaks here and there in my code (no 
doubt due to dangling references) but it is often damn hard to debug 
them. I asked on comp.lang.python but I found no useful answers. If you 
know of a memory debugging tool for python, let us know!
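
A rough way to hunt for dangling references with the standard library alone is to count live objects by type before and after a suspect operation, then ask the `gc` module who refers to a leaked instance. The sketch below is an illustration only: `Trace`, `registry`, `analyse` and `live_counts` are hypothetical names, and the leak is simulated on purpose.

```python
import gc
from collections import Counter


def live_counts():
    """Count the objects tracked by the garbage collector, by type name."""
    gc.collect()
    return Counter(type(obj).__name__ for obj in gc.get_objects())


class Trace:
    """Hypothetical data object."""

    def __init__(self, n):
        self.samples = [0.0] * n


registry = []   # hypothetical forgotten container


def analyse(n):
    trace = Trace(n)
    registry.append(trace)   # the dangling reference
    return len(trace.samples)


before = live_counts()
for _ in range(50):
    analyse(1000)
after = live_counts()

growth = after - before      # Counter subtraction keeps positive differences
leaked_traces = growth["Trace"]
print(growth.most_common(5))

# Which lists still refer to one of the leaked objects?
holders = [r for r in gc.get_referrers(registry[0]) if isinstance(r, list)]
```

Types whose counts keep growing from one run to the next are the candidates, and `gc.get_referrers` then shows which container is holding on to them.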

m.

-- 
Massimo Sandal
University of Bologna
Department of Biochemistry "G.Moruzzi"

snail mail:
Via Irnerio 48, 40126 Bologna, Italy

email:
massimo.sandal <at> unibo.it

tel: +39-051-2094388
fax: +39-051-2094387