Eric Firing | 1 Jul 03:22

Re: Memory leaks

John Hunter wrote:
> On 6/30/07, Eric Firing <efiring@...> wrote:
>> Mike,
>>
>> All this sounds like great progress--thanks!  I particularly appreciate
>> the descriptions of what problems you found and how you found them.
>>
>> John et al.: is there a maintainer for each of these backends?  I think
> 
> gtk: Steve Chaplin or me
> wx: Ken McIvor
> qt: Darren?
> tk: Charlie?
> 
> After we get these patches in, we can just give Michael commit
> privileges :-)  I can probably look at this Monday, but if you want to
> commit and test some of these before then, please do so.

Done.  It looks like there is still plenty of memory leakage, but there 
are improvements, and the huge list of uncollectable garbage with tkAgg 
is gone.

I also made memleak_gui.py more flexible with arguments. For example, 
here are tests with three backends, a generous number of loops, and 
suppression of intermediate output:

python ../unit/memleak_gui.py -d wx -s 500 -e 1000 -q

uncollectable list: []

[...]

Norbert Nemec | 1 Jul 08:23

Re: Checked in major reorganization of __init__.py

Hmm - let me think.... We already have
    rc
    rcParams
    rc_params
    rcdefaults
    rcParamDefaults
    defaultParams
in the main module of matplotlib

How about calling the new module 'rcdefaultparams.py', simply to make
the confusion complete and because I really feel that no other name
would fit the current "naming scheme" better... ;-)

Greetings,
Norbert

John Hunter wrote:
> On 6/30/07, Norbert Nemec <Norbert.Nemec.list@...> wrote:
>   
>> Hi there,
>>
>> I just checked in some major reorganization work in __init__.py
>>
>> The main intention was to move the list of option defaults to a separate
>> file 'rcdefaults.py' that could be imported from setup.py to access the
>> settings with minimal dependencies on the remaining code.
>>     
>
> I haven't tested this but I did take a brief look at it and I think
> your cleaning and organizing is useful.  I think we have a naming
[...]

Eric Firing | 1 Jul 09:49

Re: Checked in major reorganization of __init__.py

Norbert Nemec wrote:
> Hmm - let me think.... We already have
>     rc
>     rcParams
>     rc_params
>     rcdefaults
>     rcParamDefaults
>     defaultParams
> in the main module of matplotlib
> 
> How about calling the new module 'rcdefaultparams.py', simply to make
> the confusion complete and because I really feel that no other name
> would fit the current "naming scheme" better... ;-)

Yes, it is confusing; there are too many similar names.  I suspect some 
are used infrequently enough that we could change them without too much 
pain.

But the new module is really two things: 1) rc utilities (mainly 
validation facilities) and 2) a set of default values.  If these are 
kept together the module could be called "rc_init.py" because everything 
is mainly used for rc initialization, although there are things still in 
mpl's __init__.py that are also part of the rc initialization.  Or it 
could be called "rc_utils.py" or "rcsetup.py".  I would prefer any of 
these to rcdefaultparams.py.
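
To make the two roles concrete, here is a rough sketch of a module that holds both validators and a table of defaults, and that could be imported on its own without pulling in the rest of the package. It is only an illustration: the validator names, the `default_params` layout, and `build_rc` are assumptions made for this example, not the contents of the module that was checked in.

```python
# Hypothetical layout: each rc key maps to (default value, validator).

def validate_bool(value):
    """Accept real booleans or common yes/no spellings from an rc file."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "yes", "on", "1"):
        return True
    if text in ("false", "no", "off", "0"):
        return False
    raise ValueError("cannot interpret %r as a boolean" % (value,))


def validate_float(value):
    """Convert the raw setting to float, raising ValueError if it is not numeric."""
    return float(value)


# Part 2 of the module: the default values, paired with their validators.
default_params = {
    "lines.linewidth": (1.0, validate_float),
    "text.usetex": (False, validate_bool),
}


def build_rc(overrides=None):
    """Start from the defaults, then validate and apply user overrides."""
    rc = {key: default for key, (default, _) in default_params.items()}
    for key, raw in (overrides or {}).items():
        if key not in default_params:
            raise KeyError("unknown rc setting: %s" % key)
        validator = default_params[key][1]
        rc[key] = validator(raw)
    return rc


print(build_rc({"text.usetex": "yes", "lines.linewidth": "2.5"}))
```

Because such a module would depend on nothing but the standard library, a setup script could import it just to read the defaults.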

Furthermore, even after factoring out the rc things as you have done, the 
mpl namespace is badly cluttered with things like checkdep_dvipng 
(which is actually part of the rc validation, so maybe it should be in 
your new module), so still more refactoring and/or renaming might be in 
[...]

Michael Droettboom | 2 Jul 14:47

Re: Memory leaks

Eric Firing wrote:
> So, this test is still showing problems, with similar memory 
> consumption in these three backends.
Not necessarily.  By default, Python allocates large pools from the 
operating system and then manages those pools itself (through its 
PyMalloc call).  Prior to Python 2.5, those pools were never freed.  
With Python 2.5, empty pools, when they occur, are freed back to the 
OS.  Due to fragmentation issues, even if there is enough free space in 
those pools for new objects, new pools may need to be created anyway, 
since Python objects can't be moved once they are created.  So seeing 
modest increases in memory usage during a long-running Python 
application is typical, and not something that can be avoided 
without micro-optimizing for pool 
performance (something that may be very difficult).  If memory usage is 
truly increasing in an unbounded way, then, yes, there may be problems, 
but it should eventually stabilize (though in a test such as memleak_gui 
that may take many iterations).  It's more interesting to see the curve 
of memory usage over time than the average over a number of iterations.

For further reading, see:
http://evanjones.ca/python-memory.html
README.valgrind in the Python source
http://mail.python.org/pipermail/python-dev/2006-March/061991.html

Because of this, using the total memory allocated by the Python process 
to track memory leaks is a pretty blunt tool.  More important metrics are 
the total number of GC objects (gc.get_objects()), GC garbage 
(gc.garbage), and using a tool like Valgrind or Purify to find 
mismatched malloc/frees.  Another useful tool (which I haven't resorted 
to yet in matplotlib testing) is to build Python with COUNT_ALLOCS, which 
[...]
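
The GC-based metrics mentioned above can be sketched roughly as follows. This is a minimal illustration rather than the memleak_gui.py script: `Node`, `clean_loop`, and `leaky_loop` are hypothetical stand-ins for one iteration of creating and destroying a figure, and only the standard `gc` module is used.

```python
import gc


def live_gc_objects():
    """Run a full collection, then count the objects the GC still tracks."""
    gc.collect()
    return len(gc.get_objects())


def track(workload, loops=50):
    """Call workload() repeatedly, recording the tracked-object count each time."""
    counts = []
    for _ in range(loops):
        workload()
        counts.append(live_gc_objects())
    return counts


class Node:
    """Stand-in for a figure, canvas, or widget."""


def clean_loop():
    # A reference cycle that nothing else points to: the collector can free it.
    a, b = Node(), Node()
    a.peer, b.peer = b, a


_leaked = []


def leaky_loop():
    # Simulated leak: a module-level list keeps every object alive.
    _leaked.append(Node())


clean = track(clean_loop)
leaky = track(leaky_loop)
print("clean growth:", clean[-1] - clean[0])
print("leaky growth:", leaky[-1] - leaky[0])
print("uncollectable:", gc.garbage)
```

A count that keeps climbing from loop to loop points at objects that are still referenced somewhere, which is a different signal from process memory that grows because of pool fragmentation.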

Eric Firing | 2 Jul 20:49

Re: Memory leaks

Michael Droettboom wrote:
> Eric Firing wrote:
>> So, this test is still showing problems, with similar memory 
>> consumption in these three backends.
> Not necessarily.  By default, Python allocates large pools from the 
> operating system and then manages those pools itself (through its 
> PyMalloc call).  Prior to Python 2.5, those pools were never freed.  
> With Python 2.5, empty pools, when they occur, are freed back to the 
> OS.  Due to fragmentation issues, even if there is enough free space in 
> those pools for new objects, new pools may need to be created anyway, 
> since Python objects can't be moved once they are created.  So seeing 
> modest increases in memory usage during a long-running Python 
> application is typical, and not something that can be avoided 
> without micro-optimizing for pool 
> performance (something that may be very difficult).  If memory usage is 
> truly increasing in an unbounded way, then, yes, there may be problems, 
> but it should eventually stabilize (though in a test such as memleak_gui 
> that may take many iterations).  It's more interesting to see the curve 
> of memory usage over time than the average over a number of iterations.

I agree.  I just ran 2000 iterations with GtkAgg, plotted every 10th 
point, and the increase is linear (apart from a little bumpiness) over 
the entire range (not just the last 1000 iterations reported below):

Backend GTKAgg, toolbar toolbar2
Averaging over loops 1000 to 2000
Memory went from 31248k to 35040k
Average memory consumed per loop: 3.7920k bytes

Maybe this is just the behavior of pymalloc in 2.5?
[...]
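
For reference, the per-loop figure reported above is just the memory difference divided by the number of loops. One rough way to tell steady linear growth from growth that is levelling off is to compare the fitted slope of the second half of the samples with that of the first half. The sketch below assumes that approach; the two series are synthetic, made up only to show the contrast, and are not measurements from any backend.

```python
import math


def average_per_loop(start_k, end_k, loops):
    """Average memory change per loop, in the same units as the inputs."""
    return (end_k - start_k) / loops


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def late_to_early_slope_ratio(xs, ys):
    """Near 1 for linear growth, near 0 when the curve flattens out."""
    half = len(xs) // 2
    return slope(xs[half:], ys[half:]) / slope(xs[:half], ys[:half])


# The numbers from the GTKAgg run quoted above.
print(average_per_loop(31248, 35040, 1000))

# Synthetic curves sampled every 10th loop (illustrative only).
loops = list(range(0, 2000, 10))
linear = [30000 + 3.792 * i for i in loops]
plateau = [35000 - 4000 * math.exp(-i / 300) for i in loops]
print(round(late_to_early_slope_ratio(loops, linear), 3))
print(round(late_to_early_slope_ratio(loops, plateau), 3))
```

A ratio that stays close to 1 over a long run is what unbounded growth would look like, while a ratio that drops toward 0 suggests the allocator is settling.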

Eric Firing | 2 Jul 20:54

Re: Checked in major reorganization of __init__.py

Norbert,

Revision 3445 has some very small changes to fix problems resulting from 
your reorganization.  The questions of module and other naming are still 
open.

Eric

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/

Michael Droettboom | 2 Jul 22:25

Re: Memory leaks

More results:

I've built and tested a more recent pygtk+ stack.  (glib-2.12, 
gtk+-2.10.9, librsvg-2.16.1, libxml2-2.6.29, pygobject-2.13.1, 
pygtk-2.10.4...).  The good news is that the C-level leaks I was seeing 
in pygtk 2.2 and 2.4 are resolved.  In particular, using an SVG icon and 
Gdk rendering no longer seems problematic.  I would suggest that anyone 
using old versions of pygtk should upgrade, rather than spending time on 
workarounds for matplotlib -- do you all agree?  And my Gtk patch should 
probably be reverted to use an SVG icon for the window again (or to only 
do it on versions of pygtk > 2.xx).  I don't know what percentage of 
users are still using pygtk-2.4 and earlier...

There is, however, a new patch (attached) to fix a leak of 
FileChooserDialog objects that I didn't see in earlier pygtk versions.  
I have to admit that I'm a bit puzzled by the solution -- it seems that 
the FileChooserDialog object refuses to destruct whenever any custom 
Python attributes have been added to the object.  It doesn't really need 
them in this case so it's an easy fix, but I'm not sure why that was 
broken -- other classes do this and don't have problems (e.g. 
NavigationToolbar2GTK).  Maybe a pygtk expert out there knows what this 
is about.  It would be great if this resolved the linear memory growth 
that Eric is seeing with the Gtk backend.

GtkCairo seems to be free of leaks.

QtAgg (qt-3.3) was leaking because of a cyclical reference in the 
signals between the toolbar and its buttons.  (Patch attached).
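
As a generic illustration of this kind of cycle (not the attached patch, and with no Qt involved), the sketch below uses two hypothetical classes, `Toolbar` and `Button`. Storing a bound method on the button gives it a strong reference back to the toolbar; wrapping the method in `weakref.WeakMethod` avoids that. The check relies on CPython's reference counting, so treat the behaviour as an assumption about that interpreter.

```python
import gc
import weakref


class Button:
    def __init__(self, callback):
        self.callback = callback


class Toolbar:
    def __init__(self, weak=False):
        if weak:
            # WeakMethod does not keep the toolbar alive.
            callback = weakref.WeakMethod(self.on_click)
        else:
            # A bound method holds a strong reference to self:
            # toolbar -> button -> bound method -> toolbar.
            callback = self.on_click
        self.button = Button(callback)

    def on_click(self):
        return "clicked"


def survives_without_gc(weak):
    """Drop the only external reference and report whether the toolbar is still alive."""
    was_enabled = gc.isenabled()
    gc.disable()  # make sure only reference counting can free the object
    try:
        toolbar = Toolbar(weak=weak)
        probe = weakref.ref(toolbar)
        del toolbar
        return probe() is not None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up the cycle left behind by the strong case


print("strong callback survives:", survives_without_gc(False))
print("weak callback survives:", survives_without_gc(True))
```

With the strong callback the toolbar lingers until the cycle collector runs; with the weak one it is released as soon as the last outside reference goes away.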

Qt4 is forthcoming (I'm still trying to compile something that runs the 
[...]

Michael Droettboom | 2 Jul 22:27

Re: Memory leaks

Forgot to attach the patches.

Oops,
Mike

Michael Droettboom wrote:
> More results:
>
> I've built and tested a more recent pygtk+ stack.  (glib-2.12, 
> gtk+-2.10.9, librsvg-2.16.1, libxml2-2.6.29, pygobject-2.13.1, 
> pygtk-2.10.4...).  The good news is that the C-level leaks I was seeing 
> in pygtk 2.2 and 2.4 are resolved.  In particular, using an SVG icon and 
> Gdk rendering no longer seems problematic.  I would suggest that anyone 
> using old versions of pygtk should upgrade, rather than spending time on 
> workarounds for matplotlib -- do you all agree?  And my Gtk patch should 
> probably be reverted to use an SVG icon for the window again (or to only 
> do it on versions of pygtk > 2.xx).  I don't know what percentage of 
> users are still using pygtk-2.4 and earlier...
>
> There is, however, a new patch (attached) to fix a leak of 
> FileChooserDialog objects that I didn't see in earlier pygtk versions.  
> I have to admit that I'm a bit puzzled by the solution -- it seems that 
> the FileChooserDialog object refuses to destruct whenever any custom 
> Python attributes have been added to the object.  It doesn't really need 
> them in this case so it's an easy fix, but I'm not sure why that was 
> broken -- other classes do this and don't have problems (e.g. 
> NavigationToolbar2GTK).  Maybe a pygtk expert out there knows what this 
> is about.  It would be great if this resolved the linear memory growth 
> that Eric is seeing with the Gtk backend.
>
[...]

John Hunter | 2 Jul 22:35

Re: Memory leaks

On 7/2/07, Michael Droettboom <mdroe@...> wrote:
> Forgot to attach the patches.

Michael -- if you send me your sf ID I'll add you to the committers
list and you can check these in directly.

Vis-a-vis the gtk question, I agree that we should encourage people to
upgrade who are suffering from the leak rather than work around it.  I
would like to summarize the status of known leaks for the FAQ so
perhaps you could summarize across the backends what kind of leaks
remain in the --without-pymalloc build with the known problems fixed (e.g. the
gtk upgrade).  If you could simply send me an update for the memory
leak FAQ (don't worry about the formatting, I can take care of that)
that would be great.  Or if you are feeling doubly adventurous, you
can simply update the FAQ in the htdocs/faq.html.template svn document
and commit it along with your other changes.

Thanks for all the very useful and detailed work!

JDH

Michael Droettboom | 2 Jul 22:37

Re: Memory leaks

John Hunter wrote:
> On 7/2/07, Michael Droettboom <mdroe@...> wrote:
>> Forgot to attach the patches.
>
> Michael -- if you send me your sf ID I'll add you to the committers
> list and you can check these in directly.
mdboom
> Vis-a-vis the gtk question, I agree that we should encourage people to
> upgrade who are suffering from the leak rather than work around it.  I
> would like to summarize the status of known leaks for the FAQ so
> perhaps you could summarize across the backends what kind of leaks
> remain in the --without-pymalloc build with the known problems fixed (e.g. the
> gtk upgrade).  If you could simply send me an update for the memory
> leak FAQ (don't worry about the formatting, I can take care of that)
> that would be great.  Or if you are feeling doubly adventurous, you
> can simply update the FAQ in the htdocs/faq.html.template svn document
> and commit it along with your other changes.
Will do.

Cheers,
Mike
