kalyan reddy | 1 Dec 2008 11:42

Reg port to symbian

 
Hi,
 
My name is Kalyan. I was looking for a leak detection tool for Qt. Could your tool be ported to Qt for S60 (the S60 flavour of Qt)?
 
Regards,
Kalyan
_______________________________________________
Gc mailing list
Gc@...
http://www.hpl.hp.com/hosted/linux/mail-archives/gc/
Boehm, Hans | 2 Dec 2008 01:41

RE: Internal memory leak for small objects

Oops.  Another bug that snuck in around 7.0.  The collector was dropping some objects that it decided were too
expensive to reclaim.  That's a fine strategy, but not with GC_find_leak set ...

Thanks for the nice test case.

Fixed in CVS. See http://bdwgc.cvs.sourceforge.net/viewvc/bdwgc/bdwgc/reclaim.c?r1=1.8&r2=1.9 .

Hans

> -----Original Message-----
> From: gc-bounces@...
> [mailto:gc-bounces@...] On Behalf Of Ivan Maidanski
> Sent: Friday, November 28, 2008 6:27 AM
> To: gc@...
> Subject: [Gc] Internal memory leak for small objects
>
> Hi!
>
> Consider this test app:
>
> #include <stdio.h>
> #include "gc.h"
>
> void *ptr = 0;
> int main(void)
> {
>   int j;
>   GC_INIT();
>   for (j = 60; j < 90; ++j) {
>     void **p = GC_MALLOC(j*sizeof(void*));
>     printf("%p\n",p);
>     *p = ptr;
>     ptr = p;
>     GC_gcollect();
>   }
>   return 0;
> }
>
> The GC lib is built in config -DFIND_LEAK
> -DALL_INTERIOR_POINTERS (for simplicity). The test is
> compiled without any options. Here I use VC++ but the bug is
> platform-independent.
>
> Running this test gives:
> 00180E88
> 00180D90
> 00181F00
> 00181E00
> 00182E40
> 00182D10
> 00182BE0
> 00182AB0
> 00182980
> 00182850
> 00182720
> 001825F0
> 001824C0
> 00182390
> 00182260
> 00182130
> 00183E60
> 00183CF0
> 00183B80
> 00183A10
> 001838A0
> 00183730
> 001835C0
> 00183450
> 001832E0
> 00183170
> 00184E60
> 00183000
> 00184CF0
> 00184B80
>
> Log file contains:
> Leaked composite object at start: 00182000, appr. length: 304
> Leaked composite object at start: 00182000, appr. length: 304
> Leaked composite object at start: 00182000, appr. length: 304
> Leaked composite object at start: 00182000, appr. length: 304
> Leaked composite object at start: 00182000, appr. length: 304
> Leaked composite object at start: 00183000, appr. length: 368
> Leaked composite object at start: 00182000, appr. length: 304
> Leaked composite object at start: 00182000, appr. length: 304
>
> You can see that the leaked object is:
> 1. SMALL_OBJ;
> 2. never returned by GC_malloc.
>
> I don't know whether the bug is specific to FIND_LEAK mode or not.
> In fact, this object is removed from the GC_obj_kinds list by
> GC_start_reclaim(FALSE) instead of being reconstructed. I
> failed to find out more...
>
> Bye.
>
>
Boehm, Hans | 2 Dec 2008 01:54

RE: Memory leak in GC_allochblk_nth

> -----Original Message-----
> From:  Ivan Maidanski
> Sent: Tuesday, November 25, 2008 11:57 AM
> To: gc@...
> Subject: [Gc] Memory leak in GC_allochblk_nth
>
> Hi!
>
> My problem is as follows...
> My (demo/test) application allocates several large (>300K)
> atomic objects, drops them quickly and repeats this
> allocation/dropping cycle many times. As a result I observe
> memory leakage (together with the repeated GC warning about
> it) and, some time later, exhaustion of virtual memory. Naive
> removal of the code block around WARN("Repeated allocation of
> very large block...") (together with "size_avail =
> orig_avail;") doesn't help at all (moreover, it results in
> "unreasonable heap growth").
>
> So, there are several possible (and working) solutions: turn
> off all_interior_pointers mode and/or use "ignore_off_page"
> versions of GC_malloc[_atomic]() (or choose between normal
> and "ignore_off_page" versions dynamically). But these
> solutions require addition of "volatile" qualifier for
> pointers (to prevent compiler optimization) which is not
> acceptable to me.
"Volatile" is a bit ill-defined.  But it should be fine to just keep both volatile and non-volatile copies of
the pointer to such a large object, accessing it through the non-volatile one, and making sure the GC can
see the volatile one.  Thus this shouldn't incur substantial overhead.  You just want to adequately
discourage the compiler from optimizing away the last pointer to the base of the object.
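
A minimal sketch of that two-copy idea, under stated assumptions: sum_large and its parameter names are hypothetical, and in a real program the buffer would come from a collector allocation such as GC_malloc_atomic_ignore_off_page rather than from the caller. It only illustrates the pattern of one volatile copy that stays in memory plus one ordinary working copy.

```c
#include <stddef.h>

/* Hypothetical helper: sums a large buffer while keeping a volatile copy
   of its base pointer in the current stack frame. */
static unsigned long sum_large(unsigned char *base, size_t n)
{
    unsigned char *volatile anchor = base; /* volatile copy the GC can see */
    unsigned char *p = base;               /* non-volatile working copy */
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < n; ++i)
        sum += p[i];                       /* all hot accesses go through p */

    /* One volatile read after the last use: the store to anchor and this
       read must both be performed, so the base stays in the frame. */
    return anchor != NULL ? sum : 0;
}
```

The loop never touches the volatile copy, so the cost should be limited to one extra store and one extra load per call.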
>
> My questions:
> 1. Is it possible to fix GC_allochblk_nth() to prevent memory
> leak for GC_malloc with all_interior_pointers mode on (even
> if it results in "unreasonable heap growth")?
Not that I know of.  This "leak" is normally just a consequence of allocating an object that is too large to fit
between known "false pointers" that already "point" to unoccupied sections of the heap.  Some of these
"false pointers" may be due to the collector not cleaning up after itself, and leaving values lying around
that look like pointers.  By fixing these, you might be able to allocate somewhat larger objects before you
run into the problem.

Switching to a 64-bit ABI should be a pretty complete fix.  This is really a dense address space issue.
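
To make the "too large to fit between false pointers" point concrete, here is a toy model. It is not the collector's algorithm: find_placement, the block array and the rule that an ignore_off_page object needs only a clean first block are all assumptions made for illustration.

```c
#include <stddef.h>

/* Toy model, not collector code: the heap is an array of fixed-size blocks,
   and blacklisted[i] != 0 means some false pointer targets block i.
   Returns the index of the first block where an object of `need` blocks
   can be placed, or -1 if there is no such place. */
static long find_placement(const unsigned char *blacklisted, size_t nblocks,
                           size_t need, int ignore_off_page)
{
    /* Assumption: an ignore_off_page object only needs a clean first block,
       a normal object needs every block it covers to be clean. */
    size_t must_be_clean = ignore_off_page ? 1 : need;
    size_t start, i;

    if (need == 0 || need > nblocks)
        return -1;
    for (start = 0; start + need <= nblocks; ++start) {
        for (i = 0; i < must_be_clean; ++i)
            if (blacklisted[start + i])
                break;
        if (i == must_be_clean)
            return (long)start;
    }
    return -1;
}
```

Assuming 4K heap blocks, a 300K object needs 75 consecutive blocks. If a false pointer lands in every 50th block, the normal placement fails while the ignore_off_page one succeeds. A sparser (64-bit) address space makes long clean runs far more likely.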

> 2. What's the difference between "normal" and
> "ignore_off_page" versions of GC_malloc if
> all_interior_pointers mode is off? (and what's the difference
> between all_interior_pointers mode on and off for
> GC_malloc_ignore_off_page?)
Even with all_interior_pointers mode off, the collector still looks for interior pointers from the stack
and registers, since those often result from compiler optimizations.  Pointers into ignore_off_page
objects are ignored even if they originate from a register.

Hans
>
> Bye.
>
Ivan Maidanski | 2 Dec 2008 13:25

Re[2]: Memory leak in GC_allochblk_nth

Hi!

"Boehm, Hans" <hans.boehm@...> wrote:
> > -----Original Message-----
> > From:  Ivan Maidanski
> > Sent: Tuesday, November 25, 2008 11:57 AM
> > To: gc@...
> > Subject: [Gc] Memory leak in GC_allochblk_nth
> >
> > Hi!
> >
> > ...
> > My questions:
> >  1. Is it possible to fix GC_allochblk_nth() to prevent memory
> > leak for GC_malloc with all_interior_pointers mode on (even
> > if it results in "unreasonable heap growth")?
> Not that I know of.  This "leak" is normally just a consequence of allocating an object that is too large to
> fit between known "false pointers" that already "point" to unoccupied sections of the heap.  Some of these
> "false pointers" may be due to the collector not cleaning up after itself, and leaving values lying around
> that look like pointers.  By fixing these, you might be able to allocate somewhat larger objects before you
> run into the problem.

But could this "normal" behavior of the GC really lead to memory exhaustion under a 32-bit ABI
(GC_get_heap_size reaches about 2GB)? The application runs with all_interior_pointers mode on and uses
plain GC_malloc_atomic for 300K arrays holding screen images. The same application's GC_get_heap_size
floats around 10MB (and never exceeds 30MB at peaks) if all_interior_pointers mode is turned off, if
ignore_off_page allocations are used instead, or if another collector (without blacklisting) is used.

> 
> Switching to a 64-bit ABI should be a pretty complete fix.  This is really a dense address space issue.

I agree (but I can't test my above sample/demo app in 64-bit mode for now (due to external 32-bit 3rd-party
DLLs used)).

> 
> > 2. What's the difference between "normal" and
> > "ignore_off_page" versions of GC_malloc if
> > all_interior_pointers mode is off? (and what's the difference
> > between all_interior_pointers mode on and off for
> > GC_malloc_ignore_off_page?)
> Even with all_interior_pointers mode off, the collector still looks for interior pointers from the
> stack and registers, since those often result from compiler optimizations.  Pointers into
> ignore_off_page objects are ignored even if they originate from a register.

Some points to refine:

1. Pointers into the first 256 bytes of ignore_off_page objects are always guaranteed to be recognized if
they originate from a register/stack. Right?

2. If an application always keeps base pointers (without "volatile") for all allocated objects then it
should be always safe to turn all_interior_pointers mode off (regardless of compilation optimization
level). Right?

3. If "even with all_interior_pointers mode off, the collector still looks for interior pointers from the
stack and registers" is true then a tail byte should be always added regardless of all_interior_pointers
(unless DONT_ADD_BYTE_AT_END is defined).

Anyhow, I think, this behavior should be clearly specified in gc.h.

> 
> Hans

Bye.
Boehm, Hans | 4 Dec 2008 02:29

RE: Re[2]: Memory leak in GC_allochblk_nth


> -----Original Message-----
> From: gc-bounces@...
> [mailto:gc-bounces@...] On Behalf Of Ivan Maidanski
> Sent: Tuesday, December 02, 2008 4:25 AM
> To: gc@...
> Subject: Re[2]: [Gc] Memory leak in GC_allochblk_nth
>
> Hi!
>
> "Boehm, Hans" <hans.boehm@...> wrote:
> > > -----Original Message-----
> > > From:  Ivan Maidanski
> > > Sent: Tuesday, November 25, 2008 11:57 AM
> > > To: gc@...
> > > Subject: [Gc] Memory leak in GC_allochblk_nth
> > >
> > > Hi!
> > >
> > > ...
> > > My questions:
> > >  1. Is it possible to fix GC_allochblk_nth() to prevent memory leak
> > > for GC_malloc with all_interior_pointers mode on (even if it results
> > > in "unreasonable heap growth")?
> > Not that I know of.  This "leak" is normally just a
> > consequence of allocating an object that is too large to fit
> > between known "false pointers" that already "point" to
> > unoccupied sections of the heap.  Some of these "false
> > pointers" may be due to the collector not cleaning up after
> > itself, and leaving values lying around that look like
> > pointers.  By fixing these, you might be able to allocate
> > somewhat larger objects before you run into the problem.
>
> But could this "normal" behavior of the GC really lead to
> memory exhaustion under a 32-bit ABI (GC_get_heap_size reaches
> about 2GB)? The application runs with all_interior_pointers
> mode on and uses plain GC_malloc_atomic for 300K arrays
> holding screen images. The same application's
> GC_get_heap_size floats around 10MB (and never exceeds 30MB
> at peaks) if all_interior_pointers mode is turned off, if
> ignore_off_page allocations are used instead, or if another
> collector (without blacklisting) is used.
Possibly.  It might be interesting to look at a GC_dump snapshot and the gc log.  If the heap is growing over
time, collections just aren't reclaiming much, and the heap consists primarily of those 300K objects,
then I suspect this is somewhat unavoidable.  You are presumably allocating images with
GC_MALLOC_ATOMIC or the like?  With that, you'll of course get lots of spurious "pointers" in the images
themselves, especially if they're compressed.

>
> >
> > Switching to a 64-bit ABI should be a pretty complete fix.
> > This is really a dense address space issue.
>
> I agree (but I can't test my above sample/demo app in 64-bit
> mode for now (due to external 32-bit 3rd-party DLLs used)).
>
> >
> > > 2. What's the difference between "normal" and "ignore_off_page"
> > > versions of GC_malloc if all_interior_pointers mode is off? (and
> > > what's the difference between all_interior_pointers mode on and off
> > > for GC_malloc_ignore_off_page?)
> > Even with all_interior_pointers mode off, the collector
> > still looks for interior pointers from the stack and
> > registers, since those often result from compiler
> > optimizations.  Pointers into ignore_off_page objects are
> > ignored even if they originate from a register.
>
> Some points to refine:
>
> 1. Pointers into the first 256 bytes of ignore_off_page
> objects are always guaranteed to be recognized if they
> originate from a register/stack. Right?
Correct.  Really it's pointers to the first heap chunk, but we don't want to export its real value.
>
> 2. If an application always keeps base pointers (without
> "volatile") for all allocated objects then it should be
> always safe to turn all_interior_pointers mode off
> (regardless of compilation optimization level). Right?
Yes.  That's the idea.  Or if it explicitly registers offsets that it uses.
>
> 3. If "even with all_interior_pointers mode off, the
> collector still looks for interior pointers from the stack
> and registers" is true then a tail byte should be always
> added regardless of all_interior_pointers (unless
> DONT_ADD_BYTE_AT_END is defined).
I believe it currently does not do that.  We effectively assume that compiler-generated derived pointers
will point inside the object.  We do make assumptions about the compiler not hiding pointers.
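
The answers to points 1 to 3 can be condensed into a small predicate. This is only a model of what is said in this thread, not code from the collector: recognition_guaranteed, ptr_source and GUARANTEED_PREFIX are hypothetical names, the 256-byte figure is the guaranteed prefix quoted above (the real limit is the first heap chunk), the extra tail byte is assumed to exist only with all_interior_pointers on, and explicitly registered offsets are not modelled.

```c
#include <stddef.h>

typedef enum { FROM_STACK_OR_REGISTER, FROM_HEAP } ptr_source;

#define GUARANTEED_PREFIX 256u  /* bytes; the figure quoted in this thread */

/* Returns 1 if, per the rules stated in this thread, a pointer at
   base + offset is guaranteed to keep an object of `size` bytes alive.
   A result of 0 means "not guaranteed", not "certainly ignored". */
static int recognition_guaranteed(size_t offset, size_t size, ptr_source src,
                                  int all_interior_pointers,
                                  int ignore_off_page)
{
    /* Assumption: with all_interior_pointers on, one extra byte is added,
       so a one-past-the-end pointer still lands inside the allocation. */
    size_t usable = size + (all_interior_pointers ? 1 : 0);

    if (offset >= usable)
        return 0;                      /* outside the allocation */
    if (offset == 0)
        return 1;                      /* base pointers always count */
    if (ignore_off_page && offset >= GUARANTEED_PREFIX)
        return 0;                      /* past the guaranteed prefix */
    if (src == FROM_STACK_OR_REGISTER)
        return 1;                      /* interior pointers from roots */
    return all_interior_pointers != 0; /* interior pointers from the heap */
}
```

For example, a register pointer 4096 bytes into a 300K ignore_off_page object is not guaranteed to retain it, while the same pointer into a normal object is.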
>
> Anyhow, I think, this behavior should be clearly specified in gc.h.
Agreed :-)
>
> >
> > Hans
>
> Bye.
>
Boehm, Hans | 4 Dec 2008 02:48

RE: Re[2]: I_DONT_HOLD_LOCK removal in GC_print_obj and friends

Thanks.  I committed this, minus the undo of the previous patch (which I hadn't applied).

I agree it's not elegant.  But it does fix a bug, though in a rarely used configuration.  And it doesn't make the
code harder to read.

Hans

> -----Original Message-----
> From: gc-bounces@...
> [mailto:gc-bounces@...] On Behalf Of Ivan Maidanski
> Sent: Tuesday, November 25, 2008 2:29 AM
> To: gc@...
> Subject: Re[2]: [Gc] I_DONT_HOLD_LOCK removal in GC_print_obj
> and friends
>
> Hi!
>
> "Boehm, Hans" <hans.boehm@...> wrote:
> > This doesn't look correct to me, since GC_print_callers
> > acquires the allocator lock, and on some platforms does all
> > sorts of nasty stuff that might result in allocation with
> > REDIRECT_MALLOC.  Is there a way to avoid acquiring the lock
> > for this call?  I haven't had a chance to look in detail.
> >
> > Hans
>
> Here is the solution (at least, a temporary one). I think you won't
> like it either but it's working (other calls to
> GC_print_heap_obj are safe).
>
> >
> > > -----Original Message-----
> > > From: gc-bounces@...
> > > [mailto:gc-bounces@...] On Behalf Of Ivan Maidanski
> > > Sent: Monday, October 27, 2008 8:49 AM
> > > To: gc@...
> > > Subject: [Gc] I_DONT_HOLD_LOCK removal in GC_print_obj and friends
> > >
> > > Hi!
> > >
> > > I_DONT_HOLD_LOCK() assertion is violated in
> > > GC_debug_print_heap_obj_proc() (called indirectly from
> > > GC_maybe_gc()) when test.c and gclib are compiled for Win32 with
> > > -DALL_INTERIOR_POINTERS -DGC_THREADS -DGC_ASSERTIONS
> > > -DPRINT_BLACK_LIST -DDBG_HDRS_ALL.
> > >
> > > This assertion can't be replaced with I_HOLD_LOCK() because it is
> > > violated too (called indirectly from GC_help_marker()) when test.c
> > > and gclib are compiled for Win32 with -DALL_INTERIOR_POINTERS
> > > -DGC_THREADS -DGC_ASSERTIONS -DPRINT_BLACK_LIST -DDBG_HDRS_ALL
> > > -DPARALLEL_MARK -DAO_ASSUME_WINDOWS98.
> > >
> > > I don't know the purpose of this assertion.
> > > Since it works ok without this assertion (and in GC_print_obj(), and
> > > GC_print_smashed_obj()), the attached patch removes these
> > > assertions.
> > >
> > > PS. The same assertion in GC_print_all_smashed_() is ok.
> > >
> > > Bye.
> > >
> > >
> >
>
> Bye.
>
>
Boehm, Hans | 4 Dec 2008 02:55

RE: volatile for GC_clear_stack

Are there compilers for which this helps?  I would expect that it's unlikely that a compiler would remove the
array.  But if it were going to, I'd expect it to still do so as long as the actual accesses are not volatile.

Unless this is known to help, I think I would rather keep the code as it is.  If the array does get eliminated, we
have a space performance, but not correctness, issue.
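
For reference, a sketch of what volatile accesses (as opposed to a volatile declaration followed by de-volatile casts) look like. This is neither the proposed patch nor GC_clear_stack: clear_scratch and SCRATCH_WORDS are hypothetical names, and the function only shows stores and loads made through a volatile-qualified lvalue.

```c
#include <stddef.h>

#define SCRATCH_WORDS 64  /* hypothetical scratch size */

/* Zeroes up to SCRATCH_WORDS words of a stack array through volatile
   accesses and returns how many words read back as zero. */
static size_t clear_scratch(size_t nwords)
{
    volatile unsigned long dummy[SCRATCH_WORDS];
    size_t n = nwords < SCRATCH_WORDS ? nwords : SCRATCH_WORDS;
    size_t zeros = 0;
    size_t i;

    for (i = 0; i < n; ++i)
        dummy[i] = 0;        /* volatile store: has to be performed */
    for (i = 0; i < n; ++i)
        if (dummy[i] == 0)   /* volatile load: has to be performed */
            ++zeros;
    return zeros;
}
```

Because every access goes through the volatile array itself, the compiler is not free to drop the stores, which is the property the question above is about.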

Hans

> -----Original Message-----
> From: gc-bounces@...
> [mailto:gc-bounces@...] On Behalf Of Ivan Maidanski
> Sent: Friday, November 21, 2008 4:28 AM
> To: gc@...
> Subject: [Gc] volatile for GC_clear_stack
>
> Hi!
>
> This small patch adds "volatile" qualifier for "dummy" arrays
> (together with de-volatile casts) in GC_clear_stack[_inner]().
>
> Bye.
>
Boehm, Hans | 5 Dec 2008 01:40

RE: Small fix for Win32 GC_push_stack_for

Finally getting back to this issue ...

I applied this to my tree, and will check it in.  Thanks.

Hans

> -----Original Message-----
> From: gc-bounces@...
> [mailto:gc-bounces@...] On Behalf Of Ivan Maidanski
> Sent: Friday, October 24, 2008 4:36 AM
> To: gc@...
> Subject: [Gc] Small fix for Win32 GC_push_stack_for
>
> Hi!
>
> This patch removes "Thread stack pointer out of range"
> warning for the stopped thread with sp just got decremented
> but not dereferenced yet (i.e. sp points to the guard page of
> the thread's stack, and, thus, computed stack_min is greater than sp).
>
> Bye.
>
>
Boehm, Hans | 5 Dec 2008 02:45

RE: Re[2]: Back to "GC Stack problem on Win32"

How about something like this at the end of GC_push_stack_for?

This will still repeatedly grow the stack for something like your example.  But, as in the last version, I
think the total number of VirtualQuery calls is bounded by the final stack depth, plus one for each GC.

Unlike my last version, this should typically avoid the GC_get_stack_min calls in your example.

I'm not sure I want to introduce very ugly code to handle something like this example a bit better.  The
collector already does work in proportion to the root size.  And this seems to be another example where a
huge root size causes it to slow down.

(Warning: completely untested:)

      /* Set stack_min to the lowest address in the thread stack,       */
      /* or to an address in the thread stack no larger than sp,        */
      /* taking advantage of the old value to avoid slow traversals     */
      /* of large stacks.                                               */
      if (thread -> last_stack_min == ADDR_LIMIT) {
        stack_min = GC_get_stack_min(thread -> stack_base);
      } else {
        if (sp < thread -> stack_base && sp >= thread -> last_stack_min) {
            stack_min = sp;
        } else {
#         ifdef MSWINCE
            stack_min = GC_get_stack_min(thread -> stack_base);
#         else
            if (GC_may_be_in_stack(thread -> last_stack_min)) {
              stack_min = GC_get_stack_min(thread -> last_stack_min);
            } else {
              /* Stack shrunk?  Is this possible? */
              stack_min = GC_get_stack_min(thread -> stack_base);
            }
#         endif
        }
      }
      thread -> last_stack_min = stack_min;

      if (sp >= stack_min && sp < thread->stack_base) {
#       ifdef DEBUG_THREADS
          GC_printf("Pushing thread from %p to %p for 0x%x from 0x%x\n",
                    sp, thread -> stack_base, (int)thread -> id, (int)me);
#       endif
        GC_push_all_stack(sp, thread->stack_base);
      } else {
        /* If not current thread then it is possible for sp to point to */
        /* the guarded (untouched yet) page just below the current      */
        /* stack_min of the thread.                                     */
        if (thread -> id == me || sp >= thread->stack_base
                || sp + GC_page_size < stack_min)
          WARN("Thread stack pointer 0x%lx out of range, pushing everything\n",
                (unsigned long)(size_t)sp);
        GC_push_all_stack(stack_min, thread->stack_base);
      }
    } /* thread looks live */
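
The stack_min selection in the draft above can be modelled as a small function that counts simulated queries. Everything here is an assumption made for illustration: thread_model and the three functions are hypothetical stand-ins (not the Win32 code), one query per 4K page is assumed, the MSWINCE branch is left out, and the last_stack_min assignment simply mirrors the untested draft.

```c
#include <stddef.h>
#include <stdint.h>

#define ADDR_LIMIT UINTPTR_MAX   /* stand-in for "not computed yet" */

typedef struct {
    uintptr_t stack_base;      /* one past the highest stack address */
    uintptr_t committed_low;   /* lowest committed stack address */
    size_t page_size;
    uintptr_t last_stack_min;  /* cached value, ADDR_LIMIT if unknown */
    unsigned long queries;     /* simulated VirtualQuery calls */
} thread_model;

/* One simulated query: is addr inside the committed stack? */
static int may_be_in_stack(thread_model *t, uintptr_t addr)
{
    t->queries++;
    return addr >= t->committed_low && addr < t->stack_base;
}

/* Walk down from start (assumed >= committed_low), one query per page. */
static uintptr_t get_stack_min(thread_model *t, uintptr_t start)
{
    uintptr_t bottom = start;
    while (bottom > t->committed_low) {
        uintptr_t step = bottom - t->committed_low;
        if (step > t->page_size)
            step = t->page_size;
        bottom -= step;
        t->queries++;
    }
    t->queries++;  /* the query that finally falls off the stack */
    return bottom;
}

/* Same branch structure as the draft, minus the MSWINCE case. */
static uintptr_t choose_stack_min(thread_model *t, uintptr_t sp)
{
    uintptr_t stack_min;
    if (t->last_stack_min == ADDR_LIMIT) {
        stack_min = get_stack_min(t, t->stack_base);
    } else if (sp < t->stack_base && sp >= t->last_stack_min) {
        stack_min = sp;                       /* no queries at all */
    } else if (may_be_in_stack(t, t->last_stack_min)) {
        stack_min = get_stack_min(t, t->last_stack_min);
    } else {
        stack_min = get_stack_min(t, t->stack_base);
    }
    t->last_stack_min = stack_min;
    return stack_min;
}
```

In this model a collection with an unchanged stack costs no queries, while the first collection after the stack has grown by 4MB still costs more than 1024 of them.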

Hans

> -----Original Message-----
> From: gc-bounces@...
> [mailto:gc-bounces@...] On Behalf Of Ivan Maidanski
> Sent: Thursday, November 13, 2008 3:08 PM
> To: gc@...
> Subject: Re[2]: [Gc] Back to "GC Stack problem on Win32"
>
> Hi!
>
> "Boehm, Hans" <hans.boehm@...> wrote:
> > Ivan -
> >
> > Thanks.  However, I don't think I understand the problem
> > here correctly. The old code should in nearly all cases only
> > invoke GC_may_be_in_stack(thread -> last_stack_min).  This
> > should be cheap, since it only walks a page or so of the stack, right?
>
> Right, but only if the stack hasn't grown too much between
> collections.
> By default, Win32 apps have StackCommit==4K. So, if the stack
> grew by 4MB then VirtualQuery() would be called about 1000 times
> during the next collection. But for GC_push_stack_for() only
> one VirtualQuery() is really required - just to check the sp value.
>
> >
> > It seems like it would usually result in exactly one extra
> > VirtualQuery call, as it walks off the end of the stack.
> > Since GC_may_be_in_stack() makes one call anyway, it seems to
> > me that it should at most double the time spent there.
>
> Try this test app:
>
> #include "gc.h"
>
> void GC_printf();
>
> int f(int n) {
>  return n > 0 ? f(n - 1) + 1 : 0;
> }
>
> void test(char c, int n) {
>  n = f(n);
>  GC_printf("\n Test%c: N= %d\n\n", c, n);
>  GC_gcollect();
> }
>
> void *obj;
>
> int main(void) {
>  int n;
>  int max = 9 * 1000 * 1000;
>  obj = GC_MALLOC(16);
>  for (n = 100 * 1000; n <= max; n += n >> 1) {
>   test('A',n);
>   test('B',n);
>  }
>  GC_printf("Done");
>  return 0;
> } // end
>
> Compile it with -fno-optimize-sibling-calls -Xlinker --stack
> -Xlinker 0x10000000. Set GC_PRINT_STATS=1 to see the
> world-stopped delay timings.
>
> If You use Your code or just comment out one line "if (sp <
> stack_min || sp >= thread->stack_base)" in
> GC_push_stack_for() then you see how much time is required to
> collect just a few bytes.
>
> >
> > I don't like the reference to last_info in the patch, since
> > that relies on side effects of GC_may_be_in_stack that I
> > would like to keep as a private implementation detail of
> > GC_may_be_in_stack and GC_get_stack_min.  Is there a reason
> > not to use thread -> last_stack_min instead of last_info.BaseAddress?
>
> I don't like it either. But to tell the truth, "caching" here
> really works only to pass BaseAddress from
> GC_may_be_in_stack() to GC_get_stack_min(), even in the case of
> one thread. Another solution (without "caching") is to make
> GC_may_be_in_stack() return BaseAddress or NULL (instead of
> bool) - simple to change (but the function should perhaps be renamed).
>
> >
> > Hans
> >
> > > ...
> > > I've already pointed out some bugs.
> > > Now I've just run the same test (having Your patch applied) as I had
> > > run for my patch... And it turns out that Yours doesn't solve the
> > > problem as it was originally stated (to say more precisely, it
> > > doesn't reduce time for the first collection after stack growth).
> > >
> > > Look into the things You had advised me before:
> > > > 1) Initially call VirtualQuery on the sp.  If the stack
> > > > base is in the same region, we know we're OK, and don't need
> > > > GC_get_stack_min.  Hopefully this will be true about 100% of the time.
> > >
> > > So I did it for Your code now. It works.
> > >
> > > The patch is attached.
> > >
> > > Bye.
>
> Bye.
>
>
Hans Boehm | 5 Dec 2008 07:04

RE: Re[2]: Back to "GC Stack problem on Win32"

Ignore the fact that this doesn't update last_stack_min correctly.
I'll fix that part of the logic.

Hans

