Andrew Morton | 1 May 2009 02:45

Re: [PATCH] vmscan: evict use-once pages first (v2)

On Thu, 30 Apr 2009 00:20:58 -0700
Elladan <elladan <at> eskimo.com> wrote:

> > Elladan, does this smaller patch still work as expected?
> 
> Rik, since the third patch doesn't work on 2.6.28 (without disabling a lot of
> code), I went ahead and tested this patch.
> 
> The system does seem relatively responsive with this patch for the most part,
> with occasional lag.  I don't see much evidence at least over the course of a
> few minutes that it pages out applications significantly.  It seems about
> equivalent to the first patch.
> 
> Given Andrew Morton's request that I track the Mapped: field in /proc/meminfo,
> I went ahead and did that with this patch built into a kernel.  Compared to the
> standard Ubuntu kernel, this patch keeps significantly more Mapped memory
> around, and it shrinks at a slower rate after the test runs for a while.
> Eventually, it seems to reach a steady state.
> 
> For example, with your patch, Mapped will often go for 30 seconds without
> changing significantly.  Without your patch, it continuously lost about
> 500-1000K every 5 seconds, and then jumped up again significantly when I
> touched Firefox or other applications.  I do see some of that behavior with
> your patch too, but it's much less significant.

Were you able to tell whether altering /proc/sys/vm/swappiness appropriately
regulated the rate at which the mapped page count decreased?

Thanks.
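
The Mapped: tracking described in the quoted report can be reproduced with a small reader for /proc/meminfo. The following is a minimal sketch, not the tool Elladan used; the function names meminfo_mapped_kb() and read_mapped_kb() are hypothetical, and it assumes the usual "Mapped:   <n> kB" line format.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the value of the "Mapped:" field, in kB, from a buffer holding
 * /proc/meminfo-style text, or -1 if no such line is found.
 * (Hypothetical helper name; assumes the "Mapped:   <n> kB" layout.)
 */
long meminfo_mapped_kb(const char *text)
{
    const char *line = text;

    while (line && *line) {
        long kb;

        /* Only a line that starts with "Mapped:" can match here. */
        if (sscanf(line, "Mapped: %ld kB", &kb) == 1)
            return kb;

        line = strchr(line, '\n');
        if (line)
            line++;    /* step past the newline to the next line */
    }
    return -1;
}

/* Read /proc/meminfo once and return Mapped in kB, or -1 on error. */
long read_mapped_kb(void)
{
    char buf[8192];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_mapped_kb(buf);
}
```

Calling read_mapped_kb() every 5 seconds and printing the difference from the previous sample would give the kind of "lost 500-1000K every 5 seconds" figure quoted above.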


Rik van Riel | 1 May 2009 02:59

Re: [PATCH] vmscan: evict use-once pages first (v2)

On Thu, 30 Apr 2009 17:45:36 -0700
Andrew Morton <akpm <at> linux-foundation.org> wrote:

> Were you able to tell whether altering /proc/sys/vm/swappiness
> appropriately regulated the rate at which the mapped page count
> decreased?

That should not make a difference at all for mapped file
pages, after the change was merged that makes the VM ignore
the referenced bit of mapped active file pages.

Ever since the split LRU code was merged, all that the
swappiness controls is the aggressiveness of file vs
anonymous LRU scanning.
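
To illustrate why swappiness cannot protect mapped file pages specifically, here is a simplified sketch of a swappiness-weighted split of scan pressure between the anonymous and file LRUs. It is an assumption-laden model, not the kernel's code: the names scan_split and swappiness_split() are hypothetical, the 200-point scale is assumed, and the recent-reclaim feedback that the real calculation also uses is left out.

```c
/* Hypothetical result type: share of scan pressure per LRU set. */
struct scan_split {
    unsigned int anon_percent;
    unsigned int file_percent;
};

/*
 * Simplified model: swappiness (0..100) is the anon priority and
 * (200 - swappiness) is the file priority. Nothing here looks at
 * whether a file page is mapped, so the knob cannot single out
 * mapped file pages for protection.
 */
struct scan_split swappiness_split(unsigned int swappiness)
{
    struct scan_split s;
    unsigned int anon_prio, file_prio;

    if (swappiness > 100)
        swappiness = 100;    /* clamp to the documented range */

    anon_prio = swappiness;
    file_prio = 200 - swappiness;

    s.anon_percent = 100 * anon_prio / (anon_prio + file_prio);
    s.file_percent = 100 - s.anon_percent;
    return s;
}
```

In this model swappiness=60 sends 30% of the scanning to the anon lists and 70% to the file lists, and swappiness=0 sends everything to the file lists; in both cases mapped and unmapped file pages share the same lists.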

Currently the kernel has no effective code to protect the 
page cache working set from streaming IO.  Elladan's bug
report shows that we do need some kind of protection...

-- 
All rights reversed.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo <at> kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont <at> kvack.org"> email <at> kvack.org </a>

Lee Schermerhorn | 1 May 2009 03:14

Re: [BUG] 2.6.30-rc3-mmotm-090428-1814 -- bogus pointer deref

On Thu, 2009-04-30 at 12:31 +0100, Mel Gorman wrote:
> On Wed, Apr 29, 2009 at 04:34:59PM -0400, Lee Schermerhorn wrote:
> > I'm seeing this on an ia64 platform--HP rx8640--running the numactl
> > package regression test.  On ia64 a "NaT Consumption" [NaT = "not a
> > thing"] usually means a bogus pointer.  I verified that it also occurs
> > on 2.6.30-rc3-mmotm-090424-1814.  The regression test runs to completion
> > on a 4-node x86_64 platform for both the 04/27 and 04/28 mmotm kernels.
> > 
> > The bug occurs right after the test suite issues the message:
> > 
> > "testing numactl --interleave=all memhog 15728640"
> > 
> > -------------------------------
> > Console log:
> > 
> > numactl[7821]: NaT consumption 2216203124768 [2]
> > Modules linked in: ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc vfat fat dm_mirror dm_multipath scsi_dh pci_slot parport_pc lp parport sg sr_mod cdrom button e1000 tg3 libphy dm_region_hash dm_log dm_mod sym53c8xx mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
> > 
> > Pid: 7821, CPU 25, comm:              numactl
> > psr : 0000121008022038 ifs : 8000000000000004 ip  : [<a00000010014ec91>]    Not tainted (2.6.30-rc3-mmotm-090428-1631)
> > ip is at next_zones_zonelist+0x31/0x120
<snip>
> > 
> > I'll try to bisect to specific patch--probably tomorrow.

Mel:  I think you can rest easy.  I've duplicated the problem with a
kernel that truncates the mmotm 04/28 series just before your patches.
[...]

Andrew Morton | 1 May 2009 03:13

Re: [PATCH] vmscan: evict use-once pages first (v2)

On Thu, 30 Apr 2009 20:59:36 -0400
Rik van Riel <riel <at> redhat.com> wrote:

> On Thu, 30 Apr 2009 17:45:36 -0700
> Andrew Morton <akpm <at> linux-foundation.org> wrote:
> 
> > Were you able to tell whether altering /proc/sys/vm/swappiness
> > appropriately regulated the rate at which the mapped page count
> > decreased?
> 
> That should not make a difference at all for mapped file
> pages, after the change was merged that makes the VM ignore
> the referenced bit of mapped active file pages.
> 
> Ever since the split LRU code was merged, all that the
> swappiness controls is the aggressiveness of file vs
> anonymous LRU scanning.

Which would cause exactly the problem Elladan saw?

> Currently the kernel has no effective code to protect the 
> page cache working set from streaming IO.  Elladan's bug
> report shows that we do need some kind of protection...

Seems to me that reclaim should treat swapcache-backed mapped pages in
a similar fashion to file-backed mapped pages?


Wu Fengguang | 1 May 2009 03:22

Re: [patch 20/22] vmscan: avoid multiplication overflow in shrink_zone()

On Fri, May 01, 2009 at 06:08:55AM +0800, Andrew Morton wrote:
> 
> Local variable `scan' can overflow on zones which are larger than
> 
> 	(2G * 4k) / 100 = 80GB.
> 
> Making it 64-bit on 64-bit will fix that up.
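
The 80GB figure can be checked with a short calculation. This is a sketch under stated assumptions: that `scan` is a signed 32-bit page count, that it is multiplied by a percentage of at most 100 before being divided by 100, and that pages are 4k. The helper names below are hypothetical and are not taken from mm/vmscan.c.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ULL    /* assumed 4k pages, as in the message */

/* Largest page count for which pages * percent still fits in a signed 32-bit int. */
int32_t max_pages_without_overflow(int32_t percent)
{
    return INT32_MAX / percent;
}

/* Zone size in bytes above which the 32-bit product can overflow. */
uint64_t overflow_threshold_bytes(int32_t percent)
{
    return (uint64_t)max_pages_without_overflow(percent) * PAGE_SIZE_BYTES;
}

/* The same percentage scaling done in 64-bit arithmetic, which has ample headroom. */
uint64_t scale_pages_64(uint64_t pages, uint64_t percent)
{
    return pages * percent / 100;
}
```

With percent = 100 the threshold works out to 21474836 pages, or a little under 82 GiB, which is the roughly 80GB limit quoted above.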

A side note about the "one HUGE scan inside shrink_zone":

Isn't this low-level scan granularity far too large?

It makes things a lot worse under memory pressure:
- over-reclaim, partly worked around by Rik's early bail-out patch
- the throttle_vm_writeout()/congestion_wait() guards fire so
  sparsely that they become useless: imagine stopping to wait only
  after shooting away every 1GB of memory.

The long term fix could be to move the granularity control up to the
shrink_zones() level: there it can bail out early without hurting the
balanced zone aging.

Thanks,
Fengguang

> --- a/mm/vmscan.c~vmscan-avoid-multiplication-overflow-in-shrink_zone
> +++ a/mm/vmscan.c
> @@ -1471,7 +1471,7 @@ static void shrink_zone(int priority, st
>  
>  	for_each_evictable_lru(l) {
[...]

Rik van Riel | 1 May 2009 03:50

Re: [PATCH] vmscan: evict use-once pages first (v2)

On Thu, 30 Apr 2009 18:13:40 -0700
Andrew Morton <akpm <at> linux-foundation.org> wrote:

> On Thu, 30 Apr 2009 20:59:36 -0400
> Rik van Riel <riel <at> redhat.com> wrote:
> 
> > On Thu, 30 Apr 2009 17:45:36 -0700
> > Andrew Morton <akpm <at> linux-foundation.org> wrote:
> > 
> > > Were you able to tell whether altering /proc/sys/vm/swappiness
> > > appropriately regulated the rate at which the mapped page count
> > > decreased?
> > 
> > That should not make a difference at all for mapped file
> > pages, after the change was merged that makes the VM ignore
> > the referenced bit of mapped active file pages.
> > 
> > Ever since the split LRU code was merged, all that the
> > swappiness controls is the aggressiveness of file vs
> > anonymous LRU scanning.
> 
> Which would cause exactly the problem Elladan saw?

Yes.  It was not noticeable in the initial split LRU code,
but after we decided to ignore the referenced bit on active
file pages and deactivate pages regardless, it has gotten
worse.

That change was very good for scalability, so we should not
undo it.  However, we do need to put something in place to
[...]

Andrew Morton | 1 May 2009 04:49

Re: [patch 20/22] vmscan: avoid multiplication overflow in shrink_zone()

On Fri, 1 May 2009 09:22:12 +0800 Wu Fengguang <fengguang.wu <at> intel.com> wrote:

> On Fri, May 01, 2009 at 06:08:55AM +0800, Andrew Morton wrote:
> > 
> > Local variable `scan' can overflow on zones which are larger than
> > 
> > 	(2G * 4k) / 100 = 80GB.
> > 
> > Making it 64-bit on 64-bit will fix that up.
> 
> A side note about the "one HUGE scan inside shrink_zone":
> 
> Isn't this low-level scan granularity far too large?
> 
> It makes things a lot worse under memory pressure:
> - over-reclaim, partly worked around by Rik's early bail-out patch
> - the throttle_vm_writeout()/congestion_wait() guards fire so
>   sparsely that they become useless: imagine stopping to wait only
>   after shooting away every 1GB of memory.
> 
> The long term fix could be to move the granularity control up to the
> shrink_zones() level: there it can bail out early without hurting the
> balanced zone aging.
> 

I guess it could be bad in some circumstances.  Normally we'll bail out
way early because (nr_reclaimed > swap_cluster_max) comes true.  If it
_doesn't_ come true, we have little choice but to keep scanning.
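
The early bail-out described here can be pictured with a toy model. This is only a sketch of the control flow, not kernel code: toy_shrink(), its parameters, and the fixed per-batch reclaim yield are all hypothetical, with reclaim_goal standing in for swap_cluster_max.

```c
/* Hypothetical result of one toy reclaim pass. */
struct toy_scan_result {
    unsigned long scanned;
    unsigned long reclaimed;
};

/*
 * Scan nr_to_scan pages in batches. Each batch is assumed to free
 * reclaim_per_batch pages (capped at the batch size). Stop as soon as
 * more than reclaim_goal pages have been reclaimed; if that never
 * happens, the whole range gets scanned.
 */
struct toy_scan_result toy_shrink(unsigned long nr_to_scan,
                                  unsigned long batch,
                                  unsigned long reclaim_per_batch,
                                  unsigned long reclaim_goal)
{
    struct toy_scan_result r = { 0, 0 };

    if (batch == 0)
        return r;    /* nothing sensible to do with a zero batch */

    while (nr_to_scan > 0) {
        unsigned long n = nr_to_scan < batch ? nr_to_scan : batch;

        r.scanned += n;
        nr_to_scan -= n;
        r.reclaimed += reclaim_per_batch < n ? reclaim_per_batch : n;

        if (r.reclaimed > reclaim_goal)
            break;    /* early bail-out: enough has been freed */
    }
    return r;
}
```

When every batch frees something, the loop exits after a handful of batches; when nothing can be freed, the loop has no exit other than exhausting nr_to_scan, which is the "little choice but to keep scanning" case.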

The code is mystifying:
[...]

Andrew Morton | 1 May 2009 04:54

Re: [PATCH] vmscan: evict use-once pages first (v2)

On Thu, 30 Apr 2009 21:50:34 -0400 Rik van Riel <riel <at> redhat.com> wrote:

> > Which would cause exactly the problem Elladan saw?
> 
> Yes.  It was not noticeable in the initial split LRU code,
> but after we decided to ignore the referenced bit on active
> file pages and deactivate pages regardless, it has gotten
> worse.
> 
> That change was very good for scalability, so we should not
> undo it.  However, we do need to put something in place to
> protect the working set from streaming IO.
> 
> > > Currently the kernel has no effective code to protect the 
> > > page cache working set from streaming IO.  Elladan's bug
> > > report shows that we do need some kind of protection...
> > 
> > Seems to me that reclaim should treat swapcache-backed mapped pages in
> > a similar fashion to file-backed mapped pages?
> 
> Swapcache-backed pages are not on the same set of LRUs as
> file-backed mapped pages.

yup.

> Furthermore, there is no streaming IO on the anon LRUs like
> there is on the file LRUs. Only the file LRUs need (and want)
> use-once replacement, which means that we only need special
> protection of the working set for file-backed pages.

[...]

Elladan | 1 May 2009 05:09

Re: [PATCH] vmscan: evict use-once pages first (v2)

On Thu, Apr 30, 2009 at 05:45:36PM -0700, Andrew Morton wrote:
> On Thu, 30 Apr 2009 00:20:58 -0700
> Elladan <elladan <at> eskimo.com> wrote:
> 
> > > Elladan, does this smaller patch still work as expected?
> > 
> > Rik, since the third patch doesn't work on 2.6.28 (without disabling a lot of
> > code), I went ahead and tested this patch.
> > 
> > The system does seem relatively responsive with this patch for the most part,
> > with occasional lag.  I don't see much evidence at least over the course of a
> > few minutes that it pages out applications significantly.  It seems about
> > equivalent to the first patch.
> > 
> > Given Andrew Morton's request that I track the Mapped: field in /proc/meminfo,
> > I went ahead and did that with this patch built into a kernel.  Compared to the
> > standard Ubuntu kernel, this patch keeps significantly more Mapped memory
> > around, and it shrinks at a slower rate after the test runs for a while.
> > Eventually, it seems to reach a steady state.
> > 
> > For example, with your patch, Mapped will often go for 30 seconds without
> > changing significantly.  Without your patch, it continuously lost about
> > 500-1000K every 5 seconds, and then jumped up again significantly when I
> > touched Firefox or other applications.  I do see some of that behavior with
> > your patch too, but it's much less significant.
> 
> Were you able to tell whether altering /proc/sys/vm/swappiness appropriately
> regulated the rate at which the mapped page count decreased?

I don't believe so.  I tested with swappiness=0 and =60, and in each case the
[...]

Magnus Damm | 1 May 2009 05:26

Re: [PATCH] videobuf-dma-contig: zero copy USERPTR support V2

On Tue, Apr 28, 2009 at 6:01 PM, Magnus Damm <magnus.damm <at> gmail.com> wrote:
> This is V2 of the V4L2 videobuf-dma-contig USERPTR zero copy patch.

I guess the V4L2 specific bits are pretty simple.

As for the minor mm modifications below,

> --- 0001/mm/memory.c
> +++ work/mm/memory.c    2009-04-28 14:56:43.000000000 +0900
> @@ -3009,7 +3009,6 @@ int in_gate_area_no_task(unsigned long a
>
>  #endif /* __HAVE_ARCH_GATE_AREA */
>
> -#ifdef CONFIG_HAVE_IOREMAP_PROT
>  int follow_phys(struct vm_area_struct *vma,
>                unsigned long address, unsigned int flags,
>                unsigned long *prot, resource_size_t *phys)

Is it ok with the memory management guys to always build follow_phys()?

> @@ -3063,7 +3062,9 @@ unlock:
>  out:
>        return ret;
>  }
> +EXPORT_SYMBOL(follow_phys);
>
> +#ifdef CONFIG_HAVE_IOREMAP_PROT
>  int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
>                        void *buf, int len, int write)
>  {
[...]

