Andrew Morton | 1 Dec 02:31 2006

Re: [PATCH] Add __GFP_MOVABLE for callers to flag allocations that may be migrated

On Thu, 30 Nov 2006 17:07:46 +0000
mel <at> skynet.ie (Mel Gorman) wrote:

> Am reporting this patch after there were no further comments on the last
> version.

Am not sure what to do with it - nothing actually uses __GFP_MOVABLE.

> It is often known at allocation time when a page may be migrated or not.

"often", yes.

> This
> patch adds the flags __GFP_MOVABLE and GFP_HIGH_MOVABLE. Allocations using
> __GFP_MOVABLE can be either migrated using the page migration mechanism
> or reclaimed by syncing with backing storage and discarding.
> 
> Additional credit goes to Christoph Lameter and Linus Torvalds for shaping
> the concept. Credit to Hugh Dickins for catching issues with shmem swap
> vector and ramfs allocations.
>
> ...
> 
> @@ -65,7 +65,7 @@ static inline void clear_user_highpage(s
>  static inline struct page *
>  alloc_zeroed_user_highpage(struct vm_area_struct *vma, unsigned long vaddr)
>  {
> -	struct page *page = alloc_page_vma(GFP_HIGHUSER, vma, vaddr);
> +	struct page *page = alloc_page_vma(GFP_HIGH_MOVABLE, vma, vaddr);
>  

Mel Gorman | 1 Dec 10:54 2006

Re: [PATCH] Add __GFP_MOVABLE for callers to flag allocations that may be migrated

On Thu, 30 Nov 2006, Andrew Morton wrote:

> On Thu, 30 Nov 2006 17:07:46 +0000
> mel <at> skynet.ie (Mel Gorman) wrote:
>
>> Am reporting this patch after there were no further comments on the last
>> version.
>
> Am not sure what to do with it - nothing actually uses __GFP_MOVABLE.
>

Nothing yet. To begin with, this is just a documentation mechanism. I'll 
be trying to push page clustering one piece at a time which will need 
this. The markings may also be of interest to containers and to pagesets 
because it will clearly flag what are allocations in use by userspace.

>> It is often known at allocation time when a page may be migrated or not.
>
> "often", yes.
>
>> This
>> patch adds the flags __GFP_MOVABLE and GFP_HIGH_MOVABLE. Allocations using
>> __GFP_MOVABLE can be either migrated using the page migration mechanism
>> or reclaimed by syncing with backing storage and discarding.
>>
>> Additional credit goes to Christoph Lameter and Linus Torvalds for shaping
>> the concept. Credit to Hugh Dickins for catching issues with shmem swap
>> vector and ramfs allocations.
>>
>> ...

Aubrey | 1 Dec 11:00 2006

Re: The VFS cache is not freed when there is not enough free memory to allocate

On 12/1/06, Nick Piggin <nickpiggin <at> yahoo.com.au> wrote:
>
> The pattern you are seeing here is probably due to the page allocator
> always retrying process context allocations which are <= order 3 (32K
> with 4K pages).
>
> You might be able to increase this limit a bit for your system, but it
> could easily cause problems. Especially fragmentation on nommu systems
> where the anonymous memory cannot be paged out.

Thanks for the hint. I found that increasing this limit really does help
my test cases.
When MemFree < 8M and the test case requests 1M eight times, the
allocation succeeds after 81 rebalance passes :). So far I
haven't found any issues.

If I make a patch to make this parameter tunable through the proc
filesystem in the nommu case, would that be acceptable?

Thanks,
-Aubrey
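
For reference, an allocation "order" is a power-of-two page count, so the size of an order-n request follows directly from the page size. A minimal sketch of that arithmetic, assuming 4K pages; `order_to_bytes()` and `bytes_to_order()` are illustrative helper names, not kernel functions:

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12                        /* assumption: 4K pages */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Bytes covered by one order-n allocation: 2^n contiguous pages. */
size_t order_to_bytes(unsigned int order)
{
	return SKETCH_PAGE_SIZE << order;
}

/*
 * Smallest order whose allocation covers at least `bytes` bytes.
 * Intended for small sizes only; no overflow handling in this sketch.
 */
unsigned int bytes_to_order(size_t bytes)
{
	unsigned int order = 0;

	while (order_to_bytes(order) < bytes)
		order++;
	return order;
}
```

Under these assumptions an order-3 request covers 8 pages, and a 1M request such as the one in the test case needs order 8, well above the order-3 retry threshold.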

Peter Zijlstra | 1 Dec 12:28 2006

Re: [RFC][PATCH 1/6] mm: slab allocation fairness

On Thu, 2006-11-30 at 11:33 -0800, Christoph Lameter wrote:
> On Thu, 30 Nov 2006, Peter Zijlstra wrote:
> 
> > No, the forced allocation is to test the allocation hardness at that
> > point in time. I could not think of another way to test that than to
> > actually do an allocation.
> 
> Typically we do this by checking the number of free pages in a zone 
> compared to the high low limits. See mmzone.h.

This doesn't work under high load because of direct reclaim. And if I go
run direct reclaim to test if I can raise the free pages level to an
acceptable level for the given gfp flags, I might as well do the whole
allocation.
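
The kind of watermark check Christoph refers to can be sketched roughly as below. This is a simplified user-space model, not the mmzone.h code: `struct zone_sketch` and `zone_free_above()` are hypothetical names, with fields loosely modelled on a zone's free-page counter and its min/low/high limits.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-zone counters; not the real struct zone. */
struct zone_sketch {
	unsigned long free_pages;  /* pages currently free in the zone */
	unsigned long pages_min;   /* lowest limit */
	unsigned long pages_low;   /* middle limit */
	unsigned long pages_high;  /* highest limit */
};

/*
 * Would a request for 2^order pages leave the zone above `mark`?
 * This is a pure snapshot test: it reads counters and reclaims nothing.
 */
bool zone_free_above(const struct zone_sketch *z, unsigned int order,
		     unsigned long mark)
{
	unsigned long want = 1UL << order;

	return z->free_pages >= want && z->free_pages - want > mark;
}
```

A check of this shape only looks at counters. The objection above is that under load an allocation that fails the snapshot test may still succeed after direct reclaim, so the counters alone do not say how hard the allocation would be.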

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Peter Zijlstra | 1 Dec 13:14 2006

Re: [RFC][PATCH 5/6] slab: kmem_cache_objs_to_pages()

On Thu, 2006-11-30 at 10:55 -0800, Christoph Lameter wrote:
> On Thu, 30 Nov 2006, Peter Zijlstra wrote:
> 
> > +unsigned int kmem_cache_objs_to_pages(struct kmem_cache *cachep, int nr)
> > +{
> > +	return ((nr + cachep->num - 1) / cachep->num) << cachep->gfporder;
> 
> cachep->num refers to the number of objects in a slab of gfporder.
> 
> thus
> 
> return (nr + cachep->num - 1) / cachep->num;

No, that would give the number of slabs needed, I want pages.
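
The difference between the two expressions can be shown with a small user-space sketch. `struct cache_sketch` is a hypothetical stand-in carrying only the two fields the patch reads (`num` and `gfporder`); it is not the real `struct kmem_cache`.

```c
/* Hypothetical stand-in for the two struct kmem_cache fields the patch uses. */
struct cache_sketch {
	unsigned int num;       /* objects per slab */
	unsigned int gfporder;  /* each slab spans 2^gfporder pages */
};

/* Slabs needed to hold nr objects: ceiling division by objects per slab. */
unsigned int objs_to_slabs(const struct cache_sketch *c, unsigned int nr)
{
	return (nr + c->num - 1) / c->num;
}

/* Pages needed: slab count times pages per slab, as in the quoted patch. */
unsigned int objs_to_pages(const struct cache_sketch *c, unsigned int nr)
{
	return objs_to_slabs(c, nr) << c->gfporder;
}
```

With `num = 30` and `gfporder = 1`, 31 objects need 2 slabs but 4 pages. The two results coincide only when `gfporder` is 0.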

Andrew Morton | 1 Dec 20:01 2006

Re: [PATCH] Add __GFP_MOVABLE for callers to flag allocations that may be migrated

On Fri, 1 Dec 2006 09:54:11 +0000 (GMT)
Mel Gorman <mel <at> csn.ul.ie> wrote:

> >> @@ -65,7 +65,7 @@ static inline void clear_user_highpage(s
> >>  static inline struct page *
> >>  alloc_zeroed_user_highpage(struct vm_area_struct *vma, unsigned long vaddr)
> >>  {
> >> -	struct page *page = alloc_page_vma(GFP_HIGHUSER, vma, vaddr);
> >> +	struct page *page = alloc_page_vma(GFP_HIGH_MOVABLE, vma, vaddr);
> >>
> >>  	if (page)
> >>  		clear_user_highpage(page, vaddr);
> >
> > But this change is presumptuous.  alloc_zeroed_user_highpage() doesn't know
> > that its caller is going to use the page for moveable purposes.  (Ditto lots
> > of other places in this patch).
> >
> 
> according to grep -r, alloc_zeroed_user_highpage() is only used in two 
> places, do_wp_page() (when write faulting the zero page)[1] and 
> do_anonymous_page() (when mapping the zero page for the first time and 
> writing). In these cases, they are known to be movable. What am I missing?

We shouldn't implement a function which "knows" how its callers are using
it in this manner.

You've gone and changed alloc_zeroed_user_highpage() into alloc_user_zeroed_highpage_which_you_must_use_in_an_application_where_it_is_movable().
Now, if we want to put a big fat comment over these functions saying that the caller
must honour the promise we've made on the caller's behalf then OK(ish).  But it'd
be better (albeit perhaps bloaty) to require the caller to pass in the gfp-flags.
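
The two API shapes being contrasted might look like the following user-space sketch. Everything here is hypothetical: the types, the flag values and the function names are stand-ins chosen for illustration and do not match the kernel's definitions.

```c
#include <stddef.h>

typedef unsigned int gfp_sketch_t;

/* Illustrative flag values only; they do not match the kernel's. */
#define SK_GFP_HIGHUSER 0x1u
#define SK_GFP_MOVABLE  0x2u

struct page_sketch {
	gfp_sketch_t flags;  /* records how the page was requested */
	int zeroed;
};

/* Stand-in allocator: hands out pages from a small static pool. */
struct page_sketch *alloc_page_sketch(gfp_sketch_t flags)
{
	static struct page_sketch pool[8];
	static size_t next;
	struct page_sketch *page;

	if (next == sizeof(pool) / sizeof(pool[0]))
		return NULL;
	page = &pool[next++];
	page->flags = flags;
	page->zeroed = 0;
	return page;
}

/* Shape 1: the helper promises, on the caller's behalf, that the page is movable. */
struct page_sketch *alloc_zeroed_page_implicit(void)
{
	struct page_sketch *page;

	page = alloc_page_sketch(SK_GFP_HIGHUSER | SK_GFP_MOVABLE);
	if (page)
		page->zeroed = 1;
	return page;
}

/* Shape 2: the caller states its intent by passing the flags in. */
struct page_sketch *alloc_zeroed_page_explicit(gfp_sketch_t flags)
{
	struct page_sketch *page = alloc_page_sketch(flags);

	if (page)
		page->zeroed = 1;
	return page;
}
```

In the second shape a caller that cannot guarantee movability simply omits the movable flag, at the cost of one more argument at every call site.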

Nick Piggin | 2 Dec 13:15 2006

[rfc] possible page manipulation simplifications?

Hi,

While working in this area, I noticed a few things we do that may not
have a positive payoff under the most common conditions. Untested yet,
and probably needs a bit of instrumentation, but it saves about half a
K of code, lots of branches, and makes things look nicer. Any thoughts?

Quite a bit of code is used in maintaining these "cached pages" that are
probably pretty unlikely to get used.

Also, buffered write path (and others) uses its own LRU pagevec when we should
be just using the per-CPU LRU pagevec (which will cut down on both data and
code size cacheline footprint).

Index: linux-2.6/mm/filemap.c
===================================================================
--- linux-2.6.orig/mm/filemap.c
+++ linux-2.6/mm/filemap.c
@@ -686,26 +686,18 @@ EXPORT_SYMBOL(find_lock_page);
 struct page *find_or_create_page(struct address_space *mapping,
 		unsigned long index, gfp_t gfp_mask)
 {
-	struct page *page, *cached_page = NULL;
+	struct page *page;
 	int err;
 repeat:
 	page = find_lock_page(mapping, index);
 	if (!page) {
-		if (!cached_page) {
-			cached_page = alloc_page(gfp_mask);

Mel Gorman | 4 Dec 15:07 2006

Re: [PATCH] Add __GFP_MOVABLE for callers to flag allocations that may be migrated

On (01/12/06 11:01), Andrew Morton didst pronounce:
> On Fri, 1 Dec 2006 09:54:11 +0000 (GMT)
> Mel Gorman <mel <at> csn.ul.ie> wrote:
> 
> > >> @@ -65,7 +65,7 @@ static inline void clear_user_highpage(s
> > >>  static inline struct page *
> > >>  alloc_zeroed_user_highpage(struct vm_area_struct *vma, unsigned long vaddr)
> > >>  {
> > >> -	struct page *page = alloc_page_vma(GFP_HIGHUSER, vma, vaddr);
> > >> +	struct page *page = alloc_page_vma(GFP_HIGH_MOVABLE, vma, vaddr);
> > >>
> > >>  	if (page)
> > >>  		clear_user_highpage(page, vaddr);
> > >
> > > But this change is presumptuous.  alloc_zeroed_user_highpage() doesn't know
> > > that its caller is going to use the page for moveable purposes.  (Ditto lots
> > > of other places in this patch).
> > >
> > 
> > according to grep -r, alloc_zeroed_user_highpage() is only used in two 
> > places, do_wp_page() (when write faulting the zero page)[1] and 
> > do_anonymous_page() (when mapping the zero page for the first time and 
> > writing). In these cases, they are known to be movable. What am I missing?
> 
> We shouldn't implement a function which "knows" how its callers are using
> it in this manner.
> 

I see.


Mel Gorman | 4 Dec 15:40 2006

Re: [rfc] possible page manipulation simplifications?

On (02/12/06 13:15), Nick Piggin didst pronounce:
> Hi,
> 
> While working in this area, I noticed a few things we do that may not
> have a positive payoff under the most common conditions. Untested yet,
> and probably needs a bit of instrumentation, but it saves about half a
> K of code, lots of branches, and makes things look nicer. Any thoughts?
> 
> Quite a bit of code is used in maintaining these "cached pages" that are
> probably pretty unlikely to get used.
> 

I think you might be leaking now though. More comments below.

> Also, buffered write path (and others) uses its own LRU pagevec when we should
> be just using the per-CPU LRU pagevec (which will cut down on both data and
> code size cacheline footprint).
> 

Splitting the patch into two could be nice but it's grand for the
moment.

> Index: linux-2.6/mm/filemap.c
> ===================================================================
> --- linux-2.6.orig/mm/filemap.c
> +++ linux-2.6/mm/filemap.c
> @@ -686,26 +686,18 @@ EXPORT_SYMBOL(find_lock_page);
>  struct page *find_or_create_page(struct address_space *mapping,
>  		unsigned long index, gfp_t gfp_mask)
>  {

Nick Piggin | 4 Dec 15:55 2006

Re: [rfc] possible page manipulation simplifications?

On Mon, Dec 04, 2006 at 02:40:05PM +0000, Mel Gorman wrote:
> On (02/12/06 13:15), Nick Piggin didst pronounce:
> > Hi,
> > 
> > While working in this area, I noticed a few things we do that may not
> > have a positive payoff under the most common conditions. Untested yet,
> > and probably needs a bit of instrumentation, but it saves about half a
> > K of code, lots of branches, and makes things look nicer. Any thoughts?
> > 
> > Quite a bit of code is used in maintaining these "cached pages" that are
> > probably pretty unlikely to get used.
> > 
> 
> I think you might be leaking now though. More comments below.
> 
> > Also, buffered write path (and others) uses its own LRU pagevec when we should
> > be just using the per-CPU LRU pagevec (which will cut down on both data and
> > code size cacheline footprint).
> > 
> 
> Splitting the patch into two could be nice but it's grand for the
> moment.

Hi Mel,

I think you're right about the leakage, thanks for catching it.

As far as allocating pages twice is concerned, I *strongly* believe
it is the wrong tradeoff to fix this with a "cached_page" because we
have to hit 2 reasonably rare races.
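
The tradeoff can be illustrated with a toy, single-threaded model. This is not the patch and not the kernel's page cache: every name below is hypothetical. It only shows the general shape of a find-or-create loop without a `cached_page`: allocate on a miss, free the page on any insertion failure so that nothing leaks, and retry the lookup if another inserter got there first.

```c
#include <errno.h>
#include <stdlib.h>

#define TOY_SLOTS 16

struct toy_page {
	unsigned long index;
};

/* Toy "page cache": one slot per index, NULL when empty. */
struct toy_cache {
	struct toy_page *slot[TOY_SLOTS];
};

struct toy_page *toy_find(struct toy_cache *c, unsigned long index)
{
	return index < TOY_SLOTS ? c->slot[index] : NULL;
}

/* Insert a page; returns -EEXIST if the slot is already occupied. */
int toy_add(struct toy_cache *c, struct toy_page *page, unsigned long index)
{
	if (index >= TOY_SLOTS)
		return -EINVAL;
	if (c->slot[index])
		return -EEXIST;
	page->index = index;
	c->slot[index] = page;
	return 0;
}

struct toy_page *toy_find_or_create(struct toy_cache *c, unsigned long index)
{
	struct toy_page *page;
	int err;

	for (;;) {
		page = toy_find(c, index);
		if (page)
			return page;

		page = malloc(sizeof(*page));
		if (!page)
			return NULL;

		err = toy_add(c, page, index);
		if (!err)
			return page;

		/* Insertion failed: free the page on every error path. */
		free(page);
		if (err != -EEXIST)
			return NULL;
		/* -EEXIST: someone else inserted first; look it up again. */
	}
}
```

The price of dropping the cached page in this model is one extra allocation and free per lost race, which matters only if such races are common.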

