Magnus Damm | 1 Oct 2005 02:32

Re: [PATCH 05/07] i386: sparsemem on pc

On 10/1/05, Dave Hansen <haveblue <at> us.ibm.com> wrote:
> On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
> > This patch enables and fixes sparsemem support on i386. This is the
> > same patch that was sent to linux-kernel on September 6th, 2005, but this
> > version has been up-ported to fit on top of the patches written by Dave Hansen.
>
> I'll post a more comprehensive way to do this in just a moment.
>
>         Subject: memhotplug testing: hack for flat systems

Looks much better, will compile and test on Monday. Thanks.

/ magnus
Bill Davidsen | 1 Oct 2005 04:41

Re: [PATCH 0/7] CART - an advanced page replacement policy

Peter Zijlstra wrote:

>On Thu, 2005-09-29 at 15:40 -0400, Bill Davidsen wrote:
>>Peter Zijlstra wrote:
>>>Multiple memory zone CART implementation for Linux.
>>>An advanced page replacement policy.
>>>
>>>http://www.almaden.ibm.com/cs/people/dmodha/clockfast.pdf
>>>(IBM does hold patent rights to the base algorithm ARC)
>>>
>>Peter, this is a large patch, perhaps you could describe what configs
>>benefit,
>
>All those that use swap. Those that exploit the weak side of LRU more
>than others.
>
>CART is an adaptive algorithm that will act like LFU on one side and LRU
>on the other, capturing both behaviours. Therefore it is also scan
>proof, e.g. 'use once' scans should not flush the full cache.
>
>Hence people with LFU friendly applications will see an improvement
>while those who have an LRU friendly application should see no decrease
>in swap performance.
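
The 'scan proof' point can be illustrated with a toy model. The sketch
below is neither CART nor kernel code; it is a minimal user-space LRU
(the names lru_cache, lru_touch, hot_pages_surviving_scan and LRU_CAP
are invented for illustration) showing how a use-once scan that is at
least as long as the cache evicts an entire hot working set.

```c
#include <assert.h>
#include <stddef.h>

#define LRU_CAP 4	/* illustrative cache size, in pages */

/* Toy LRU cache: slot 0 is most recently used, slot n-1 is the victim. */
struct lru_cache {
	int pages[LRU_CAP];
	size_t n;
};

/* Side-effect-free lookup: 1 if the page is cached, 0 otherwise. */
static int lru_contains(const struct lru_cache *c, int page)
{
	size_t i;

	for (i = 0; i < c->n; i++)
		if (c->pages[i] == page)
			return 1;
	return 0;
}

/* Reference a page. Returns 1 on a hit, 0 on a miss (page gets cached). */
static int lru_touch(struct lru_cache *c, int page)
{
	size_t i, pos = c->n;
	int hit;

	assert(c->n <= LRU_CAP);
	for (i = 0; i < c->n; i++) {
		if (c->pages[i] == page) {
			pos = i;
			break;
		}
	}
	hit = pos < c->n;
	if (!hit) {
		if (c->n < LRU_CAP)
			c->n++;		/* room left: use a new slot */
		pos = c->n - 1;		/* otherwise overwrite the LRU victim */
	}
	/* Shift newer entries down by one and move the page to the MRU slot. */
	for (i = pos; i > 0; i--)
		c->pages[i] = c->pages[i - 1];
	c->pages[0] = page;
	return hit;
}

/* How many pages of a hot working set are still cached after a scan? */
static int hot_pages_surviving_scan(int scan_len)
{
	struct lru_cache c = { .n = 0 };
	int p, survivors = 0;

	for (p = 0; p < LRU_CAP; p++)	/* hot set: pages 0..LRU_CAP-1 */
		lru_touch(&c, p);
	for (p = 0; p < LRU_CAP; p++)	/* touched a second time */
		lru_touch(&c, p);
	for (p = 0; p < scan_len; p++)	/* scan: each page touched once */
		lru_touch(&c, 100 + p);
	for (p = 0; p < LRU_CAP; p++)
		survivors += lru_contains(&c, p);
	return survivors;
}
```

In this model a scan of LRU_CAP pages leaves none of the hot pages
cached, even though each hot page was referenced twice and each scanned
page only once. An adaptive policy of the kind described above aims to
keep such scans from displacing frequently used pages.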

Seth, Rohit | 1 Oct 2005 21:00

[PATCH]: Clean up of __alloc_pages


	[PATCH]: Below is a cleanup of the __alloc_pages code.  A few
		 things differ from the original version:

	1: remove the initial direct reclaim logic
	2: order-zero pages are now looked for on the pcp list up front
	3: GFP_HIGH pages are allowed to go a little below the low watermark sooner
	4: search for free pages unconditionally after direct reclaim

	Signed-off-by: Rohit Seth <rohit.seth <at> intel.com>

--- linux-2.6.14-rc2-mm1.org/mm/page_alloc.c	2005-09-27 10:03:51.000000000 -0700
+++ linux-2.6.14-rc2-mm1/mm/page_alloc.c	2005-10-01 10:40:06.000000000 -0700
@@ -722,7 +722,8 @@
  * or two.
  */
 static struct page *
-buffered_rmqueue(struct zone *zone, int order, unsigned int __nocast gfp_flags)
+buffered_rmqueue(struct zone *zone, int order, unsigned int __nocast gfp_flags,
+			int replenish)
 {
 	unsigned long flags;
 	struct page *page = NULL;
@@ -733,7 +734,7 @@

 		pcp = &zone_pcp(zone, get_cpu())->pcp[cold];
 		local_irq_save(flags);
-		if (pcp->count <= pcp->low)
+		if ((pcp->count <= pcp->low) && replenish)
 			pcp->count += rmqueue_bulk(zone, 0,

Nick Piggin | 2 Oct 2005 05:09

Re: [PATCH]: Clean up of __alloc_pages

Seth, Rohit wrote:
> 	[PATCH]: Below is a cleanup of the __alloc_pages code.  A few
> 		 things differ from the original version:
> 
> 	1: remove the initial direct reclaim logic
> 	2: order-zero pages are now looked for on the pcp list up front
> 	3: GFP_HIGH pages are allowed to go a little below the low watermark sooner
> 	4: search for free pages unconditionally after direct reclaim
> 
> 	Signed-off-by: Rohit Seth <rohit.seth <at> intel.com>
> 

Hi,

Seems pretty good at a quick glance.

Perhaps splitting it into two would be a good idea, i.e. the first
patch does the cleanup, the second does the direct pcp list alloc.

Regarding the direct pcp list allocation - I think it is a good
idea, because we're currently already accounting pcp list pages
as being 'allocated' for the purposes of the reclaim watermarks.

Also, the structure is there to avoid touching cachelines whenever
possible so it makes sense to use it early here. Do you have any
performance numbers or allocation statistics (e.g. %pcp hits) to
show?
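
As a rough illustration of what is being discussed, here is a
user-space sketch of "try the pcp list first, without refilling it" plus
the kind of hit counter that could produce a %pcp-hits figure. This is
not the patch itself: the toy_* names, the hit/miss counters and the
single-list layout are assumptions made for the example. Only the
'replenish' condition mirrors the hunk posted above.

```c
#include <assert.h>

/* Toy stand-ins for the per-cpu page list and the zone free lists. */
struct toy_pcp {
	int count;	/* pages currently on the per-cpu list */
	int low;	/* refill threshold */
	int batch;	/* pages pulled from the zone per refill */
};

struct toy_zone {
	int free;		/* pages left on the zone's free lists */
	struct toy_pcp pcp;
	long pcp_hits;		/* hypothetical statistic, not in the patch */
	long pcp_misses;	/* hypothetical statistic, not in the patch */
};

/* Move up to 'want' pages from the zone to the caller. */
static int toy_rmqueue_bulk(struct toy_zone *z, int want)
{
	int got = want < z->free ? want : z->free;

	assert(want >= 0);
	z->free -= got;
	return got;
}

/* Hand out one order-0 page; refill the pcp list only if 'replenish'. */
static int toy_buffered_rmqueue(struct toy_zone *z, int replenish)
{
	struct toy_pcp *pcp = &z->pcp;

	if ((pcp->count <= pcp->low) && replenish)
		pcp->count += toy_rmqueue_bulk(z, pcp->batch);
	if (pcp->count > 0) {
		pcp->count--;
		return 1;
	}
	return 0;
}

/* Order-0 allocation: pcp list first, zone free lists only on a miss. */
static int toy_alloc_page(struct toy_zone *z)
{
	if (toy_buffered_rmqueue(z, 0)) {
		z->pcp_hits++;
		return 1;
	}
	z->pcp_misses++;
	/* A real allocator would check zone watermarks before this point. */
	return toy_buffered_rmqueue(z, 1);
}
```

Dividing pcp_hits by (pcp_hits + pcp_misses) in such a model gives the
hit rate; the numbers from a real workload would of course have to come
from instrumenting the actual allocator.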

Also, I would really think about uninlining get_page_from_freelist,
and inlining buffered_rmqueue, so that the constant 'replenish'

Bharata B Rao | 2 Oct 2005 18:32

Re: VM balancing issues on 2.6.13: dentry cache not getting shrunk enough

On Thu, Sep 15, 2005 at 10:29:10AM -0300, Marcelo Tosatti wrote:
> On Thu, Sep 15, 2005 at 03:09:45PM +0530, Bharata B Rao wrote:
> > On Wed, Sep 14, 2005 at 08:08:43PM -0300, Marcelo Tosatti wrote:
> > > On Tue, Sep 13, 2005 at 02:17:52PM +0530, Bharata B Rao wrote:
> > > > 
> > <snip>
> > > > First is the dentry_stats patch, which collects some dcache statistics
> > > > and puts them into /proc/meminfo. This patch provides information 
> > > > about how dentries are distributed in dcache slab pages, how many
> > > > free and in use dentries are present in dentry_unused lru list and
> > > > how prune_dcache() performs with respect to freeing the requested
> > > > number of dentries.
> > > 
> > > Bharata, 
> > > 
> > > Ideally one should move the "nr_requested/nr_freed" counters from your
> > > stats patch into "struct shrinker" (or somewhere else more appropriate
> > > in which per-shrinkable-cache stats are maintained), and use the
> > > "mod_page_state" infrastructure to do lockless per-CPU accounting, i.e.
> > > break /proc/vmstat's "slabs_scanned" apart into meaningful pieces.
> > 
> > Yes, I agree that we should have the nr_requested and nr_freed type of
> > counters in an appropriate place. And "struct shrinker" is probably the
> > right place for them.
> > 
> > Essentially you are suggesting that we maintain per-CPU statistics
> > of 'requested to free' (scanned) slab objects and actually freed objects.
> > And this should be on a per-shrinkable-cache basis.
> 
> Yep. 

Marcelo | 2 Oct 2005 22:06

Re: VM balancing issues on 2.6.13: dentry cache not getting shrunk enough


Bharata,

On Sun, Oct 02, 2005 at 10:02:29PM +0530, Bharata B Rao wrote:
> 
> Marcelo,
> 
> The attached patch is an attempt to break the "slabs_scanned" into
> meaningful pieces as you suggested.
> 
> But I couldn't do this cleanly because kmem_cache_t isn't defined
> in a .h file and I didn't want to touch too many files in the first
> attempt.
> 
> What I am doing here is making the "requested to free" and
> "actual freed" counters part of struct shrinker. With this I can
> update these statistics seamlessly from shrink_slab().
> 
> I don't have these as per-CPU counters because I wasn't sure if shrink_slab()
> would have many concurrent executions warranting lockless per-CPU
> counters for these.

Per-CPU counters are interesting because they avoid the atomic
operation _and_ potential cacheline bouncing. Given that less
commonly used counters in the reclaim path are already per-CPU,
I think it might be worth doing here too.
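
To make the idea concrete, here is a user-space sketch of per-CPU
"requested"/"freed" counters. It does not use the kernel's per-CPU
infrastructure or mod_page_state; the struct names, NR_CPUS_SKETCH and
the 64-byte cache line size are assumptions chosen for the example.
Each CPU updates only its own cache-line-aligned slot with plain
(non-atomic) additions, and a reader sums the slots.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define NR_CPUS_SKETCH	4	/* illustrative CPU count */
#define CACHELINE	64	/* assumed cache line size in bytes */

/* One slot per CPU, aligned so that two CPUs never share a cache line. */
struct shrinker_stat_cpu {
	alignas(CACHELINE) unsigned long nr_requested;
	unsigned long nr_freed;
};

struct shrinker_stats {
	struct shrinker_stat_cpu cpu[NR_CPUS_SKETCH];
};

/* Writer side: called by 'cpu' for its own slot only, so no atomics. */
static void stats_account(struct shrinker_stats *s, int cpu,
			  unsigned long requested, unsigned long freed)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	s->cpu[cpu].nr_requested += requested;
	s->cpu[cpu].nr_freed += freed;
}

/* Reader side: sum all slots; the result may be slightly stale. */
static void stats_sum(const struct shrinker_stats *s,
		      unsigned long *requested, unsigned long *freed)
{
	size_t i;

	*requested = 0;
	*freed = 0;
	for (i = 0; i < NR_CPUS_SKETCH; i++) {
		*requested += s->cpu[i].nr_requested;
		*freed += s->cpu[i].nr_freed;
	}
}
```

The trade-off is the usual one: updates are cheap and stay local to a
CPU, while a read has to walk every slot and only gives an approximate
snapshot, which is normally acceptable for statistics.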

> I am displaying this information as part of /proc/slabinfo and I have
> verified that it at least isn't breaking slabtop.
> 

Magnus Damm | 3 Oct 2005 04:08

Re: [PATCH 00/07][RFC] i386: NUMA emulation

On 10/1/05, Dave Hansen <haveblue <at> us.ibm.com> wrote:
> On Fri, 2005-09-30 at 16:33 +0900, Magnus Damm wrote:
> > These patches implement NUMA memory node emulation for regular i386 PCs.
> >
> > NUMA emulation could be used to provide coarse-grained memory resource control
> > using CPUSETS. Another use is as a test environment for NUMA memory code or
> > CPUSETS using an i386 emulator such as QEMU.
>
> This patch set basically allows the "NUMA depends on SMP" dependency to
> be removed.  I'm not sure this is the right approach.  There will likely
> never be a real-world NUMA system without SMP.  So, this set would seem
> to include some increased (#ifdef) complexity for supporting NUMA && !SMP,
> which will likely never happen in the real world.

Yes, this patch set removes "NUMA depends on SMP". It also adds some
simple NUMA emulation code too, but I am sure you are aware of that!
=)

I agree that it is very unlikely to find a single-processor NUMA
system in the real world. So yes, "[PATCH 02/07] i386: numa on
non-smp" adds _some_ extra complexity. But because SMP is set when
supporting more than one cpu, and NUMA is set when supporting more
than one memory node, I see no reason why they should be dependent on
each other. Except that they depend on each other today and breaking
them loose will increase complexity a bit.

> Also, I worry that simply #ifdef'ing things out like CPUsets' update
> means that CPUsets lacks some kind of abstraction that it should have
> been using in the first place.  An #ifdef just papers over the real
> problem.

Paul Jackson | 3 Oct 2005 05:21

Re: [PATCH 00/07][RFC] i386: NUMA emulation

Dave wrote:
> Also, I worry that simply #ifdef'ing things out like CPUsets' update
> means that CPUsets lacks some kind of abstraction that it should have
> been using in the first place. 

In the abstract, cpusets should just assume that the system has one or
more CPUs, and one or more Memory Nodes.  Ideally, it should require
neither SMP nor NUMA.  Indeed, if you (Magnus) can get it
to compile with just one or the other of those two:

     config CPUSETS
	    bool "Cpuset support"
    -       depends on SMP
    +       depends on SMP || NUMA

then I would hope that it would compile with neither.  The cpuset
hierarchy on such a system would be rather boring, with all cpusets
having the same one CPU and one Memory Node, but it should work ... in
theory of course.

In practice of course, there may be details on the edges that depend on
the current SMP/NUMA limitations, such as:

Magnus wrote:
> Regarding the #ifdef, it
> was added because partition_sched_domain() is only implemented for
> SMP. That symbol has no prototype or implementation when CONFIG_SMP is
> not set. Maybe it is better to add an empty inline function in
> linux/sched.h for !SMP?
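
The "empty inline function" pattern suggested here can be sketched as
follows. This is a stand-alone illustration, not the real scheduler
interface: partition_domains_sketch(), its integer arguments,
CONFIG_SMP_SKETCH and the domains_rebuilt counter are all hypothetical.
The point is only that the caller compiles unchanged in both
configurations, so no #ifdef is needed at the call site.

```c
#include <assert.h>

/* Lets a caller observe whether any real work was done. */
static int domains_rebuilt;

#ifdef CONFIG_SMP_SKETCH
/* "SMP" build: the function does real work. */
static void partition_domains_sketch(int ncpus_a, int ncpus_b)
{
	assert(ncpus_a >= 0 && ncpus_b >= 0);
	domains_rebuilt++;
}
#else
/* "!SMP" build: an empty inline stub that the compiler can drop. */
static inline void partition_domains_sketch(int ncpus_a, int ncpus_b)
{
	(void)ncpus_a;
	(void)ncpus_b;
}
#endif

/* Caller: identical source for both configurations, no #ifdef here. */
static int cpuset_update_sketch(int ncpus_a, int ncpus_b)
{
	partition_domains_sketch(ncpus_a, ncpus_b);
	return domains_rebuilt;
}
```

Built without -DCONFIG_SMP_SKETCH the call is a no-op; built with it,
the counter is incremented. In kernel headers the same shape usually
appears as a prototype under the config symbol and a static inline stub
in the #else branch.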


Magnus Damm | 3 Oct 2005 07:05

Re: [PATCH 00/07][RFC] i386: NUMA emulation

On 10/3/05, Paul Jackson <pj <at> sgi.com> wrote:
> Dave wrote:
> > Also, I worry that simply #ifdef'ing things out like CPUsets' update
> > means that CPUsets lacks some kind of abstraction that it should have
> > been using in the first place.
>
> In the abstract, cpusets should just assume that the system has one or
> more CPUs, and one or more Memory Nodes.  Ideally, it should require
> neither SMP nor NUMA.  Indeed, if you (Magnus) can get it
> to compile with just one or the other of those two:
>
>      config CPUSETS
>             bool "Cpuset support"
>     -       depends on SMP
>     +       depends on SMP || NUMA
>
> then I would hope that it would compile with neither.  The cpuset
> hierarchy on such a system would be rather boring, with all cpusets
> having the same one CPU and one Memory Node, but it should work ... in
> theory of course.

I just tested this on top of my patches:
@@ -245,7 +245,6 @@ config IKCONFIG_PROC

 config CPUSETS
        bool "Cpuset support"
-       depends on SMP || NUMA
        help

and it seems to work ok in practice too. On a regular !SMP !NUMA PC

Yasunori Goto | 3 Oct 2005 07:19

Re: [PATCH]Remove pgdat list ver.2 [1/2]

> This works around my compile problem for now.  But, it might cause some
> more issues.  Can you take a closer look?

It works well on my ia64 box.
But I have not yet understood why this patch also moves the lines from
is_highmem_idx() to lowmem_reserve_ratio_sysctl_handler().
Is it necessary?
If not, the patch could become a bit smaller. :-)

Thanks.

-- 
Yasunori Goto 
