Peter Zijlstra | 2 Jul 2006 20:44

[RFC][PATCH] mm: concurrent page-cache


Hi,

Here is my current attempt at a concurrent page-cache. It applies on top
of Nick's latest lockless page-cache patch. It is very much a
work in progress and may eat your data. I am posting it in the hope of
getting some feedback.

I've run it on an SMP qemu and found that it currently live-locks in
find_get_pages_tag(). It seems that a concurrent
radix_tree_tag_clear()/radix_tree_delete() removes the 'last' entry for
a gang lookup, which makes find_get_pages_tag() loop forever since it
requires progress.

If this is indeed the case, it seems to me the lockless page-cache patch
might be susceptible to this same problem.

I still have to investigate fully, but I might not have time to look
into it until after Ottawa.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

---
 fs/buffer.c                |    4 
 include/linux/page-flags.h |    2 
 include/linux/pagemap.h    |   21 +--
 include/linux/radix-tree.h |    6 
 lib/radix-tree.c           |  314 +++++++++++++++++++++++++++++++++------------
 mm/filemap.c               |   41 +++--
 mm/migrate.c               |   27 ++-

keith mannthey | 5 Jul 2006 23:26

[Patch] convert i386 NUMA KVA space to bootmem

Hello Andrew,
  I posted this patch a while ago but didn't get any feedback.  I
would like to submit it to your tree.

  The patch addresses a long-standing issue with booting with an
initrd on an i386 NUMA system.  Currently (and always) the NUMA KVA area
is mapped into low memory by finding the end of low memory and moving
that mark down (thus creating space for the KVA).  The issue with this
is that Grub loads initrds into a similar area, so when the kernel
checks the initrd it finds it outside max_low_pfn and disables it (it
thinks the initrd is not mapped into usable memory).  Thus initrd-enabled
kernels can't boot on i386 NUMA :(

  My solution to the problem converts the NUMA KVA area to use the
bootmem allocator to reserve its area (instead of moving the end of low
memory).  Using bootmem allows the KVA area to be mapped at more
diverse addresses (not just the end of low memory) and enables the KVA
area to be mapped below the initrd if present.
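
  The difference between the two schemes can be illustrated with a small
user-space model. This is only a sketch, not code from the patch: the
struct and function names are hypothetical, only max_low_pfn comes from
the description above, and placing the KVA area directly below the
initrd is a simplification of what a bootmem reservation might do.

```c
#include <stdbool.h>

/* Toy model of i386 low memory, expressed in page frame numbers (PFNs). */
struct layout {
	unsigned long max_low_pfn;  /* end of usable low memory (exclusive) */
	unsigned long initrd_start; /* first PFN of the initrd */
	unsigned long initrd_end;   /* one past the last PFN of the initrd */
	unsigned long kva_start;    /* first PFN of the NUMA KVA area */
	unsigned long kva_pages;    /* size of the KVA area in pages */
};

/* The initrd is only considered usable if it ends at or below max_low_pfn. */
static bool initrd_usable(const struct layout *l)
{
	return l->initrd_end <= l->max_low_pfn;
}

/* Old scheme: carve the KVA area off the top by lowering max_low_pfn. */
static void kva_by_lowering_end(struct layout *l)
{
	l->max_low_pfn -= l->kva_pages;
	l->kva_start = l->max_low_pfn;
}

/*
 * New scheme (modelled): reserve the KVA area below the initrd and leave
 * max_low_pfn untouched.  Returns false if there is no room below it.
 */
static bool kva_by_reservation(struct layout *l)
{
	if (l->initrd_start < l->kva_pages)
		return false;
	l->kva_start = l->initrd_start - l->kva_pages;
	return true;
}
```

  With an initrd loaded near the top of low memory, the first scheme
pushes max_low_pfn below the end of the initrd, so the usability check
fails; the second leaves the check unaffected.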

  I have tested this patch on numaq(no initrd) and summit(initrd) i386
numa based systems.  It was diffed on 2.6.17-git26 but should apply to
just about any recent kernel. 

Signed-off-by:  Keith Mannthey <kmannth@us.ibm.com>

Attachment (patch-2.6.17-numa-kva-v3): text/x-patch, 3331 bytes

Abu M. Muttalib | 7 Jul 2006 11:46

Commenting out out_of_memory() function in __alloc_pages()

Hi,

I am getting out-of-memory (OOM) errors.

To circumvent the problem, I have commented out the call to "out_of_memory()"
and replaced "goto restart" with "goto nopage".

At the "nopage:" label I have added a call to "schedule()" followed by
"return NULL".

I tried the modified kernel with a test application that mallocs memory in
a loop. Contrary to what I expected, the process gets killed. On a
second run of the same application I get the page allocation failure,
as expected, but the system subsequently hangs.
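
The attached test program is not reproduced in this archive. A minimal
sketch of a test of this kind, assuming it allocates fixed-size chunks and
touches them, might look like the following. The function name and the
max_chunks safety cap are hypothetical additions so the loop can be run
without exhausting memory.

```c
#include <stdlib.h>
#include <string.h>

/* Each allocated chunk is linked into a list so it can be freed later. */
struct chunk {
	struct chunk *next;
};

/*
 * Allocate chunks of chunk_size bytes until malloc() fails or max_chunks
 * have been allocated.  Every chunk is written to, because on Linux an
 * untouched allocation may not consume physical memory.  Returns the
 * number of chunks obtained; all chunks are freed before returning.
 */
static size_t alloc_until_failure(size_t chunk_size, size_t max_chunks)
{
	struct chunk *head = NULL;
	size_t count = 0;

	if (chunk_size < sizeof(struct chunk))
		chunk_size = sizeof(struct chunk);

	while (count < max_chunks) {
		struct chunk *c = malloc(chunk_size);

		if (!c)
			break;                /* allocation failure */
		memset(c, 0xA5, chunk_size);  /* touch every byte */
		c->next = head;
		head = c;
		count++;
	}

	while (head) {
		struct chunk *next = head->next;

		free(head);
		head = next;
	}
	return count;
}
```

Raising max_chunks far enough would eventually exercise the allocation
failure path discussed here, so that should only be done on a test system.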

I am attaching the test application and the log herewith.

I am getting this exception with kernel 2.6.13. With kernel
2.4.19-rmka7-pxa1 there was no problem.

Why is this so? What can I do to alleviate the OOM problem?

Thanks in anticipation and regards,
Abu.

sh-3.00# ./test1

OOM Test: Counter = 0
....

Chase Venters | 7 Jul 2006 23:21

Re: Commenting out out_of_memory() function in __alloc_pages()

On Fri, 7 Jul 2006, Abu M. Muttalib wrote:

> Hi,
>
> I am getting out-of-memory (OOM) errors.
>
> To circumvent the problem, I have commented out the call to "out_of_memory()"
> and replaced "goto restart" with "goto nopage".
>
> At the "nopage:" label I have added a call to "schedule()" followed by
> "return NULL".

I wouldn't recommend gutting the oom killer...

> I tried the modified kernel with a test application that mallocs memory in
> a loop. Contrary to what I expected, the process gets killed. On a
> second run of the same application I get the page allocation failure,
> as expected, but the system subsequently hangs.
>
> I am attaching the test application and the log herewith.
>
> I am getting this exception with kernel 2.6.13. With kernel
> 2.4.19-rmka7-pxa1 there was no problem.
>
> Why is this so? What can I do to alleviate the OOM problem?

First you should know what is causing them. Is an application leaking 
memory, or is the kernel leaking memory? "ps" can help you answer the 
first question, while "watch cat /proc/meminfo" can help you answer the 
second.
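
To track this over time rather than by eye, the relevant fields can be
sampled programmatically. The following is a minimal sketch, assuming the
usual "Name:   value kB" line format of /proc/meminfo; the helper name is
hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one field (for example "MemFree") from a /proc/meminfo-style
 * stream.  Returns the value in kB, or -1 if the field is not found.
 */
static long meminfo_field_kb(FILE *f, const char *name)
{
	char line[256];
	size_t len = strlen(name);

	while (fgets(line, sizeof(line), f)) {
		long value;

		/* Lines look like "MemFree:   123456 kB". */
		if (strncmp(line, name, len) == 0 && line[len] == ':' &&
		    sscanf(line + len + 1, "%ld", &value) == 1)
			return value;
	}
	return -1;
}
```

Calling this periodically on a stream opened with
fopen("/proc/meminfo", "r") and logging the result gives a trend: free
memory that keeps shrinking while no process grows in "ps" would point
towards the kernel side.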

Mel Gorman | 8 Jul 2006 13:10

[PATCH 0/6] Sizing zones and holes in an architecture independent manner V8

This is V8 of the patchset to size zones and memory holes in an
architecture-independent manner.  The notable additions in this release are
accounting for mem_map as a memory hole, as it is not reclaimable, and the
optional accounting of the kernel image as a memory hole. This is to match
the existing behavior of x86_64.

Changelog since V7
o Rebase to 2.6.17-mm6
o Account for mem_map as a memory hole
o Adjust mem_map when arch independent zone-sizing is used and PFN 0 is in
  a memory hole not accounted for by ARCH_PFN_OFFSET

Changelog since V6
o MAX_ACTIVE_REGIONS is really maximum active regions, not MAX_ACTIVE_REGIONS-1
o MAX_ACTIVE_REGIONS is 256 unless the architecture specifically asks for
  a different number or MAX_NUMNODES is >= 32
o nr_nodemap_entries tracks the number of entries rather than terminating with
  end_pfn == 0
o Add a number of documentation-related comments. Functions exposed by headers
  may potentially be picked up by kerneldoc
o Changed misleading zone_present_pages_in_node() name to
  zone_spanned_pages_in_node()
o Be a bit more verbose to help debugging when things go wrong.
o On x86_64, end_pfn_map now gets updated properly or ACPI tables get "lost"
o Signoffs added to patches 1 and 5 by Bob Picco related to contributions,
  fixes and reviews

Changelog since V5
o Add a missing #include to mm/mem_init.c
o Drop the verbose debugging part of the set

Mel Gorman | 8 Jul 2006 13:11

[PATCH 3/6] Have x86 use add_active_range() and free_area_init_nodes


Size zones and holes in an architecture independent manner for x86.

 Kconfig        |    8 +---
 kernel/setup.c |   19 +++------
 kernel/srat.c  |  100 +---------------------------------------------------
 mm/discontig.c |   65 +++++++--------------------------
 4 files changed, 25 insertions(+), 167 deletions(-)

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-mm6-102-powerpc_use_init_nodes/arch/i386/Kconfig linux-2.6.17-mm6-103-x86_use_init_nodes/arch/i386/Kconfig
--- linux-2.6.17-mm6-102-powerpc_use_init_nodes/arch/i386/Kconfig	2006-07-05 14:31:11.000000000 +0100
+++ linux-2.6.17-mm6-103-x86_use_init_nodes/arch/i386/Kconfig	2006-07-06 11:08:03.000000000 +0100
@@ -603,12 +603,10 @@ config ARCH_SELECT_MEMORY_MODEL
 	def_bool y
 	depends on ARCH_SPARSEMEM_ENABLE

-source "mm/Kconfig"
+config ARCH_POPULATES_NODE_MAP
+	def_bool y

-config HAVE_ARCH_EARLY_PFN_TO_NID
-	bool
-	default y
-	depends on NUMA
+source "mm/Kconfig"


Mel Gorman | 8 Jul 2006 13:11

[PATCH 2/6] Have Power use add_active_range() and free_area_init_nodes()


Size zones and holes in an architecture independent manner for Power.

 powerpc/Kconfig   |    7 --
 powerpc/mm/mem.c  |   53 ++++++----------
 powerpc/mm/numa.c |  157 ++++---------------------------------------------
 ppc/Kconfig       |    3 
 ppc/mm/init.c     |   26 ++++----
 5 files changed, 56 insertions(+), 190 deletions(-)

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-mm6-101-add_free_area_init_nodes/arch/powerpc/Kconfig linux-2.6.17-mm6-102-powerpc_use_init_nodes/arch/powerpc/Kconfig
--- linux-2.6.17-mm6-101-add_free_area_init_nodes/arch/powerpc/Kconfig	2006-07-05 14:31:12.000000000 +0100
+++ linux-2.6.17-mm6-102-powerpc_use_init_nodes/arch/powerpc/Kconfig	2006-07-06 11:06:11.000000000 +0100
@@ -715,11 +715,10 @@ config ARCH_SPARSEMEM_DEFAULT
 	def_bool y
 	depends on SMP && PPC_PSERIES

-source "mm/Kconfig"
-
-config HAVE_ARCH_EARLY_PFN_TO_NID
+config ARCH_POPULATES_NODE_MAP
 	def_bool y
-	depends on NEED_MULTIPLE_NODES
+
+source "mm/Kconfig"

Mel Gorman | 8 Jul 2006 13:11

[PATCH 1/6] Introduce mechanism for registering active regions of memory


This patch defines the structure to represent an active range of page
frames within a node in an architecture independent manner. Architectures
are expected to register active ranges of PFNs using add_active_range(nid,
start_pfn, end_pfn) and call free_area_init_nodes() passing the PFNs of
the end of each zone.
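
As a rough illustration of the bookkeeping this implies, the following
user-space model records active PFN ranges per node and derives present
and hole pages for a zone from them. It is a sketch only, not the code in
this patch: the toy_ names, the struct layout and the return values are
hypothetical, ranges are assumed not to overlap, and only
MAX_ACTIVE_REGIONS and nr_nodemap_entries are names taken from the
changelog.

```c
#define MAX_ACTIVE_REGIONS 256	/* default quoted in the V6 changelog */

/* One registered range of present page frames on a node. */
struct toy_active_region {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	int nid;
};

static struct toy_active_region toy_node_map[MAX_ACTIVE_REGIONS];
static int nr_nodemap_entries;

/*
 * Register [start_pfn, end_pfn) as present memory on node nid.
 * Returns 0 on success, -1 if the range is empty or the table is full.
 */
static int toy_add_active_range(int nid, unsigned long start_pfn,
				unsigned long end_pfn)
{
	if (start_pfn >= end_pfn || nr_nodemap_entries >= MAX_ACTIVE_REGIONS)
		return -1;
	toy_node_map[nr_nodemap_entries].nid = nid;
	toy_node_map[nr_nodemap_entries].start_pfn = start_pfn;
	toy_node_map[nr_nodemap_entries].end_pfn = end_pfn;
	nr_nodemap_entries++;
	return 0;
}

/* Pages actually present on node nid within [zone_start, zone_end). */
static unsigned long toy_present_pages(int nid, unsigned long zone_start,
				       unsigned long zone_end)
{
	unsigned long present = 0;
	int i;

	for (i = 0; i < nr_nodemap_entries; i++) {
		const struct toy_active_region *r = &toy_node_map[i];
		unsigned long s, e;

		if (r->nid != nid)
			continue;
		/* Clamp the registered range to the zone boundaries. */
		s = r->start_pfn > zone_start ? r->start_pfn : zone_start;
		e = r->end_pfn < zone_end ? r->end_pfn : zone_end;
		if (s < e)
			present += e - s;
	}
	return present;
}

/* Hole pages in the zone: spanned pages minus present pages. */
static unsigned long toy_absent_pages(int nid, unsigned long zone_start,
				      unsigned long zone_end)
{
	return (zone_end - zone_start) -
	       toy_present_pages(nid, zone_start, zone_end);
}
```

In this model, once every range has been registered, zone sizes and holes
follow from the zone end PFNs alone, which is what allows the per-arch
zone_sizes[] and zholes_size[] calculations to be dropped.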

 include/linux/mm.h     |   45 +++
 include/linux/mmzone.h |   10 
 mm/page_alloc.c        |  557 ++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 586 insertions(+), 26 deletions(-)

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Bob Picco <bob.picco@hp.com>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-mm6-clean/include/linux/mm.h linux-2.6.17-mm6-101-add_free_area_init_nodes/include/linux/mm.h
--- linux-2.6.17-mm6-clean/include/linux/mm.h	2006-07-05 14:31:17.000000000 +0100
+++ linux-2.6.17-mm6-101-add_free_area_init_nodes/include/linux/mm.h	2006-07-06 11:04:22.000000000 +0100
@@ -960,6 +960,51 @@ extern void free_area_init(unsigned long
 extern void free_area_init_node(int nid, pg_data_t *pgdat,
 	unsigned long * zones_size, unsigned long zone_start_pfn, 
 	unsigned long *zholes_size);
+#ifdef CONFIG_ARCH_POPULATES_NODE_MAP
+/*
+ * With CONFIG_ARCH_POPULATES_NODE_MAP set, an architecture may initialise its
+ * zones, allocate the backing mem_map and account for memory holes in a more
+ * architecture independent manner. This is a substitute for creating the
+ * zone_sizes[] and zholes_size[] arrays and passing them to
+ * free_area_init_node()

Mel Gorman | 8 Jul 2006 13:12

[PATCH 4/6] Have x86_64 use add_active_range() and free_area_init_nodes


Size zones and holes in an architecture independent manner for x86_64.

 arch/x86_64/Kconfig         |    3 
 arch/x86_64/kernel/e820.c   |  125 ++++++++++++++-------------------------
 arch/x86_64/kernel/setup.c  |    7 +-
 arch/x86_64/mm/init.c       |   62 -------------------
 arch/x86_64/mm/k8topology.c |    3 
 arch/x86_64/mm/numa.c       |   18 ++---
 arch/x86_64/mm/srat.c       |   11 ++-
 include/asm-x86_64/e820.h   |    5 -
 include/asm-x86_64/proto.h  |    2 
 9 files changed, 79 insertions(+), 157 deletions(-)

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-mm6-103-x86_use_init_nodes/arch/x86_64/Kconfig linux-2.6.17-mm6-104-x86_64_use_init_nodes/arch/x86_64/Kconfig
--- linux-2.6.17-mm6-103-x86_use_init_nodes/arch/x86_64/Kconfig	2006-07-05 14:31:12.000000000 +0100
+++ linux-2.6.17-mm6-104-x86_64_use_init_nodes/arch/x86_64/Kconfig	2006-07-06 11:09:46.000000000 +0100
@@ -81,6 +81,9 @@ config ARCH_MAY_HAVE_PC_FDC
 	bool
 	default y

+config ARCH_POPULATES_NODE_MAP
+	def_bool y
+
 config DMI

Mel Gorman | 8 Jul 2006 13:12

[PATCH 5/6] Have ia64 use add_active_range() and free_area_init_nodes


Size zones and holes in an architecture independent manner for ia64.

 arch/ia64/Kconfig          |    3 ++
 arch/ia64/mm/contig.c      |   60 +++++-----------------------------------
 arch/ia64/mm/discontig.c   |   41 ++++-----------------------
 arch/ia64/mm/init.c        |   12 ++++++++
 include/asm-ia64/meminit.h |    1 
 5 files changed, 30 insertions(+), 87 deletions(-)

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Bob Picco <bob.picco@hp.com>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-mm6-104-x86_64_use_init_nodes/arch/ia64/Kconfig linux-2.6.17-mm6-105-ia64_use_init_nodes/arch/ia64/Kconfig
--- linux-2.6.17-mm6-104-x86_64_use_init_nodes/arch/ia64/Kconfig	2006-07-05 14:31:11.000000000 +0100
+++ linux-2.6.17-mm6-105-ia64_use_init_nodes/arch/ia64/Kconfig	2006-07-06 11:11:30.000000000 +0100
@@ -361,6 +361,9 @@ config NODES_SHIFT
 	  MAX_NUMNODES will be 2^(This value).
 	  If in doubt, use the default.

+config ARCH_POPULATES_NODE_MAP
+	def_bool y
+
 # VIRTUAL_MEM_MAP and FLAT_NODE_MEM_MAP are functionally equivalent.
 # VIRTUAL_MEM_MAP has been retained for historical reasons.
 config VIRTUAL_MEM_MAP
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-mm6-104-x86_64_use_init_nodes/arch/ia64/mm/contig.c linux-2.6.17-mm6-105-ia64_use_init_nodes/arch/ia64/mm/contig.c