Alexander Beregalov | 1 Nov 2008 15:23

2.6.28-rc2: Unable to handle kernel paging request at iov_iter_copy_from_user_atomic

 2.6.28-rc2-00452-gf891caf on sparc64

How to reproduce: run dbench on tmpfs

Unable to handle kernel paging request at virtual address fffff80037c1c000
tsk->{mm,active_mm}->context = 0000000000001ae7
tsk->{mm,active_mm}->pgd = fffff8000ec8c000
              \|/ ____ \|/
              "@'/ .. \`@"
              /_| \__/ |_\
                 \__U_/
dbench(5007): Oops [#1]
TSTATE: 0000000011009604 TPC: 00000000005acbac TNPC: 00000000005acbb0
Y: 00000000    Not tainted
TPC: <__bzero+0x20/0xc0>
g0: 0000000000000016 g1: 0000000000000000 g2: 0000000000000000 g3: 0000000000033ae7
g4: fffff8000ec9c380 g5: 0000000000000020 g6: fffff8003b834000 g7: ffffffffffffe8b1
o0: fffff80037c1c8b1 o1: 00000000000008b1 o2: 0000000000000000 o3: fffff80037c1c8b1
o4: 0000000000000000 o5: 0000000000034398 sp: fffff8003b836e41 ret_pc: 00000000005ae73c
RPC: <copy_from_user_fixup+0x4c/0x70>
l0: 0000000000852800 l1: 0000000011009603 l2: 0000000000827ff4 l3: 0000000000000400
l4: 0000000000000000 l5: 0000000000000001 l6: 0000000000000000 l7: 0000000000000008
i0: fffff80037c1e000 i1: 0000000000032398 i2: 00000000000008b1 i3: fffff80037c3e398

Balbir Singh | 1 Nov 2008 19:48

[mm][PATCH 0/4] Memory cgroup hierarchy introduction

This patch series follows several iterations of the memory controller hierarchy
patches: the hardwall approach by Kamezawa-San [1] and version 1 of this
patchset [2].

The current approach is based on [2] and has the following properties:

1. Hierarchies are very natural in a filesystem like the cgroup filesystem.
   A multi-tree hierarchy has been supported for a long time in filesystems.
   When the feature is turned on, we honor hierarchies such that the root
   accounts for resource usage of all children and limits can be set at
   any point in the hierarchy. Any memory cgroup is limited by limits
   along the hierarchy. The total usage of all children of a node cannot
   exceed the limit of the node.
2. The hierarchy feature is selectable and off by default.
3. Hierarchies are expensive and the trade-off is depth versus performance.
   Hierarchies can also be completely turned off.
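
Property 1 can be illustrated with a small userspace sketch. It is not taken
from the patches: struct node and headroom() are hypothetical names, and the
sketch assumes each node's usage already includes the usage of its
descendants. The room left for a cgroup is then the smallest (limit - usage)
found on the path up to the root.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not a kernel structure. */
struct node {
	unsigned long usage;	/* own usage plus that of all descendants */
	unsigned long limit;
	struct node *parent;	/* NULL for the root */
};

/*
 * Return how much more can be charged to n: the smallest remaining
 * room (limit - usage) found on the path from n up to the root.
 */
static unsigned long headroom(const struct node *n)
{
	unsigned long room = ULONG_MAX;

	assert(n != NULL);
	for (; n != NULL; n = n->parent) {
		/* A node already at or over its limit has no room left. */
		unsigned long here = n->usage < n->limit ? n->limit - n->usage : 0;

		if (here < room)
			room = here;
	}
	return room;
}
```

In this sketch a leaf with plenty of room under its own limit can still be
constrained by a nearly full parent, which is the behaviour property 1
describes.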

The patches are against 2.6.28-rc2-mm1 and were tested in a KVM instance
with SMP and swap turned on.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>

Series
------

memcg-hierarchy-documentation.patch
resource-counters-hierarchy-support.patch
memcg-hierarchical-reclaim.patch
memcg-add-hierarchy-selector.patch


Balbir Singh | 1 Nov 2008 19:48

[mm] [PATCH 1/4] Memory cgroup hierarchy documentation


Documentation updates for hierarchy support

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 Documentation/controllers/memory.txt |   34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff -puN Documentation/controllers/memory.txt~memcg-hierarchy-documentation Documentation/controllers/memory.txt
--- linux-2.6.28-rc2/Documentation/controllers/memory.txt~memcg-hierarchy-documentation	2008-11-02 00:14:54.000000000 +0530
+++ linux-2.6.28-rc2-balbir/Documentation/controllers/memory.txt	2008-11-02 00:14:54.000000000 +0530
@@ -245,6 +245,40 @@ cgroup might have some charge associated
 tasks have migrated away from it. Such charges are automatically dropped at
 rmdir() if there are no tasks.

+5. Hierarchy support
+
+The memory controller supports a deep hierarchy and hierarchical accounting.
+The hierarchy is created by creating the appropriate cgroups in the
+cgroup filesystem. Consider for example, the following cgroup filesystem
+hierarchy
+
+		root
+	     /  |   \
+           /	|    \
+	  a	b	c

Balbir Singh | 1 Nov 2008 19:48

[mm] [PATCH 2/4] Memory cgroup resource counters for hierarchy


Add support for building hierarchies in resource counters. Cgroups allow us
to build a deep hierarchy, but we currently don't link the resource counters
belonging to the memory controller's control groups, even though the groups
themselves are linked in the cgroup hierarchy. This patch provides the
infrastructure for resource counters that have the same hierarchy as their
cgroup counterparts.

This set of patches is based on the resource counter hierarchy patches posted
by Pavel Emelianov.

NOTE: Building hierarchies is expensive; deeper hierarchies imply charging
all the way up to the root. It is known that hierarchies are expensive,
so the user needs to be careful and aware of the trade-offs before creating
very deep ones.
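
The cost mentioned in the NOTE can be seen in a minimal userspace sketch of
hierarchical charging. This is an illustration under assumptions, not the
code from this patch: struct counter and counter_charge() are hypothetical
names and locking is left out. Every charge walks from the counter to the
root, so its cost grows with depth, and a failure part-way up has to undo
the charges already made below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified counter; not the kernel's struct res_counter. */
struct counter {
	unsigned long usage;
	unsigned long limit;
	struct counter *parent;	/* NULL for the root */
};

/*
 * Charge val to c and to every ancestor of c. If some counter on the
 * path would exceed its limit, undo the charges made so far, report
 * that counter through *fail_at and return -1. Returns 0 on success.
 */
static int counter_charge(struct counter *c, unsigned long val,
			  struct counter **fail_at)
{
	struct counter *it, *undo;

	assert(c != NULL && fail_at != NULL);
	for (it = c; it != NULL; it = it->parent) {
		if (it->usage > it->limit || val > it->limit - it->usage) {
			/* Roll back everything charged below the failing level. */
			for (undo = c; undo != it; undo = undo->parent)
				undo->usage -= val;
			*fail_at = it;
			return -1;
		}
		it->usage += val;
	}
	return 0;
}
```

Reporting the counter that failed gives the caller a place to start
reclaiming from, which is what a hierarchical reclaim pass needs.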

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 include/linux/res_counter.h |    8 ++++++--
 kernel/res_counter.c        |   42 ++++++++++++++++++++++++++++++++++--------
 mm/memcontrol.c             |    9 ++++++---
 3 files changed, 46 insertions(+), 13 deletions(-)

diff -puN include/linux/res_counter.h~resource-counters-hierarchy-support include/linux/res_counter.h
--- linux-2.6.28-rc2/include/linux/res_counter.h~resource-counters-hierarchy-support	2008-11-02 00:14:58.000000000 +0530
+++ linux-2.6.28-rc2-balbir/include/linux/res_counter.h	2008-11-02 00:14:58.000000000 +0530
@@ -43,6 +43,10 @@ struct res_counter {
 	 * the routines below consider this to be IRQ-safe

Balbir Singh | 1 Nov 2008 19:48

[mm] [PATCH 3/4] Memory cgroup hierarchical reclaim


This patch introduces hierarchical reclaim. When an ancestor goes over its
limit, the charging routine points to the parent that is above its limit.
The reclaim process then starts from the last scanned child of the ancestor
and reclaims until the ancestor goes below its limit.
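
A rough userspace sketch of the idea follows. It makes several assumptions
that are not in the patch: struct group, reclaim_from() and
hierarchical_reclaim() are hypothetical names, only one level of children is
modelled, all of a child's usage is treated as reclaimable, and the cached
position is stored as the index at which the next pass resumes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical group type for illustration; not struct mem_cgroup. */
struct group {
	unsigned long usage;	/* for the ancestor: sum of its children's usage */
	unsigned long limit;
	struct group *children[MAX_CHILDREN];
	size_t nr_children;
	size_t last_scanned;	/* index of the child the next pass resumes at */
};

/* Free up to want pages from child and account the drop in the ancestor. */
static unsigned long reclaim_from(struct group *child, struct group *ancestor,
				  unsigned long want)
{
	unsigned long freed = child->usage < want ? child->usage : want;

	child->usage -= freed;
	ancestor->usage -= freed;
	return freed;
}

/*
 * Reclaim from the children of ancestor in round-robin order, at most
 * batch pages per visit, until the ancestor is back within its limit
 * or a full pass over the children frees nothing.
 */
static unsigned long hierarchical_reclaim(struct group *ancestor,
					  unsigned long batch)
{
	unsigned long total = 0;
	size_t idle = 0;	/* consecutive children that freed nothing */

	assert(ancestor != NULL);
	while (ancestor->usage > ancestor->limit && idle < ancestor->nr_children) {
		struct group *child = ancestor->children[ancestor->last_scanned];
		unsigned long over = ancestor->usage - ancestor->limit;
		unsigned long freed = reclaim_from(child, ancestor,
						   over < batch ? over : batch);

		ancestor->last_scanned =
			(ancestor->last_scanned + 1) % ancestor->nr_children;
		idle = freed ? 0 : idle + 1;
		total += freed;
	}
	return total;
}
```

Remembering where the previous pass stopped spreads reclaim pressure across
the children instead of always penalising the first one.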

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 mm/memcontrol.c |  153 +++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 129 insertions(+), 24 deletions(-)

diff -puN mm/memcontrol.c~memcg-hierarchical-reclaim mm/memcontrol.c
--- linux-2.6.28-rc2/mm/memcontrol.c~memcg-hierarchical-reclaim	2008-11-02 00:14:59.000000000 +0530
+++ linux-2.6.28-rc2-balbir/mm/memcontrol.c	2008-11-02 00:14:59.000000000 +0530
@@ -132,6 +132,11 @@ struct mem_cgroup {
 	 * statistics.
 	 */
 	struct mem_cgroup_stat stat;
+	/*
+	 * While reclaiming in a hierarchy, we cache the last child we
+	 * reclaimed from.
+	 */
+	struct mem_cgroup *last_scanned_child;
 };
 static struct mem_cgroup init_mem_cgroup;

@@ -467,6 +472,125 @@ unsigned long mem_cgroup_isolate_pages(u
 	return nr_taken;

Balbir Singh | 1 Nov 2008 19:49

[mm] [PATCH 4/4] Memory cgroup hierarchy feature selector


Don't enable multiple hierarchy support by default. This patch introduces
a features element that can be set to enable the nested depth hierarchy
feature. This feature can only be enabled when there is just one cgroup
(the root cgroup).
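
The rule in the last sentence can be sketched in userspace as follows. The
flag value mirrors MEM_CGROUP_FEAT_HIERARCHY from the diff below; everything
else (struct memcg_state, set_features(), the group count and the choice of
-EBUSY and -EINVAL as error codes) is an assumption made for illustration
and may differ from what the patch actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same bit as MEM_CGROUP_FEAT_HIERARCHY in the patch. */
#define FEAT_HIERARCHY	0x1UL

/* Hypothetical global state for illustration only. */
struct memcg_state {
	unsigned long features;	/* bitmask of enabled features */
	unsigned int nr_groups;	/* number of cgroups, including the root */
};

/*
 * Set the feature mask. Unknown bits are rejected, and the mask may
 * only change while the root cgroup is the only cgroup in existence.
 */
static int set_features(struct memcg_state *s, unsigned long val)
{
	assert(s != NULL);
	if (val & ~FEAT_HIERARCHY)
		return -EINVAL;
	if (s->nr_groups > 1)
		return -EBUSY;
	s->features = val;
	return 0;
}
```

Refusing the change once children exist avoids having some counters linked
to their parents and others not.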

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 mm/memcontrol.c |   38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff -puN mm/memcontrol.c~memcg-add-hierarchy-selector mm/memcontrol.c
--- linux-2.6.28-rc2/mm/memcontrol.c~memcg-add-hierarchy-selector	2008-11-02 00:15:00.000000000 +0530
+++ linux-2.6.28-rc2-balbir/mm/memcontrol.c	2008-11-02 00:15:00.000000000 +0530
@@ -40,6 +40,9 @@
 struct cgroup_subsys mem_cgroup_subsys __read_mostly;
 #define MEM_CGROUP_RECLAIM_RETRIES	5

+static unsigned long mem_cgroup_features;
+#define MEM_CGROUP_FEAT_HIERARCHY	0x1
+
 /*
  * Statistics for memory cgroup.
  */
@@ -1080,6 +1083,31 @@ out:
 	return ret;
 }


Hugh Dickins | 1 Nov 2008 19:59

Re: 2.6.28-rc2: Unable to handle kernel paging request at iov_iter_copy_from_user_atomic

On Sat, 1 Nov 2008, Alexander Beregalov wrote:
>  2.6.28-rc2-00452-gf891caf on sparc64
> 
> How to reproduce: run dbench on tmpfs
> 
> 
> Unable to handle kernel paging request at virtual address fffff80037c1c000
> tsk->{mm,active_mm}->context = 0000000000001ae7
> tsk->{mm,active_mm}->pgd = fffff8000ec8c000
>               \|/ ____ \|/
>               "@'/ .. \`@"
>               /_| \__/ |_\
>                  \__U_/
> dbench(5007): Oops [#1]
> TSTATE: 0000000011009604 TPC: 00000000005acbac TNPC: 00000000005acbb0
> Y: 00000000    Not tainted
> TPC: <__bzero+0x20/0xc0>
> g0: 0000000000000016 g1: 0000000000000000 g2: 0000000000000000 g3: 0000000000033ae7
> g4: fffff8000ec9c380 g5: 0000000000000020 g6: fffff8003b834000 g7: ffffffffffffe8b1
> o0: fffff80037c1c8b1 o1: 00000000000008b1 o2: 0000000000000000 o3: fffff80037c1c8b1
> o4: 0000000000000000 o5: 0000000000034398 sp: fffff8003b836e41 ret_pc: 00000000005ae73c
> RPC: <copy_from_user_fixup+0x4c/0x70>
> l0: 0000000000852800 l1: 0000000011009603 l2: 0000000000827ff4 l3: 0000000000000400
> l4: 0000000000000000 l5: 0000000000000001 l6: 0000000000000000 l7: 0000000000000008

Alexander Beregalov | 2 Nov 2008 05:02

Re: 2.6.28-rc2: Unable to handle kernel paging request at iov_iter_copy_from_user_atomic

2008/11/1 Hugh Dickins <hugh@veritas.com>:
> On Sat, 1 Nov 2008, Alexander Beregalov wrote:
>>  2.6.28-rc2-00452-gf891caf on sparc64
>>
>> How to reproduce: run dbench on tmpfs
>>
>>
>> Unable to handle kernel paging request at virtual address fffff80037c1c000
>> tsk->{mm,active_mm}->context = 0000000000001ae7
>> tsk->{mm,active_mm}->pgd = fffff8000ec8c000
>>               \|/ ____ \|/
>>               "@'/ .. \`@"
>>               /_| \__/ |_\
>>                  \__U_/
>> dbench(5007): Oops [#1]
>> TSTATE: 0000000011009604 TPC: 00000000005acbac TNPC: 00000000005acbb0
>> Y: 00000000    Not tainted
>> TPC: <__bzero+0x20/0xc0>
>> g0: 0000000000000016 g1: 0000000000000000 g2: 0000000000000000 g3: 0000000000033ae7
>> g4: fffff8000ec9c380 g5: 0000000000000020 g6: fffff8003b834000 g7: ffffffffffffe8b1
>> o0: fffff80037c1c8b1 o1: 00000000000008b1 o2: 0000000000000000 o3: fffff80037c1c8b1
>> o4: 0000000000000000 o5: 0000000000034398 sp: fffff8003b836e41 ret_pc: 00000000005ae73c
>> RPC: <copy_from_user_fixup+0x4c/0x70>
>> l0: 0000000000852800 l1: 0000000011009603 l2: 0000000000827ff4 l3: 0000000000000400
>> l4: 0000000000000000 l5: 0000000000000001 l6: 0000000000000000 l7:

David Miller | 2 Nov 2008 05:42

Re: 2.6.28-rc2: Unable to handle kernel paging request at iov_iter_copy_from_user_atomic

From: Hugh Dickins <hugh@veritas.com>
Date: Sat, 1 Nov 2008 18:59:24 +0000 (GMT)

> Alexander Beregalov reports oops in __bzero() called from
> copy_from_user_fixup() called from iov_iter_copy_from_user_atomic(),
> when running dbench on tmpfs on sparc64: its __copy_from_user_inatomic
> and __copy_to_user_inatomic should be avoiding, not calling, the fixups.
> 
> Signed-off-by: Hugh Dickins <hugh@veritas.com>

This looks great, applied, thanks Hugh.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Miller | 2 Nov 2008 05:42

Re: 2.6.28-rc2: Unable to handle kernel paging request at iov_iter_copy_from_user_atomic

From: "Alexander Beregalov" <a.beregalov@gmail.com>
Date: Sun, 2 Nov 2008 07:02:40 +0300

> Should we fix also sparc32?

Sparc32 uses a different scheme for all of this stuff.
I don't think your patch will even compile.

