Re: mmotm 2015-01-22-15-04: qemu failure due to 'mm: memcontrol: remove unnecessary soft limit tree node test'

From: Guenter Roeck
Date: Fri Jan 23 2015 - 10:47:18 EST


On 01/23/2015 06:18 AM, Johannes Weiner wrote:
Hi Guenter,

CC'ing Christoph for slub-stuff:

On Thu, Jan 22, 2015 at 09:08:02PM -0800, Guenter Roeck wrote:
On Thu, Jan 22, 2015 at 03:05:17PM -0800, akpm@xxxxxxxxxxxxxxxxxxxx wrote:
The mm-of-the-moment snapshot 2015-01-22-15-04 has been uploaded to

http://www.ozlabs.org/~akpm/mmotm/

qemu test for ppc64 fails with

Unable to handle kernel paging request for data at address 0x0000af50
Faulting instruction address: 0xc00000000089d5d4
Oops: Kernel access of bad area, sig: 11 [#1]

with the following call stack:

Call Trace:
[c00000003d32f920] [c00000000089d588] .__slab_alloc.isra.44+0x7c/0x6f4 (unreliable)
[c00000003d32fa90] [c00000000020cf8c] .kmem_cache_alloc_node_trace+0x12c/0x3b0
[c00000003d32fb60] [c000000000bceeb4] .mem_cgroup_init+0x128/0x1b0
[c00000003d32fbf0] [c00000000000a2b4] .do_one_initcall+0xd4/0x260
[c00000003d32fce0] [c000000000ba26a8] .kernel_init_freeable+0x244/0x32c
[c00000003d32fdb0] [c00000000000ac24] .kernel_init+0x24/0x140
[c00000003d32fe30] [c000000000009564] .ret_from_kernel_thread+0x58/0x74

bisect log:

[...]

# first bad commit: [a40d0d2cf21e2714e9a6c842085148c938bf36ab] mm: memcontrol: remove unnecessary soft limit tree node test

The change in question is this:

mm: memcontrol: remove unnecessary soft limit tree node test

kzalloc_node() automatically falls back to nodes with suitable memory.

Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxx>
Reviewed-by: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fb9788af4a3e..10db4a654d68 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4539,13 +4539,10 @@ static void __init mem_cgroup_soft_limit_tree_init(void)
 {
 	struct mem_cgroup_tree_per_node *rtpn;
 	struct mem_cgroup_tree_per_zone *rtpz;
-	int tmp, node, zone;
+	int node, zone;

 	for_each_node(node) {
-		tmp = node;
-		if (!node_state(node, N_NORMAL_MEMORY))
-			tmp = -1;
-		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL, tmp);
+		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL, node);
 		BUG_ON(!rtpn);

 		soft_limit_tree.rb_tree_per_node[node] = rtpn;

--

Is the assumption of this patch wrong? Does the specified node have
to be online for the fallback to work?


I added some debugging. First, the problem is only seen with SMP disabled.
Second, there is only one online node.

Without your patch:

Node 0 online 1 high 1 memory 1 cpu 0 normal 1 tmp 0 rtpn c00000003d240600
Node 1 online 0 high 0 memory 0 cpu 0 normal 0 tmp -1 rtpn c00000003d240640
Node 2 online 0 high 0 memory 0 cpu 0 normal 0 tmp -1 rtpn c00000003d240680

[ and so on up to node 255 ]

With your patch:

Node 0 online 1 high 1 memory 1 cpu 0 normal 1 rtpn c00000003d240600
Unable to handle kernel paging request for data at address 0x0000af50
Faulting instruction address: 0xc000000000895a3c
Oops: Kernel access of bad area, sig: 11 [#1]

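The log message is printed after the call to kzalloc_node(), so the
allocation for node 0 succeeds and the crash happens in the
kzalloc_node() call for node 1.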

So it doesn't look like the fallback is working, at least not with ppc64
in non-SMP mode.
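
To illustrate the difference, here is a minimal user-space sketch (not
kernel code) of the node hint that ends up being passed to kzalloc_node()
before and after the patch. has_normal_memory() is a hypothetical stand-in
for node_state(node, N_NORMAL_MEMORY), hard-coded to match the debug output
above where only node 0 has memory; the two alloc_node_*() helpers are
likewise made up for this example.

```c
#include <stdbool.h>

#define NUMA_NO_NODE (-1)   /* "no node preference", the -1 in the old code */

/* Hypothetical stand-in for node_state(node, N_NORMAL_MEMORY).
 * Assumption: only node 0 has normal memory, as in the log above. */
bool has_normal_memory(int node)
{
	return node == 0;
}

/* Node hint before the patch: memoryless nodes are mapped to -1. */
int alloc_node_before_patch(int node)
{
	return has_normal_memory(node) ? node : NUMA_NO_NODE;
}

/* Node hint after the patch: the node id is passed through as-is,
 * even if that node has no memory. */
int alloc_node_after_patch(int node)
{
	return node;
}
```

For node 1 the old code asks the allocator for "any node" (-1), while the
new code asks specifically for node 1, which is where the oops shows up.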

Guenter
