[patch v2] mm, memcg: periodically schedule when emptying page list

From: David Rientjes
Date: Mon Jun 02 2014 - 20:51:33 EST


From: Hugh Dickins <hughd@xxxxxxxxxx>

mem_cgroup_force_empty_list() can iterate over a large number of pages on an
lru, and mem_cgroup_move_parent() doesn't return an errno unless certain
criteria are met, none of which indicate that the iteration may be taking too
long.

We have encountered the following stack trace many times indicating
"need_resched set for > 51000020 ns (51 ticks) without schedule", for example:

scheduler_tick()
<timer irq>
mem_cgroup_move_account+0x4d/0x1d5
mem_cgroup_move_parent+0x8d/0x109
mem_cgroup_reparent_charges+0x149/0x2ba
mem_cgroup_css_offline+0xeb/0x11b
cgroup_offline_fn+0x68/0x16b
process_one_work+0x129/0x350

If this iteration is taking too long, we still need to do cond_resched() even
when an individual page is not busy.

[rientjes@xxxxxxxxxx: changelog]
Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
---
v2: always reschedule if needed, since "page" itself may not have hit a pc
mismatch or an isolation failure.
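
For illustration only, here is a minimal userspace sketch of how the two
placements of the yield point differ. It is not kernel code: cond_resched_stub(),
move_failed(), drain_old() and drain_new() are hypothetical names, the stub only
counts how often the loop would offer to reschedule, and the sketch walks a fixed
array rather than retrying busy pages as the real loop does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for cond_resched(): counts yield points only. */
static unsigned long yield_points;

static void cond_resched_stub(void)
{
	yield_points++;
}

/*
 * Hypothetical stand-in for mem_cgroup_move_parent(): returns true when the
 * move failed (the entry is "busy"), false when it succeeded.
 */
static bool move_failed(const bool *busy_flags, size_t i)
{
	return busy_flags[i];
}

/* Old placement: a yield point is reached only when a move fails. */
static unsigned long drain_old(const bool *busy_flags, size_t n)
{
	assert(busy_flags != NULL || n == 0);
	yield_points = 0;
	for (size_t i = 0; i < n; i++) {
		if (move_failed(busy_flags, i))
			cond_resched_stub();
	}
	return yield_points;
}

/* New placement: a yield point is reached on every iteration. */
static unsigned long drain_new(const bool *busy_flags, size_t n)
{
	assert(busy_flags != NULL || n == 0);
	yield_points = 0;
	for (size_t i = 0; i < n; i++) {
		(void)move_failed(busy_flags, i);
		cond_resched_stub();
	}
	return yield_points;
}
```

With a list in which no entry is busy, drain_old() never reaches a yield point
no matter how long the list is, while drain_new() reaches one per entry.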

mm/memcontrol.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4784,9 +4784,9 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
 		if (mem_cgroup_move_parent(page, pc, memcg)) {
 			/* found lock contention or "pc" is obsolete. */
 			busy = page;
-			cond_resched();
 		} else
 			busy = NULL;
+		cond_resched();
 	} while (!list_empty(list));
 }
