[patch] mm: memcontrol: rewrite uncharge API fix

From: Johannes Weiner
Date: Fri Jun 20 2014 - 11:09:14 EST


It's not entirely clear whether do_swap_account or PCG_MEMSW is the
authoritative answer to whether a page is swap-accounted or not. This
currently leads to the following memsw counter underflow when swap
accounting is disabled:

[ 2.753355] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked+0x48/0x74()
[ 2.753355] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-00238-gddc5bfe #1
[ 2.753355] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 2.753355] 0000000000000000 ffff880012073c50 ffffffff81a23b9d ffff880012073c88
[ 2.753355] ffffffff810bc765 ffffffff8111fac8 0000000000001000 ffff88001200fa50
[ 2.753355] 0000000000000001 ffff88001200fa01 ffff880012073c98 ffffffff810bc84b
[ 2.753355] Call Trace:
[ 2.753355] [<ffffffff81a23b9d>] dump_stack+0x19/0x1b
[ 2.753355] [<ffffffff810bc765>] warn_slowpath_common+0x73/0x8c
[ 2.753355] [<ffffffff8111fac8>] ? res_counter_uncharge_locked+0x48/0x74
[ 2.753355] [<ffffffff810bc84b>] warn_slowpath_null+0x1a/0x1c
[ 2.753355] [<ffffffff8111fac8>] res_counter_uncharge_locked+0x48/0x74
[ 2.753355] [<ffffffff8111fd02>] res_counter_uncharge_until+0x4e/0xa9
[ 2.753355] [<ffffffff8111fd70>] res_counter_uncharge+0x13/0x15
[ 2.753355] [<ffffffff8119499c>] mem_cgroup_uncharge_end+0x73/0x8d
[ 2.753355] [<ffffffff8115735e>] release_pages+0x1f2/0x20d
[ 2.753355] [<ffffffff8116cc3a>] tlb_flush_mmu_free+0x28/0x43
[ 2.753355] [<ffffffff8116d5e5>] tlb_flush_mmu+0x20/0x23
[ 2.753355] [<ffffffff8116d5fc>] tlb_finish_mmu+0x14/0x39
[ 2.753355] [<ffffffff811730c1>] unmap_region+0xcd/0xdf
[ 2.753355] [<ffffffff81172b0e>] ? vma_gap_callbacks_propagate+0x18/0x33
[ 2.753355] [<ffffffff81174bf1>] do_munmap+0x252/0x2e0
[ 2.753355] [<ffffffff81174cc3>] vm_munmap+0x44/0x5c
[ 2.753355] [<ffffffff81174cfe>] SyS_munmap+0x23/0x29
[ 2.753355] [<ffffffff81a31567>] system_call_fastpath+0x16/0x1b
[ 2.753355] ---[ end trace cfeb07101f6fbdfb ]---

Don't set PCG_MEMSW when swap accounting is disabled, so that
uncharging only has to look at this per-page flag.
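
As a rough illustration of that rule, here is a small user-space model, not
kernel code. The struct names, the model_* helpers, the flag values and the
plain long counters are all hypothetical stand-ins; only do_swap_account and
the PCG_* names come from the patch. The model also assumes that the memsw
counter is only charged when swap accounting is enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names follow the patch; the bit values are arbitrary in this model. */
enum {
	PCG_USED  = 1 << 0,
	PCG_MEM   = 1 << 1,
	PCG_MEMSW = 1 << 2,
};

/* Hypothetical stand-ins for the per-page record and the two counters. */
struct model_page { unsigned long flags; };
struct model_counters { long memory; long memsw; };

static bool do_swap_account;	/* models the kernel switch of the same name */

/* Charge: memsw is only charged when swap accounting is on (assumption). */
static void model_charge(struct model_counters *c, struct model_page *pc)
{
	c->memory++;
	if (do_swap_account)
		c->memsw++;
	/* The fix: record PCG_MEMSW only if memsw was actually charged. */
	pc->flags = PCG_USED | PCG_MEM | (do_swap_account ? PCG_MEMSW : 0);
}

/* Uncharge: looks at the per-page flags only, never at the global switch. */
static void model_uncharge(struct model_counters *c, struct model_page *pc)
{
	if (pc->flags & PCG_MEM)
		c->memory--;
	if (pc->flags & PCG_MEMSW)
		c->memsw--;
	pc->flags = 0;
	/* Neither counter may drop below zero. */
	assert(c->memory >= 0 && c->memsw >= 0);
}
```

If model_charge() set PCG_MEMSW unconditionally, a charge/uncharge pair with
do_swap_account off would drive the modeled memsw counter to -1, which is the
kind of underflow the warning above reports.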

mem_cgroup_swapout() could also rely fully on this flag, but since it
can bail out before even looking up the page_cgroup, check
do_swap_account as a performance optimization and only sanity-check
PCG_MEMSW.
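
The ordering described for the swapout path can be sketched in user space as
follows. This is a simplified model under stated assumptions, not the kernel
function: model_swapout(), model_lookup() and the lookups counter are
hypothetical, and whatever the real code does after the sanity check is left
out.

```c
#include <assert.h>
#include <stdbool.h>

enum { PCG_MEMSW = 1 << 2 };	/* name from the patch, value arbitrary */

struct model_page { unsigned long flags; };

static bool do_swap_account;	/* models the kernel switch of the same name */
static unsigned int lookups;	/* counts simulated page_cgroup lookups */

/* Hypothetical stand-in for the comparatively expensive lookup. */
static struct model_page *model_lookup(struct model_page *page)
{
	lookups++;
	return page;
}

/* Returns true if the page went through the swap-accounting path. */
static bool model_swapout(struct model_page *page)
{
	struct model_page *pc;

	/* Cheap global check first: bail out before any lookup. */
	if (!do_swap_account)
		return false;

	pc = model_lookup(page);
	/* With accounting on, a charged page must carry PCG_MEMSW. */
	assert(pc->flags & PCG_MEMSW);
	return true;
}
```

With do_swap_account off, model_swapout() returns without incrementing
lookups. That is the optimization: the flag alone would give the same answer,
but only after paying for the lookup.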

Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 94d7c40b9f26..d6a20935f9c4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2740,7 +2740,7 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
 	 *   have the page locked
 	 */
 	pc->mem_cgroup = memcg;
-	pc->flags = PCG_USED | PCG_MEM | PCG_MEMSW;
+	pc->flags = PCG_USED | PCG_MEM | (do_swap_account ? PCG_MEMSW : 0);

 	if (lrucare) {
 		if (was_on_lru) {
@@ -6598,7 +6598,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage,
 		return;

 	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEM), oldpage);
-	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEMSW), oldpage);
+	VM_BUG_ON_PAGE(do_swap_account && !(pc->flags & PCG_MEMSW), oldpage);
 	pc->flags &= ~(PCG_MEM | PCG_MEMSW);

 	if (PageTransHuge(oldpage)) {
--
2.0.0