Re: [PATCH mm] fix swapoff breakage; however...

From: Balbir Singh
Date: Mon Sep 17 2007 - 15:13:32 EST


Hugh Dickins wrote:
> rc4-mm1's memory-controller-memory-accounting-v7.patch broke swapoff:
> it extended unuse_pte_range's boolean "found" return code to allow an
> error return too; but ended up returning found (1) as an error.
> Replace that by success (0) before it gets to the upper level.
>
> Signed-off-by: Hugh Dickins <hugh@xxxxxxxxxxx>
> ---
> More fundamentally, it looks like any container brought over its limit in
> unuse_pte will abort swapoff: that doesn't seem "contained" to me.
> Maybe unuse_pte should just let containers go over their limits without
> error? Or swap should be counted along with RSS? Needs reconsideration.
>
> mm/swapfile.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- 2.6.23-rc4-mm1/mm/swapfile.c 2007-09-07 13:09:42.000000000 +0100
> +++ linux/mm/swapfile.c 2007-09-17 15:14:47.000000000 +0100
> @@ -642,7 +642,7 @@ static int unuse_mm(struct mm_struct *mm
> break;
> }
> up_read(&mm->mmap_sem);
> - return ret;
> + return (ret < 0)? ret: 0;

Thanks for catching this. There are three possible solutions:

1. Account each RSS page together with a probable swap cache page,
doubling the RSS accounting to ensure that swapoff will not fail.
2. Account for the RSS page just once; do not account swap cache
pages.
3. Follow your suggestion and let containers go over their limits
without error (a rough sketch of that option follows below).
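
Option 3 would essentially make the charge taken in unuse_pte() a
best-effort one. Below is a plain userspace model of that policy
choice; the structure and helper names are invented for illustration
and are not the -mm memory controller interface:

#include <stdbool.h>
#include <stdio.h>

/* Invented stand-in for a per-container accounting structure. */
struct container {
	long usage;	/* pages currently charged */
	long limit;	/* configured limit, in pages */
};

/* Strict charge: fail once the container is at its limit; this is
 * what makes the charge in unuse_pte() abort swapoff today. */
static bool charge(struct container *c, long pages)
{
	if (c->usage + pages > c->limit)
		return false;
	c->usage += pages;
	return true;
}

/* Option 3: charge unconditionally, so swapoff never aborts,
 * even though the container may end up over its limit. */
static void force_charge(struct container *c, long pages)
{
	c->usage += pages;
}

int main(void)
{
	struct container c = { .usage = 100, .limit = 100 };

	printf("strict charge at the limit: %s\n",
	       charge(&c, 1) ? "succeeds" : "fails (swapoff would abort)");
	force_charge(&c, 1);
	printf("forced charge: usage %ld, limit %ld\n", c.usage, c.limit);
	return 0;
}

Either way, the decision boils down to whether a failed charge in
unuse_pte() should be fatal to swapoff or should merely over-commit
the container.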

With the current approach, a container over its limit will not
be able to call swapoff successfully; is that bad?

We plan to implement per-container/per-cpuset swap in the future.
Given that, isn't this expected functionality? A container over its
limit cannot really swapoff a swap device. If we allow pages to be
unused anyway, we could end up with a container that exceeds its
limit by a significant amount simply by calling swapoff.
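
To put a rough number on that concern: if the swap device holds N
pages belonging to a container that is already at its limit, letting
unuse_pte() charge without error leaves the container N pages over
its limit once swapoff completes. A toy model (nothing below is
kernel code; the numbers are arbitrary):

#include <stdio.h>

#define LIMIT		100	/* container limit, in pages */
#define SWAPPED_PAGES	5000	/* this container's pages on the swap device */

int main(void)
{
	long usage = LIMIT;	/* container already at its limit */
	long i;

	/* Strict accounting: the very first charge fails and swapoff aborts. */
	if (usage + 1 > LIMIT)
		printf("strict: swapoff aborts, usage stays at %ld\n", usage);

	/* Unrestricted accounting: every swapped page gets charged regardless. */
	for (i = 0; i < SWAPPED_PAGES; i++)
		usage++;
	printf("unrestricted: usage %ld vs limit %d (%ld pages over)\n",
	       usage, LIMIT, usage - LIMIT);
	return 0;
}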


> }
>
> /*


--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL