Re: [-mm][PATCH 4/4] Add memrlimit controller accounting and control (v4)
From: Balbir Singh
Date: Thu May 15 2008 - 04:27:08 EST
* Paul Menage <menage@xxxxxxxxxx> [2008-05-15 00:39:45]:
> On Thu, May 15, 2008 at 12:03 AM, Balbir Singh
> <balbir@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > I want to focus on this conclusion/assertion, since it takes care of
> > most of the locking related discussion above, unless I missed
> > something.
> >
> > My concern with using mmap_sem, is that
> >
> > 1. It's highly contended (every page fault, vma change, etc)
>
> But the only *new* cases of taking the mmap_sem that this would
> introduce would be:
>
> - on a failed vm limit charge
Why a failed charge? Aren't we talking of moving all charge/uncharge
under mmap_sem?
> - when a task exit/exec causes an mm ownership change
Yes, in the mm_owner_changed callbacks
> - when a task moves between two cgroups in the memrlimit hierarchy.
>
Yes, this would nest cgroup_mutex and mmap_sem. Not sure if that would
be a bad side-effect.
> All of these should be rare events, so I don't think the additional
> contention is a worry.
We already make several of the charge calls under mmap_sem, but not
all of them. So the additional contention might not be all that bad.
>
> > 2. It's going to make the locking hierarchy deeper and more complex
>
> Yes, potentially. But if the upside of that is that we eliminate a
> lock/unlock on a shared lock on every mmap/munmap call, it might well
> be worth it.
>
> > 3. It's not appropriate to call all the accounting callbacks with
> > the mmap_sem() held, since the undo operations _can get_ complicated
> > at the caller.
> >
>
> Can you give an example?
Some paths of the uncharge are not under mmap_sem. Undoing the
operation there seemed complex.
>
> > I would prefer introducing a new lock, so that other subsystems are
> > not affected.
> >
>
> For getting the first cut of the memrlimit controller working this may
> well make sense. But it would be nice to avoid it longer-term.
OK, so here's what I am going to try and do:

1. Refactor the code to try and use mmap_sem and see what I come up
   with. Basically, use mmap_sem for all charge/uncharge operations, as
   well as mmap_sem in read mode in the move_task() and
   mm_owner_changed() callbacks. That should take care of the race
   conditions discussed, unless I missed something.
2. Try and instrument insert_vm_struct() for charge/uncharge.
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL