Re: Subject: [RFC MM] mmap_sem scaling: Use mutex and percpucounter instead

From: Andi Kleen
Date: Tue Nov 10 2009 - 04:20:06 EST


On Tue, Nov 10, 2009 at 03:21:11PM +0900, KOSAKI Motohiro wrote:
> > On Fri, 6 Nov 2009, Andi Kleen wrote:
> >
> > > On Fri, Nov 06, 2009 at 12:08:54PM -0500, Christoph Lameter wrote:
> > > > On Fri, 6 Nov 2009, Andi Kleen wrote:
> > > >
> > > > > Yes but all the major calls still take mmap_sem, which is not ranged.
> > > >
> > > > But exactly that issue is addressed by this patch!
> > >
> > > Major calls = mmap, brk, etc.
> >
> > Those are rare. Far more frequent are faults, get_user_pages and
> > similar operations.
> >
> > brk depends on process wide settings and has to be
> > serialized using a process wide lock.
> >
> > mmap and other address space local modifications may be able to avoid
> > taking the mmap_sem write lock by taking the read lock and then locking the
> > ptls in the page struct relevant to the address space being modified.
> >
> > This is also enabled by this patchset.
>
> Andi, why do you ignore fork? fork() holds the mmap_sem write-side lock and
> it is one of the critical paths.

I have not seen profiles where fork was critical, but that's not to say
that it can't be. fork is so intrusive, though, that making its locking
fine grained is probably very hard.

> Plus, the most critical mmap_sem issue is not the locking cost itself. In a stress
> workload, the process grabbing mmap_sem frequently sleeps, and the fair rw-semaphore
> logic frequently prevents reader side locking.
> At least, this improvement doesn't help Google-like workloads.

Not helping is not too bad; the problem I had was just that it makes
writers even slower.

-Andi
--
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.