Re: [PATCH v8 15/18] mm, fs, dax: handle layout changes to pinned dax mappings

From: Paul E. McKenney
Date: Fri Apr 13 2018 - 18:47:27 EST

Next message: Linus Torvalds: "Re: [GIT PULL] auxdisplay for v4.17-rc1"
Previous message: Djalal Harouni: "Re: [PATCH] [RFC][WIP] namespace.c: Allow some unprivileged proc mounts when not fully visible"
In reply to: Dan Williams: "Re: [PATCH v8 15/18] mm, fs, dax: handle layout changes to pinned dax mappings"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, Apr 13, 2018 at 03:03:51PM -0700, Dan Williams wrote:
> On Mon, Apr 9, 2018 at 9:51 AM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> > On Mon, Apr 9, 2018 at 9:49 AM, Jan Kara <jack@xxxxxxx> wrote:
> >> On Sat 07-04-18 12:38:24, Dan Williams wrote:
> > [..]
> >>> I wonder if this can be trivially solved by using srcu. I.e. we don't
> >>> need to wait for a global quiescent state, just a
> >>> get_user_pages_fast() quiescent state. ...or is that an abuse of the
> >>> srcu api?
> >>
> >> Well, I'd rather use the percpu rwsemaphore (linux/percpu-rwsem.h) than
> >> SRCU. It is a more-or-less standard locking mechanism rather than relying
> >> on implementation properties of SRCU which is a data structure protection
> >> method. And the overhead of percpu rwsemaphore for your use case should be
> >> about the same as that of SRCU.
> >
> > I was just about to ask that. Yes, it seems they would share similar
> > properties and it would be better to use the explicit implementation
> > rather than a side effect of srcu.
>
> ...unfortunately:
>
> BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:34
> [..]
> Call Trace:
> dump_stack+0x85/0xcb
> ___might_sleep+0x15b/0x240
> dax_layout_lock+0x18/0x80
> get_user_pages_fast+0xf8/0x140
>
> ...and thinking about it more, srcu is a better fit. We don't need the
> 100% exclusion provided by an rwsem; we only need the guarantee that
> all cpus that might have been running get_user_pages_fast() have
> finished it at least once.
>
> In my tests synchronize_srcu is a bit slower than unpatched for the
> trivial 100 truncate test, but certainly not the 200x latency you were
> seeing with synchronize_rcu.
>
> Elapsed time:
> 0.006149178 unpatched
> 0.009426360 srcu

You might want to try synchronize_srcu_expedited(). Unlike the expedited
grace-period primitives of plain RCU, it does not send IPIs, so it should
be less controversial. And it might well more than make up the
performance difference you are seeing above.

Thanx, Paul
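
A minimal userspace sketch of the pattern discussed in this thread, not
code from the patch series. It shows the reader side (the lockless fast
path, which must not sleep) wrapped in an SRCU read-side section, and the
updater side (the truncate path) waiting for pre-existing readers with
synchronize_srcu_expedited(). Assumptions: the srcu_struct functions
below are single-threaded stand-ins for the kernel API of the same names
so that the call pattern compiles outside the kernel; dax_gup_srcu,
dax_page_model, gup_fast_model() and truncate_model() are hypothetical
names invented for illustration, and the layout_changing flag is only a
modelling device.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace stand-ins for the kernel SRCU API. Single-threaded model
 * only: real SRCU blocks until pre-existing readers finish, whereas
 * this stub just asserts that no reader is active.
 */
struct srcu_struct {
	int active_readers;	/* readers inside a read-side section */
	int grace_periods;	/* modelled grace periods completed */
};

int srcu_read_lock(struct srcu_struct *sp)
{
	sp->active_readers++;
	return 0;	/* the real API returns an index for unlock */
}

void srcu_read_unlock(struct srcu_struct *sp, int idx)
{
	(void)idx;
	sp->active_readers--;
}

void synchronize_srcu_expedited(struct srcu_struct *sp)
{
	/* Real SRCU would wait here for pre-existing readers. */
	assert(sp->active_readers == 0);
	sp->grace_periods++;
}

/* Hypothetical SRCU domain used only by the fast-path readers. */
struct srcu_struct dax_gup_srcu;

/* Hypothetical stand-in for a dax page and its pin state. */
struct dax_page_model {
	bool layout_changing;	/* set by the updater before it waits */
	int pin_count;		/* pins taken by the fast path */
};

/* Reader: models the fast path. Returns 1 if a pin was taken. */
int gup_fast_model(struct dax_page_model *page)
{
	int pinned = 0;
	int idx = srcu_read_lock(&dax_gup_srcu);

	if (!page->layout_changing) {
		page->pin_count++;
		pinned = 1;
	}
	srcu_read_unlock(&dax_gup_srcu, idx);
	return pinned;
}

/* Updater: models truncate. Returns true if no pins remain. */
bool truncate_model(struct dax_page_model *page)
{
	page->layout_changing = true;		/* new readers back off */
	synchronize_srcu_expedited(&dax_gup_srcu); /* old readers done */
	return page->pin_count == 0;		/* pins are now stable */
}
```

In this model, once truncate_model() returns, every reader that started
before the grace period has finished, so the pin count it inspects can no
longer be raised by a fast-path reader that missed the flag. In the
kernel, the read-side calls do not sleep, which is the property the
percpu rwsem attempt above lacked.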
