Re: Alternative to signals/sys_membarrier() in liburcu
From: Mathieu Desnoyers
Date: Thu Mar 12 2015 - 18:30:47 EST
----- Original Message -----
> From: "Linus Torvalds" <torvalds@xxxxxxxxxxxxxxxxxxxx>
> To: "Mathieu Desnoyers" <mathieu.desnoyers@xxxxxxxxxxxx>
> Cc: "Michael Sullivan" <sully@xxxxxxxxxx>, lttng-dev@xxxxxxxxxxxxxxx, "LKML" <linux-kernel@xxxxxxxxxxxxxxx>, "Paul E.
> McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>, "Peter Zijlstra" <peterz@xxxxxxxxxxxxx>, "Ingo Molnar" <mingo@xxxxxxxxxx>,
> "Thomas Gleixner" <tglx@xxxxxxxxxxxxx>, "Steven Rostedt" <rostedt@xxxxxxxxxxx>
> Sent: Thursday, March 12, 2015 5:47:05 PM
> Subject: Re: Alternative to signals/sys_membarrier() in liburcu
> On Thu, Mar 12, 2015 at 1:53 PM, Mathieu Desnoyers
> <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> > So the question as it stands appears to be: would you be comfortable
> > having users abuse mprotect(), relying on its side-effect of issuing
> > a smp_mb() on each targeted CPU for the TLB shootdown, as
> > an effective implementation of a process-wide memory barrier?
> Be *very* careful.
> Just yesterday, in another thread (discussing the auto-numa TLB
> performance regression), we were discussing skipping the TLB
> invalidates entirely if the mprotect relaxes the protections.
> Because if you *used* to be read-only, and then mprotect() something
> so that it is read-write, there really is no need to send a TLB
> invalidate, at least on x86. You can just change the page tables, and
> *if* any entries are stale in the TLB they'll take a microfault on
> access and then just reload the TLB.
> So mprotect() to a more permissive mode is not necessarily serializing.
The idea here is to always mprotect() to a more restrictive mode,
which should trigger the TLB shootdown.
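
As a rough illustration of that idea, here is a minimal sketch, assuming
a single anonymous page reserved for this purpose. The names
barrier_page, mprotect_barrier_init() and mprotect_barrier() are
hypothetical and are not liburcu API. Whether the restricting
mprotect() actually acts as a barrier on every CPU running the process
depends on kernel behaviour that is not guaranteed, and the page is not
mlock()ed here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper state: one page used only to provoke shootdowns. */
static void *barrier_page;
static size_t barrier_page_len;

/* Map one private anonymous page, initially read-write. */
static int mprotect_barrier_init(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	if (sz <= 0)
		return -1;
	barrier_page_len = (size_t)sz;
	barrier_page = mmap(NULL, barrier_page_len, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (barrier_page == MAP_FAILED) {
		barrier_page = NULL;
		return -1;
	}
	return 0;
}

/*
 * Relax the page to read-write, touch it so a writable mapping is
 * present, then restrict it to read-only. Only the last transition
 * (more restrictive) is the one hoped to trigger the TLB shootdown;
 * this is an assumption about the kernel, not a documented guarantee.
 */
static int mprotect_barrier(void)
{
	assert(barrier_page != NULL);

	if (mprotect(barrier_page, barrier_page_len, PROT_READ | PROT_WRITE))
		return -1;
	*(volatile char *)barrier_page = 1;	/* fault the page in */
	if (mprotect(barrier_page, barrier_page_len, PROT_READ))
		return -1;
	return 0;
}
```

The relaxing call at the top is only there so that each invocation ends
with a read-write to read-only transition; it is not expected to
serialize anything by itself.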
> Also, you need to make sure that your page is actually in memory,
> because otherwise the kernel may end up seeing "oh, it's not even
> present", and never flush the TLB at all.
> So now you need to mlock that page. Which can be problematic for non-root.
I'm aware the default amount of locked memory is usually quite low
(64kB here). So we'd need to handle cases where we run out of locked
memory. We could fall back to a slower userspace RCU scheme if this
happens.
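
A minimal sketch of that fallback decision, assuming the barrier page
has already been mapped. The enum and choose_scheme() are hypothetical
names for illustration, not liburcu API; the only real calls used are
mlock() and the errno values it documents.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical selector between the two update-side schemes. */
enum barrier_scheme {
	SCHEME_MPROTECT,	/* page is locked, try the mprotect() trick */
	SCHEME_FALLBACK,	/* use the slower userspace RCU scheme */
};

/*
 * Try to pin the barrier page in memory. An unprivileged process that
 * exceeds RLIMIT_MEMLOCK gets ENOMEM (or EPERM when the limit is 0),
 * and EAGAIN is possible if the range could not be locked; all of
 * these select the fallback.
 */
static enum barrier_scheme choose_scheme(void *page, size_t len)
{
	assert(page != NULL && len > 0);

	if (mlock(page, len) == 0)
		return SCHEME_MPROTECT;
	if (errno == ENOMEM || errno == EPERM || errno == EAGAIN)
		return SCHEME_FALLBACK;
	/* Any other error: be conservative and fall back as well. */
	return SCHEME_FALLBACK;
}
```

The check would be done once at initialization, so the fast path never
has to deal with a failed mlock().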
> In other words, I'd be a bit leery about it. There may be other
> gotchas about it.
Looking again at this old proposed patch (https://lkml.org/lkml/2010/4/18/15),
which adds a few memory barriers around updates to mm_cpumask
for sys_membarrier, makes me wonder whether mprotect() might skip
some CPUs in the mask that would actually need to be covered,
in very narrow race scenarios.