Re: [PATCH 4/8] membarrier: Make the post-switch-mm barrier explicit

From: Peter Zijlstra
Date: Wed Jun 16 2021 - 03:35:49 EST


On Wed, Jun 16, 2021 at 02:19:49PM +1000, Nicholas Piggin wrote:
> Excerpts from Andy Lutomirski's message of June 16, 2021 1:21 pm:
> > membarrier() needs a barrier after any CPU changes mm. There is currently
> > a comment explaining why this barrier probably exists in all cases. This
> > is very fragile -- any change to the relevant parts of the scheduler
> > might get rid of these barriers, and it's not really clear to me that
> > the barrier actually exists in all necessary cases.
>
> The comments and barriers in the mmdrop() hunks? I don't see what is
> fragile or maybe-buggy about this. The barrier definitely exists.
>
> And any change can change anything, that doesn't make it fragile. My
> lazy tlb refcounting change avoids the mmdrop in some cases, but it
> replaces it with smp_mb for example.

I'm with Nick again on this. You're adding extra barriers for no
discernible reason; that's not generally encouraged, seeing how extra
barriers are extra slow.

Both mmdrop() itself and the callsite have comments saying that
membarrier relies on the implied barrier. What's fragile about that?