Re: [PATCH] rcu_sync: simplify the state machine, introduce __rcu_sync_enter()

From: Oleg Nesterov
Date: Mon Jul 25 2016 - 13:01:26 EST


Paul, sorry for delay.

On 07/21, Paul E. McKenney wrote:
>
> On Thu, Jul 21, 2016 at 07:34:36PM +0200, Oleg Nesterov wrote:
> > On 07/20, Paul E. McKenney wrote:
> > >
> > > On Wed, Jul 20, 2016 at 07:16:03PM +0200, Oleg Nesterov wrote:
> > >
> > > > Now, suppose we add the additional enter/exit's:
> > > >
> > > > freeze_super(sb)
> > > > {
> > > > // this doesn't block
> > > > __rcu_sync_enter(SEM3);
> > > > __rcu_sync_enter(SEM2);
> > > > __rcu_sync_enter(SEM1);
> > > >
> > > > down_write(&sb->s_umount);
> > > > if (NEED_TO_FREEZE) {
> > > > percpu_down_write(SEM1);
> > >
> > > The above waits for the grace period initiated by __rcu_sync_enter(),
> > > correct? Presumably "yes", because it will invoke rcu_sync_enter(), which
> > > will see the state as GP_ENTER, and will thus wait.
> >
> > But if down_write() blocks and/or NEED_TO_FREEZE takes some time it
> > could already see the GP_PASSED state, or at least it can sleep less.
> >
> > > But your point is that if !NEED_TO_FREEZE, we will get here without
> > > waiting for a grace period.
> > >
> > > But why aren't the __rcu_sync_enter() and rcu_sync_exit() calls inside
> > > the "if" statement?
> >
> > Yes, if we do __rcu_sync_enter() inside "if", then rcu_sync_exit() can't
> > hit GP_ENTER.
> >
> > But why we should disallow this use-case? It does not complicate the code
> > at all.
>
> I do agree that it doesn't complicate the current implementation.
> But it relies on a global lock, so I am not at all confident that this
> implementation is the final word.

Hmm. which global lock? Or did you mean freeze_super(), not rcu_sync?

> And speaking of global locks, failing to discourage the pattern above
> means that the code is unnecessarily acquiring three global locks,
> which doesn't seem like a good thing to me.

Well, I do not agree, but this wasn't written by me. Just in case, all these
locks above are not really global, they are per-sb, but this is minor.

And the patches which changed sb->s_writers to use percpu_rw_semaphore/rcu_sync
didn't change this logic.

Except the old implementation was buggy, and the readers were slower than now.

> I agree that there are use cases for beginning-of-time __rcu_sync_enter()
> or whatever we end up naming it.

OK, at least iiuc you agree that cgroup_init() can use __rcu_sync_enter().
As for other potential use-cases, we will disccuss this later. I will have
to CC you anyway ;)

So I'll send v2 with renames after I test it. Thanks again.

Oleg.